Question

Creating design and contrast matrices for DEG with Limma for 2 factors?

Nithisha ▴ 10

Hi everyone,

I have data from a microarray experiment with 70 samples, covering 3 different treatments plus a control across 5 time points (1h/2h/8h/24h/48h). This is what my data look like: the number of samples I have for each treatment at each time point.

Time(hrs)/Treatment	Control	A	B	A+B
1	3	2	3	4
2	3	3	3	3
8	3	3	4	3
24	5	5	5	5
48	4	3	3	3

I want to be able to compare upregulation and downregulation of genes between the 4 treatment groups at different time points.

In such a case, should I first create 2 columns in my metadata called Timepoint and Treatment Type?

And how would I then create the design and contrast matrix? I appreciate any advice on this, thanks!

Limma

updated 6.5 years ago by Aaron Lun ★ 28k • written 6.5 years ago by Nithisha ▴ 10

score 2 · Accepted Answer · 2017-10-28

Aaron Lun ★ 28k

You need two vectors: one specifying the time point for each sample, the other specifying the treatment condition. Whether these are stored as columns in a data.frame or otherwise is irrelevant. Once you have these vectors, you can combine them into a single grouping factor and construct the design matrix as a one-way layout, following the advice in Section 9.5 of the limma user's guide. A contrast matrix can then be formulated from the comparisons of interest.
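As a rough sketch of what this one-way layout might look like for the design in the question: the sample annotation below is reconstructed from the posted counts rather than taken from real data, and the names `n`, `time`, `treatment`, `group`, and the contrast names are illustrative assumptions. `eset` in the commented-out lines is a hypothetical object holding the normalized expression values, with samples in the same order as the two vectors.

```r
library(limma)

# Number of samples per time point (rows) and treatment (columns), as posted.
# "AB" stands for the combined A+B treatment so that level names are valid R names.
n <- matrix(c(3, 2, 3, 4,
              3, 3, 3, 3,
              3, 3, 4, 3,
              5, 5, 5, 5,
              4, 3, 3, 3),
            nrow = 5, byrow = TRUE,
            dimnames = list(c("1h", "2h", "8h", "24h", "48h"),
                            c("Control", "A", "B", "AB")))

# The two vectors: one entry per sample (70 in total).
time      <- rep(rownames(n)[row(n)], times = as.vector(n))
treatment <- rep(colnames(n)[col(n)], times = as.vector(n))

# Combine both factors into a single grouping factor (one-way layout).
group <- factor(paste(treatment, time, sep = "."))

# One column per treatment/time combination, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Example comparisons; replace with whichever ones are of interest.
contrast.matrix <- makeContrasts(
  A.vs.Control.24h     = A.24h - Control.24h,
  AB.vs.A.24h          = AB.24h - A.24h,
  A.response.24h.vs.1h = (A.24h - Control.24h) - (A.1h - Control.1h),
  levels = design
)

# With real data (hypothetical `eset`), the fit would then proceed as:
# fit  <- lmFit(eset, design)
# fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))
```

Each column of `contrast.matrix` is one comparison, so after `contrasts.fit` and `eBayes` a given comparison can be selected through the `coef` argument of `topTable`.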

Hello Aaron,

Thank you so much for your reply. Section 9.5 of the user guide was very useful indeed. However, I am a little confused about what this means in the guide.

"A list of top genes for RNA2 versus RNA1 can be obtained from

> topTable(fit2, coef=1, adjust="BH") "

Here, since sort.by is not specified, how is topTable ordering the results (by logFC, p-value, etc.)?

Also, somehow my results for this seem to include the columns for my featureData together with the actual toptable output columns. If you would happen to know why, please do let me know.

Thank you for all your help!

6.5 years ago • Nithisha ▴ 10

Some basic R knowledge would be helpful here. If you look at ?topTable, you will see that the default value for the sort.by argument is "B", i.e., the function will return genes sorted by the B-statistic (also known as the log-odds). The same documentation will reveal that the columns of fit$genes are added to the output table by default.
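To make both defaults concrete, here is a small sketch on simulated data (the matrix `y`, the two-group design, and the annotation columns `ID` and `Symbol` are made up purely for illustration and have nothing to do with the experiment above):

```r
library(limma)

set.seed(1)

# Simulated expression matrix: 100 probes x 6 arrays (illustrative only).
y <- matrix(rnorm(100 * 6), nrow = 100,
            dimnames = list(paste0("probe", 1:100), NULL))

# Simple two-group design: 3 arrays per group.
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))

fit <- lmFit(y, design)

# Attach made-up probe annotation; with an ExpressionSet input,
# lmFit() fills fit$genes from the featureData instead.
fit$genes <- data.frame(ID = rownames(y), Symbol = paste0("gene", 1:100))

fit <- eBayes(fit)

# Default: sort.by = "B", so rows are ordered by decreasing B-statistic.
tab.B <- topTable(fit, coef = "Group", adjust.method = "BH")

# Explicitly order by p-value instead.
tab.P <- topTable(fit, coef = "Group", sort.by = "P")

# Default genelist = fit$genes adds the annotation columns;
# genelist = NULL leaves them out.
tab.plain <- topTable(fit, coef = "Group", genelist = NULL)

colnames(tab.B)
colnames(tab.plain)
```

Comparing the two sets of column names shows which columns come from `fit$genes` and which are the statistics computed by `topTable` itself.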