Question: Creating design and contrast matrices for DEG with Limma for 2 factors?
8 months ago by
Nithisha10
Nithisha10 wrote:

Hi everyone,

I have data from a microarray experiment with 70 samples, covering 3 different treatments (plus a control) across 5 time points (1h/2h/8h/24h/48h). This is what my data look like, i.e. the number of samples I have for each treatment at each time point:

Time (hrs)   Control   A   B   A+B
1            3         2   3   4
2            3         3   3   3
8            3         3   4   3
24           5         5   5   5
48           4         3   3   3

I want to compare up- and downregulation of genes between the 4 treatment groups at the different time points.

In that case, should I first create 2 columns in my metadata called Timepoint and Treatment Type?

And how would I then create the design and contrast matrix? I appreciate any advice on this, thanks!

modified 8 months ago by Aaron Lun20k • written 8 months ago by Nithisha10
8 months ago by
Aaron Lun20k
Cambridge, United Kingdom
Aaron Lun20k wrote:

You need two vectors: one specifying the time point for each sample, the other specifying the treatment condition. Whether these are stored as columns in a data.frame or otherwise is irrelevant. Once you have these vectors, it is simple to construct the design matrix using a one-way layout, i.e. treating each treatment/time combination as its own group, following the advice in Section 9.5 of the limma user's guide. A contrast matrix can be similarly formulated based on the comparisons of interest.
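A minimal sketch of what that could look like, assuming the sample counts from the table in the question. The sample order, the group labels (with "A+B" written as "AB" so the names are valid in R) and the three contrasts are illustrative assumptions, not the only sensible choices:

```r
library(limma)

# Samples per time point (rows) and treatment (columns), copied from the question.
# "A+B" is written "AB" so that the group labels are syntactically valid R names.
counts <- matrix(
  c(3, 2, 3, 4,
    3, 3, 3, 3,
    3, 3, 4, 3,
    5, 5, 5, 5,
    4, 3, 3, 3),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("1h", "2h", "8h", "24h", "48h"),
                  c("Control", "A", "B", "AB"))
)

# Build the two vectors, one entry per sample (hypothetical sample order).
cells <- expand.grid(timepoint = rownames(counts),
                     treatment = colnames(counts),
                     stringsAsFactors = FALSE)
cells <- cells[rep(seq_len(nrow(cells)), times = as.vector(counts)), ]
timepoint <- cells$timepoint
treatment <- cells$treatment

# One-way layout: every treatment/time combination becomes its own group.
group <- factor(paste(treatment, timepoint, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Example contrasts (illustrative): two treatment effects at 24h, and the
# change in the A-vs-control effect between 1h and 24h.
contrast.matrix <- makeContrasts(
  A.vs.Control.24h  = A.24h - Control.24h,
  AB.vs.Control.24h = AB.24h - Control.24h,
  A.effect.24h.vs.1h = (A.24h - Control.24h) - (A.1h - Control.1h),
  levels = design
)

# With real data (not defined here), the fit would then follow the usual pattern:
# fit  <- lmFit(eset, design)
# fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))
```

In practice the two vectors would come from your own sample annotation, in the same order as the columns of the expression matrix, rather than being generated from the counts.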

Hello Aaron,

Thank you so much for your reply. Section 9.5 of the user guide was very useful indeed. However, I am a little confused about what this part of the guide means:

"A list of top genes for RNA2 versus RNA1 can be obtained from

> topTable(fit2, coef=1, adjust="BH") "

Here, since sort.by is not specified, how is topTable ordering the results (by logFC, p-value, etc.)?

Also, my results seem to include the columns from my featureData alongside the actual topTable output columns. If you happen to know why, please do let me know.

Thank you for all your help!

Some basic R knowledge would be helpful here. If you look at ?topTable, you will see that the default value for the sort.by argument is "B", i.e., the function will return genes sorted by the B-statistic (the log-odds of differential expression). The same documentation will reveal that the columns of fit$genes are added to the output table by default.
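To see both defaults in action, here is a small sketch on simulated placeholder data. The expression values, gene IDs and the Symbol column are all made up for illustration; with an ExpressionSet as input, fit$genes would be filled from the featureData instead of being set by hand:

```r
library(limma)

set.seed(1)
# Placeholder data: 100 genes x 6 samples of random noise (not real measurements).
y <- matrix(rnorm(100 * 6), nrow = 100,
            dimnames = list(paste0("gene", 1:100), NULL))
group <- factor(rep(c("RNA1", "RNA2"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

fit <- lmFit(y, design)
# Hypothetical annotation, standing in for featureData columns.
fit$genes <- data.frame(ID = rownames(y), Symbol = paste0("SYM", 1:100))

cont <- makeContrasts(RNA2vsRNA1 = RNA2 - RNA1, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))

# sort.by is not given, so the default "B" applies: rows ordered by decreasing B.
tt <- topTable(fit2, coef = 1, adjust.method = "BH")

# Other orderings can be requested explicitly.
tt.p   <- topTable(fit2, coef = 1, sort.by = "P")      # by p-value
tt.lfc <- topTable(fit2, coef = 1, sort.by = "logFC")  # by absolute log fold change

# The annotation columns come from fit2$genes; drop them afterwards if unwanted.
tt.stats <- tt[, setdiff(colnames(tt), colnames(fit2$genes))]
```

Dropping the columns after the fact is just one option; it keeps the fitted object intact for other uses.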