Experiment design contrasts any time edgeR
2
1
Entering edit mode
Pietro ▴ 30
@pietro-13029
Last seen 9 weeks ago
Italy

Dear all,

I am working with a RNASeq data from an Arabidopsis experiment. The plants have been subjected to nitric oxide treatment for 3 hours and for 8 hours. I would like to find differentialy modulated genes not only in the classic between 2 comparison but also at any time in one test, following the section "3.3.3 Treatment effects over all times" of edgeR user guide.

The design of the experiment is 3 biological replicates for each of the 3 samples:

 Sample Condition Time untreated_1 ctrl 0 untreated_2 ctrl 0 untreated_3 ctrl 0 NO3_1 NO 3 NO3_2 NO 3 NO3_3 NO 3 NO8_1 NO 8 NO8_2 NO 8 NO8_3 NO 8

I managed to compare samples at 3 hours with control (3h vs ctrl) and samples at 8 the hour treatment against control (8h vs ctrl); now I'd like to make a whole comparison to find genes responding to the treatment any time in a single test.

I am using edgeR.

This is the code I am using:

targets <- is the table above
design <- model.matrix(~Time, data=targets)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef=2)
topTags(lrt)

Coefficient:  Time
logFC   logCPM        LR        PValue           FDR
AT2G33270 1.3171504 4.287261 1136.0571 4.810643e-249 1.616472e-244
AT3G03270 0.9024100 8.335613 1116.2615 9.650825e-245 1.621435e-240
AT5G58840 1.1536560 4.831535 1108.5933 4.479132e-243 5.016927e-239
AT1G01460 1.8124852 3.853430 1079.3572 1.012809e-236 8.508099e-233
AT1G13480 1.3325911 5.177147 1062.6741 4.281367e-233 2.877250e-229
AT2G35730 1.1750776 5.391916  952.5724 3.660705e-209 2.050117e-205
AT4G02650 1.6227408 4.214342  913.3378 1.236992e-200 5.937916e-197
AT4G26470 0.9443015 4.350554  911.2471 3.522495e-200 1.479536e-196
AT2G02340 1.7263559 3.112108  890.8545 9.548953e-196 3.565155e-192
AT1G70440 1.6371184 5.024619  882.3160 6.857418e-194 2.304230e-190

Can you please tell me if this is correct (and in your opinion if it makes sense)?

Thanks a lot

Pietro

DGE rnaseq RNA edger • 722 views
3
Entering edit mode
Aaron Lun ★ 27k
@alun
Last seen 18 hours ago
The city by the bay

Your approach is okay, but assumes a linear response with respect to time (if Time is a real-valued covariate, which I presume it is by default). You have plenty of replicates, so I would rather use an ANODEV:

design <- model.matrix(~factor(Time), data=targets)
# ... other edgeR stuff...
lrt <- glmLRT(fit, coef=2:3)

The second coefficient of design will represent the log-fold change at 3 hours vs zero hours, while the third coefficient represents the log-fold change at 8 hours vs zero hours. When you drop both of them simultaneously in glmLRT, your null hypothesis is that both log-fold changes are equal to zero. A gene with any response at either time point will be more likely to reject the null hypothesis.

0
Entering edit mode

Thanks a lot, you were very clear!

Can you briefly explain to humans like me what's the difference between

design <- model.matrix(~Time, data=targets)
# ​and
design <- model.matrix(~factor(Time), data=targets)

?

What would be the equivalent commands to do this in DESeq2?

Thanks again

1
Entering edit mode

If Time is numeric, ~Time will assume that log-expression changes linearly with time, i.e., you're fitting a line of best fit to log-expression against time. With ~factor(Time), you're treating each time point as a separate group, so the log-expression isn't required to follow a linear relationship with respect to time.

DESeq2 also uses the formula notation so I presume it will work the same, but you'd have to ask the maintainers.