Question

[DESeq2] time-course Experimental design


inzirio ▴ 10

@inzirio-13571

Last seen 5 months ago

Italy

I'd like to set up an experimental design for the DESeq2 package using the likelihood ratio test (LRT).

I've already done it before using two different conditions Treated (T) vs Untreated (U), each at different time points.

I was wondering how to set up the design when the U replicates exist only at time point 0 and the T replicates are sampled at several later time points.

Is it possible to do this with DESeq2? I can't find examples for this special case, although I have found several time-course datasets designed this way.

Thanks,
Inzirio

rnaseq deseq2 time-course • 2.0k views

updated 6.8 years ago by Gavin Kelly ▴ 680 • written 6.8 years ago by inzirio ▴ 10

score 1 · Answer 1 · 2017-07-24

It sounds like you only need a single factor to model the experiment. If U corresponds to 0 hours and T has, e.g., 2 hours, 4 hours and 16 hours, then you'd simply label the replicates with one factor with levels 0h, 2h, 4h, 16h, etc. You won't be able to infer anything about how the untreated samples evolve over time, so you can't remove such effects from the analysis, and all your conclusions will have to be worded with this in mind. You say you've previously used the other, more complete design, which to me seems a safer route to meaningful biological hypotheses, so I guess cost is what's driving the question?

The only option for an LRT with one factor is to compare `~time` against `~1`, which looks for genes that reject the null of being constant across all 'timepoints' (i.e. the same across all the treated samples, and the same as the untreated).

DESeq2 is more than capable of answering comparisons between pairs of timepoints, including 0h (untreated) vs 2h (treated), for example. For this you'd need the Wald test rather than the LRT. Just remember to code the timepoint term as a factor: if you code it as numeric, the test will look for linear changes of (transformed) expression across time, which might not be what you want.
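As a rough sketch of the two approaches described above, the following uses DESeq2's simulated example data, so the counts carry no real time effect and the results are only illustrative. The sample names, the choice of three replicates per timepoint, and the object names (`sample_table`, `res_lrt`, etc.) are assumptions made for the example, not part of the original question.

```r
# Hypothetical sample sheet: untreated samples exist only at 0h,
# treated samples at 2h, 4h and 16h (3 replicates each, assumed).
sample_table <- data.frame(
  time = factor(rep(c("0h", "2h", "4h", "16h"), each = 3),
                levels = c("0h", "2h", "4h", "16h")),
  row.names = paste0("sample", 1:12)
)

# Only run the DESeq2 part if the package is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)

  # Simulated counts standing in for a real count matrix.
  dds <- makeExampleDESeqDataSet(n = 500, m = 12)
  colnames(dds) <- rownames(sample_table)
  dds$time <- sample_table$time
  design(dds) <- ~ time

  # LRT: full model ~time against reduced model ~1.
  # Tests whether expression is constant across all timepoints.
  dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
  res_lrt <- results(dds_lrt)

  # Wald test (the default): pairwise comparisons between timepoints.
  dds_wald <- DESeq(dds)
  res_2h_vs_0h <- results(dds_wald, contrast = c("time", "2h", "0h"))
  res_4h_vs_2h <- results(dds_wald, contrast = c("time", "4h", "2h"))
}
```

With real data, `DESeqDataSetFromMatrix()` would take the place of the simulated dataset, with the count matrix and a sample sheet like `sample_table` as inputs.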

score 0 · Answer 2 · 2017-07-25

Still a bit confused about your use of LRT in the 'cheaper' design - it does not look at 'all... timepoints versus the 0h' - it's looking to see if the null of all timepoints being the same holds. So an LRT could feasibly come out as significant if the end timepoint was different from all previous timepoints, in which case one would probably interpret that finding not in terms of the untreated (0h) timepoint: there's nothing special about 0h in the LRT. Similarly, there's not necessarily anything special about 0h in the Wald test, as you can test 2h vs 4h just as easily as untreated (0h) vs 2h. As you say, these are different meanings of the biological question, but neither LRT nor Wald will treat the untreated differently from other timepoints; nor will they 'know' that 4h lies in between 2h and 8h, for instance.
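To illustrate that the LRT has no notion of which timepoint comes first, here is a small base-R sketch on one simulated gene. It uses a Poisson GLM as a simple stand-in for DESeq2's negative binomial model (an assumption for the sake of a dependency-free example), and the counts are made up so that only the last timepoint differs.

```r
set.seed(1)

# One simulated gene: flat at 0h, 2h and 4h, higher at 8h (3 replicates each).
time <- factor(rep(c("0h", "2h", "4h", "8h"), each = 3),
               levels = c("0h", "2h", "4h", "8h"))
counts <- rpois(12, lambda = rep(c(20, 20, 20, 60), each = 3))

# LRT statistic: deviance of the reduced model (~1) minus that of the full model (~f).
lrt_stat <- function(f) {
  full <- glm(counts ~ f, family = poisson)
  reduced <- glm(counts ~ 1, family = poisson)
  deviance(reduced) - deviance(full)
}

# Same samples, but the factor levels are listed in a scrambled order,
# so a different timepoint becomes the reference level.
shuffled <- factor(time, levels = c("8h", "2h", "0h", "4h"))

stat_ordered <- lrt_stat(time)
stat_shuffled <- lrt_stat(shuffled)

# The two statistics agree up to numerical tolerance.
print(c(ordered = stat_ordered, shuffled = stat_shuffled))
```

The statistic depends only on which samples share a level, not on the order of the levels or on which one is the reference.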

My final point is probably something you're doing anyway, but I tend to put it in as a warning to readers of answers: if, say, you encode `timepoint <- c(0,0,0,3,3,3,6,6,6)` and then put `~timepoint` in your design, you'll only get one coefficient out, the significance of which is indicative of a linear trend in expression over time (so in this case there is some 'knowledge' of ordering, e.g. that 3 lies between 0 and 6). This alternative hypothesis is different from the one that most people expect, which is the pairwise difference between timepoints (achieved by `timepoint <- factor(timepoint)`).
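The difference between the two encodings can be seen without DESeq2 by inspecting the design matrix that base R builds for each. This is only a sketch; the variable names `mm_numeric`, `timepoint_f` and `mm_factor` are introduced here for illustration.

```r
timepoint <- c(0, 0, 0, 3, 3, 3, 6, 6, 6)

# Numeric encoding: an intercept plus a single slope coefficient (linear trend).
mm_numeric <- model.matrix(~ timepoint)
colnames(mm_numeric)
# "(Intercept)" "timepoint"

# Factor encoding: an intercept plus one coefficient per non-reference level.
timepoint_f <- factor(timepoint)
mm_factor <- model.matrix(~ timepoint_f)
colnames(mm_factor)
# "(Intercept)" "timepoint_f3" "timepoint_f6"
```

With the factor encoding, each extra column corresponds to the difference between one timepoint and the reference level (here 0).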