What's the criteria used by DESeq2 to calculate log2 fold changes in time-course experiments?
1
0
Entering edit mode
@santosmigueldm-14213
Last seen 3.9 years ago

Hi all,

I was following the protocol for analysis of Time course experiments using DESeq2 (http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments) and came up with a doubt regarding the criteria used to calculate the fold-changes and padj.

In this example the fission data package was used, which contains RNA-seq data for a time-course of fission yeast from two different strains.

The following code performs a likelihood ratio test. Genes with small p value from this test are those which at one or more time points after time 0 showed a strain-specific effect.

library("fission")
data("fission")
ddsTC <- DESeqDataSet(fission, ~ strain + minute + strain:minute)
ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)
resTC <- results(ddsTC)

My question is: When you look to the calculated fold-changes, are they the average of the fold-changes for all the time-points, or if a particular gene is affected only in one time-point, DESeq calculates the fold-change for that particular time-point?

Also, regarding the padj, how does DESeq rank the genes? If a gene X has a 1.3 fold-change across all the time-points and another gene Y a 2 fold-change difference in only one or two time-points, which one will be more significant?

Finally, is there a way of knowing in how many and which time-points the differences are significant?

Thank You!

deseq2 timecourse • 1.2k views
2
Entering edit mode
Gavin Kelly ▴ 660
@gavin-kelly-6944
Last seen 2.8 years ago
United Kingdom / London / Francis Crick…

Assuming that 'minute' is a factor, then an LRT is looking across all levels of minute, and the null hypothesis is that all the interactions are zero (so that the difference between strains is the same as it is at minute 'zero', or whatever the baseline is).  It reports a fold-change, but that is (according to the help page):

Which log2 fold change is printed in the results table can be controlled using the 'name' argument, or by default this will be the estimated coefficient for the last element of'resultsNames(object)'

To examine pairwise comparisons, you'll need to do a Wald test to get individual effect sizes at different timepoints, and you can then read off which are larger/smaller/different signs (and if you want, which are significant at each specific timepoint).

If you've cast 'minute' as a numeric, then it's a linear regression, and the fold-change represents the expected change in log-expression for each minute that passes.

0
Entering edit mode

Thanks Gavin!

That was truly helpful!

0
Entering edit mode

I didn't post here since it would lead to cross posting i have a query regarding the deseq2 design and the output here this post It would be really helpful if you could give your insight and I would be really glad.