deseq2 LRT vs Wald test results for DE expression among 4 conditions
Entering edit mode
Last seen 6.8 years ago
Imperial College London


I have RNAseq data from 4 developmental conditions (let's say: 1, 2, 3, 4), 3 replicates each. I would like to find genes that are a) differentially expressed in stepwise comparisons but also b) genes that are enriched -and specific- in each stage (1,2,3,4).

For that, I've used deseq2 and after I input all the data together (to get normalisation across all samples),

a) I performed Wald test in pairwise comparisons 1vs2, 2vs3, 3vs4 and used the logFC and padj for stepwise differential expression and

b) I performed LRT test (ddsLRT <- DESeq(dds, test="LRT", reduced= ~ 1) because I don't have more variables e.g. treated vs untreated) and then I also performed Wald test for all the 6 different pairwise comparisons possible (1vs2, 1vs3, 1vs4, 2vs3, 2vs4, 3vs4). From the LRT results, 70%(!) of the genes have a padj<0.05 (I still have not understood how the logFC can be interpreted and used in this case - any input would be highly appreciated). What I find striking is that many of those genes do not show any statistically significant difference in any of the pairwise comparisons.

If I use the individual wald tests and the intersections of them to find e.g. genes enriched genes in condition 1 by doing: logFC>1, padj<0.05 in 1vs2 and 1vs3 and 1vs4, I also get genes that do not show a significant padj in the LRT test. 

I am struggling to find a way to actually apply the right statistical  to find the actual differentially expressed genes that are "enriched"/show a peak in one condition. Any ideas please??




rnaseq deseq2 lrt wald differential gene expression • 4.1k views
Entering edit mode
Gavin Kelly ▴ 690
Last seen 4.3 years ago
United Kingdom / London / Francis Crick…

I think this is due to the nature of the difference between LRT and Wald tests.  LRT is testing against all conditions being the same, so if the conditions are all slightly different, this could be enough for the LRT to reject the null, but none of the separate pairwise comparisons could reach significance.  Similarly, you could get three conditions all similar, but reasonably different from the 4th.  The 'average' effect would get diluted in the LRT, failing to reject the null there, but all nVs4 comparisons would be Wald-significant.

So the right statistical test is the one where the null and alternative hypotheses reflect the biology you're trying to capture.  If you're trying to capture all genes that are different between adjacent conditions, then the union of the three Wald lists is the better.  If you're trying to find genes that aren't constant across all conditions, the LRT is better.

The enrichment at condition X is trickier, as the power of RNASeq experiments is generally low, so you're tripling-up on your false-negatives when looking at the conjunction of three tests.  An alternative might be to pool the samples into two groups - condition 1 vs, condition not_1, which is a composite test of enrichment in condition 1 plus relative constancy amongst the other conditions.

Or you could just do an LRT, and then classify genes according to the observed pattern of effect sizes (ever-increasing fold-changes = stepwise; fold-changes involving lfc(condition2 vs conditionX) >0 and other lfc's <threshold = enriched at condition2, ...)


Entering edit mode

"you could just do an LRT, and then classify genes according to the observed pattern of effect sizes (ever-increasing fold-changes = stepwise;" how is the fold change calculated here since here is no reference if i understand. I have similar issue wald test is straight forward , but when I apply LRT reduced model Im not sure how this " If you're trying to find genes that aren't constant across all conditions" is occurring in the manual it was mentioned it find the difference between the full and reduced model to compare. So if there is a fold change how the fold change comes?


Login before adding your answer.

Traffic: 292 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6