DESeq2: likelihood ratio test interpretation, multifactor
1
0
Entering edit mode
agdif • 0
@agdif-16034
Last seen 3.3 years ago

Hi Michael,

I would like to make sure that I am interpretating the p-values correctly when using the likelihood ratio test with a multifactor design.

We are interested in identifying significant genes that are differentially expressed between 3 disease states while controlling for sex. I understand that the likelihood ratio test is testing whether the reduced model (dispersions estimated on sex alone) or the full model (dispersions estimated on sex and condition) better fits the dataset.

So, if an adjusted p-value > 0.05 it indicates that those genes are better explained by the reduced model and are either affected by sex only or not at all. However if an adjusted p value < 0.05 , does it indicate the those genes are better explained / unique to disease state while controlling for sex OR better explained by disease state and sex combined?

(coldata = data.frame(row.names=colnames(countdata), disease, sex))
dds = DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~sex+disease)
dds

# Run the DESeq pipeline
ddsLRT = DESeq(dds, test="LRT", full=~sex+disease, reduced=~sex)
res=results(ddsLRT)


likelihood ratio test deseq2 LRT • 3.9k views
2
Entering edit mode
@mikelove
Last seen 1 day ago
United States

"I understand that the likelihood ratio test is testing whether the reduced model (dispersions estimated on sex alone) or the full model (dispersions estimated on sex and condition) better fits the dataset."

Close, but not exactly. It tests whether the increase in the log likelihood from the additional coefficients would be expected if those coefficients were equal to zero. It doesn't mean the reduced model is a good model or a good fit.

The way to interpret the p-value is: if the adjusted p-value is small, then for the set of genes with those small adjusted p-values, the additional coefficient in full and not in reduced increased the log likelihood more than would be expected if their true value was zero.

See here for some more background:

https://en.wikipedia.org/wiki/Likelihood-ratio_test

0
Entering edit mode

Hi Michael Love, if I understood correctly, with the LRT test you are increasing the type-I error rate, and I am wondering if there is a multiple testing adjustment recommended for the LRT in DESeq2? Or there is another option that I can use?

I would also like to know the pval in different contrasts in a multifactor design with 1 factor of 4 groups, and you suggested in another post to perform a Wald test pairwise (DESeq2 ANODEV with 3 sample groups) using the function results with option test="Wald"), but that means it will overwrites the test="LRT" option in DESeq function?

Many thanks

1
Entering edit mode

"with the LRT test you are increasing the type-I error rate"

No I wouldn't say that.

Yes, this is part of the software. Please first read over the workflow:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#multiple-testing

Yes, if you want to do pair-wise comparisons, you use test="Wald" in DESeq2. Yes it will overwrite the LRT p-values, as those do not represent pair-wise comparisons. See the LRT section of the vignette, or consult with a statistician familiar with linear models.

0
Entering edit mode

I understand that for both the test="Wald" and test="LRT", you already get an adjusted pval for multiple testing as stated in the vignettes. But, I would like to know if there is a way to get the pval, for the ANOVA-like LRT of the pairs you are contrasting in the LRT test (not only the LFC), and not the pval of all against all like ANOVA? Or the only solution is to perform a Wald test pairwise?

0
Entering edit mode

Yes, DESeq2 doesn’t provide pairwise LRT for all levels of the factor.