DESeq2: likelihood ratio test interpretation, multifactor
1
1
Entering edit mode
agdif ▴ 10
@agdif-16034
Last seen 5.3 years ago

Hi Michael,

I would like to make sure that I am interpretating the p-values correctly when using the likelihood ratio test with a multifactor design.

We are interested in identifying significant genes that are differentially expressed between 3 disease states while controlling for sex. I understand that the likelihood ratio test is testing whether the reduced model (dispersions estimated on sex alone) or the full model (dispersions estimated on sex and condition) better fits the dataset.

So, if an adjusted p-value > 0.05 it indicates that those genes are better explained by the reduced model and are either affected by sex only or not at all. However if an adjusted p value < 0.05 , does it indicate the those genes are better explained / unique to disease state while controlling for sex OR better explained by disease state and sex combined?

(coldata = data.frame(row.names=colnames(countdata), disease, sex))
dds = DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~sex+disease)
dds

# Run the DESeq pipeline
ddsLRT = DESeq(dds, test="LRT", full=~sex+disease, reduced=~sex)
res=results(ddsLRT)
resSig=subset(res, padj<0.05)

 

likelihood ratio test deseq2 LRT • 7.6k views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 12 hours ago
United States

"I understand that the likelihood ratio test is testing whether the reduced model (dispersions estimated on sex alone) or the full model (dispersions estimated on sex and condition) better fits the dataset."

Close, but not exactly. It tests whether the increase in the log likelihood from the additional coefficients would be expected if those coefficients were equal to zero. It doesn't mean the reduced model is a good model or a good fit.

The way to interpret the p-value is: if the adjusted p-value is small, then for the set of genes with those small adjusted p-values, the additional coefficient in full and not in reduced increased the log likelihood more than would be expected if their true value was zero.

See here for some more background:

https://en.wikipedia.org/wiki/Likelihood-ratio_test

ADD COMMENT
0
Entering edit mode

Hi Michael Love, if I understood correctly, with the LRT test you are increasing the type-I error rate, and I am wondering if there is a multiple testing adjustment recommended for the LRT in DESeq2? Or there is another option that I can use?

I would also like to know the pval in different contrasts in a multifactor design with 1 factor of 4 groups, and you suggested in another post to perform a Wald test pairwise (DESeq2 ANODEV with 3 sample groups) using the function results with option test="Wald"), but that means it will overwrites the test="LRT" option in DESeq function?

Many thanks

ADD REPLY
1
Entering edit mode

"with the LRT test you are increasing the type-I error rate"

No I wouldn't say that.

"is a multiple testing adjustment"

Yes, this is part of the software. Please first read over the workflow:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#multiple-testing

Yes, if you want to do pair-wise comparisons, you use test="Wald" in DESeq2. Yes it will overwrite the LRT p-values, as those do not represent pair-wise comparisons. See the LRT section of the vignette, or consult with a statistician familiar with linear models.

ADD REPLY
0
Entering edit mode

I understand that for both the test="Wald" and test="LRT", you already get an adjusted pval for multiple testing as stated in the vignettes. But, I would like to know if there is a way to get the pval, for the ANOVA-like LRT of the pairs you are contrasting in the LRT test (not only the LFC), and not the pval of all against all like ANOVA? Or the only solution is to perform a Wald test pairwise?

ADD REPLY
0
Entering edit mode

Yes, DESeq2 doesn’t provide pairwise LRT for all levels of the factor.

ADD REPLY
0
Entering edit mode

Hi Michael,

What do you suggest please if I need to account for batch effect in the counts? should I include this in the reduced model formula ?

Thanks

ADD REPLY
0
Entering edit mode

Yes include in the reduced the variables you want to control for. See the quick start example in the vignette.

ADD REPLY

Login before adding your answer.

Traffic: 552 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6