Bias of lowly expressed genes in DESeq2
Jason • 0
@f0b26999

Hi Michael,

I have treated vs. untreated (wt) samples, and I know that a subset of genes is very lowly expressed in wt but up-regulated in the treated samples. When I run the DESeq2 analysis, most of these genes are at the top when I rank by adjusted p-value or by fold change, which makes sense. But in this case it looks like the top of the ranking (smallest adjusted p-value or largest fold change) is biased toward genes that are lowly expressed in wt. Their baseMeans are intermediate, since baseMean is computed over all samples (treated + wt). So I think the shrinkage method will not help either if it is relative to baseMean. How does DESeq2 deal with this bias?

Thank you in advance for your answer.

DESeq2 • 922 views
ATpoint ★ 4.0k
@atpoint-13662

"Their baseMeans are intermediate since they consider all the samples (treated+wt)"

If a gene is decently expressed in conditionA and almost shut down in conditionB, wouldn't you expect exactly that, an intermediate baseMean? And if this scenario is true, wouldn't you also expect these to be the top-ranked changes, both in terms of effect size and significance?

I guess it would be good to add some details, such as the counts, and results output for this gene, and where it is on the MA-plot.
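
For reference, DESeq2 reports baseMean as the mean of normalized counts over all samples, so a gene that is nearly off in one group lands in between the two group means. A minimal Python sketch with made-up counts (the numbers, and the assumption that all size factors equal 1, are purely illustrative and not data from this thread):

```python
# Hypothetical normalized counts for one gene; size factors assumed to be 1.
wt = [4, 6, 5]
treated = [480, 510, 510]

def mean(values):
    return sum(values) / len(values)

# baseMean pools every sample, regardless of group.
base_mean = mean(wt + treated)
wt_mean = mean(wt)
treated_mean = mean(treated)

print(base_mean, wt_mean, treated_mean)
```

With these toy numbers the pooled mean sits far above the wt mean and well below the treated mean, which is the "intermediate baseMean" pattern described in the question.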


Hi ATpoint,

Thanks for your reply. Yes, I expect an intermediate baseMean. That is why I think shrinkage probably does not help much (correct me if I am wrong). I also expect these genes to be the most significant ones, but since the fold change X/Y is anti-correlated with Y (a spurious correlation), I am worried that their high fold change is only due to the very small Y. If I want to rank all the significant genes, how much should I trust the ones at the top of the list over the other significant genes?
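
The concern about a small denominator can be made concrete with plain arithmetic. The sketch below (illustrative numbers only, not output from DESeq2) shows how much a raw log2 ratio moves when the wt count changes by a single read, once for a low and once for a high wt count:

```python
import math

def log2_ratio(treated, wt):
    """Unshrunken log2 fold change of two raw means (no pseudocount)."""
    return math.log2(treated / wt)

def shift_from_one_extra_read(treated, wt):
    """Change in the log2 ratio when wt gains one count."""
    return log2_ratio(treated, wt) - log2_ratio(treated, wt + 1)

low = shift_from_one_extra_read(treated=200, wt=1)     # wt: 1 -> 2
high = shift_from_one_extra_read(treated=200, wt=100)  # wt: 100 -> 101

print(round(low, 3), round(high, 3))
```

One extra read halves the ratio when wt is 1 but barely moves it when wt is 100. That instability is the reason a raw ratio with a tiny denominator is hard to rank by.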


You really should show some data; I doubt that this can be answered based on textual descriptions alone.


Just a note:

The LFC shrinkage does not depend on the baseMean. It just uses the counts and the adaptive prior for LFC (looking across all genes). Unlike for dispersion estimation, our prior is experiment-wide for LFC, not specific to the gene's baseMean.
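
To illustrate the idea of one experiment-wide prior, here is a toy normal-normal shrinkage in Python. This is not DESeq2's actual estimator (its lfcShrink function uses different, more elaborate estimators); the function name, the prior width, and the example values are all assumptions chosen for illustration. The point is only that the amount of shrinkage is driven by the gene's own uncertainty and by a prior shared across all genes, with no baseMean term:

```python
def shrink_lfc(lfc, se, prior_sd):
    """Posterior mean of an LFC under a zero-centered normal prior.

    Toy normal-normal model, NOT the DESeq2 implementation.
    """
    weight = prior_sd**2 / (prior_sd**2 + se**2)
    return weight * lfc

PRIOR_SD = 2.0  # hypothetical value, shared by every gene in the experiment

# Same observed LFC, different precision (e.g. arising from different counts).
precise = shrink_lfc(lfc=6.0, se=0.5, prior_sd=PRIOR_SD)
noisy = shrink_lfc(lfc=6.0, se=2.0, prior_sd=PRIOR_SD)

print(round(precise, 2), round(noisy, 2))
```

In this sketch a precisely estimated LFC is left almost untouched, while an equally large but noisy LFC is pulled strongly toward zero.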


Hi Michael,

You said "uses the counts". Does "counts" here mean the counts from all samples (treated + wt) or just from wt? How does DESeq2 deal with genes that have very low counts only in wt but not in treated? Thanks.


All counts, not specifically from one group.
