Question

all pairwise comparisons, some genes can have zero counts everywhere in a given comparison

anton.kratz ▴ 60

@antonkratz-8836

Japan, Tokyo, The Systems Biology Insti…

I have 12 different biological states in triplicate and do all pairwise differential expression tests between these samples with DESeq2.

For some pairs of states, specific clusters (genes) have exactly zero counts in all three replicates of both states in that comparison.

For example, in raw counts it could look like this for a particular cluster:

stateA_rep1	stateA_rep2	stateA_rep3	stateB_rep1	stateB_rep2	stateB_rep3	stateC_rep1	stateC_rep2	stateC_rep3
0	0	0	46	34	67	0	0	0

The clusters for which this is true in any given comparison have differing baseMean values and can even have (very low) fold changes.

When the results are rendered as an MA-plot, these clusters show up as horizontal "streaks" in the region of very low baseMean. In the example, when comparing stateA vs stateC, I might get a log2FoldChange of something like 0.3.

I think I understand what is going on here: DESeq2 adds very low pseudocounts because the same clusters might have (and indeed, do) much higher counts in other samples (such as stateB) and thus can be properly compared (stateA vs stateB, stateC vs stateB), avoiding infinite fold changes.

However, how should I best deal with those specific comparisons where all counts are exactly zero?

The result seems counterintuitive to me and difficult to defend: some clusters have zero expression in every sample of a comparison, yet I get different baseMean values and even nonzero fold changes.

I do not think I can simply remove those clusters from the original input, because depending on the comparison in question they can have substantial counts in other samples. Also, I am running DESeq2 on the entire table, as I understand the vignette recommends (https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups).

deseq2 • 984 views

ADD COMMENT • link updated 6.8 years ago by Michael Love 41k • written 6.8 years ago by anton.kratz ▴ 60

score 3 · Accepted Answer · 2017-07-19

Michael Love 41k

@mikelove

United States

The base mean is across all samples, so it makes sense it is not zero. Are you using lfcShrink? This will give you an LFC of ~0 for these.
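
To make the first point concrete, here is a minimal arithmetic sketch using the example counts from the question. It is not DESeq2 code. It assumes all size factors equal 1 (so normalized counts equal raw counts) and uses only the three states shown in the example rather than all 12. The names `counts`, `base_mean`, and `pair_mean` are illustrative only.

```python
# Raw counts for one cluster, copied from the example in the question.
counts = {
    "stateA": [0, 0, 0],
    "stateB": [46, 34, 67],
    "stateC": [0, 0, 0],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption: size factors are all 1, so normalized counts == raw counts.
# baseMean is the mean of normalized counts over ALL samples in the dataset,
# not only over the two groups being contrasted.
all_samples = [c for reps in counts.values() for c in reps]
base_mean = mean(all_samples)

# Mean restricted to the six samples in the stateA vs stateC comparison.
pair_mean = mean(counts["stateA"] + counts["stateC"])

print(f"mean over all 9 samples (baseMean-like): {base_mean:.2f}")
print(f"mean over stateA and stateC only:        {pair_mean:.2f}")
```

The stateB replicates alone pull the overall mean away from zero, which is why the reported baseMean is nonzero even when both groups in the contrast are all zeros.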

ADD COMMENT • link 6.8 years ago Michael Love 41k

Thank you. I was not using lfcShrink, but now I am, and this fixed my issue. I am on DESeq2 v1.16.1 and was simply not aware that lfcShrink is no longer called implicitly; I now call it explicitly.

ADD REPLY • link 6.8 years ago anton.kratz ▴ 60