I have 12 different biological states in triplicate and run all pairwise differential expression tests between these states with DESeq2.
For some pairs of states, specific clusters (genes) have exactly zero counts in all three replicates of both states in that particular comparison.
For example, in raw counts it could look like this for a particular cluster:
The clusters for which this is true in any given comparison have differing baseMean values and can even have (very low) fold changes.
When the result is rendered as an MA plot, these clusters show up as horizontal "streaks" in the region of very low baseMean. In the example, when comparing stateA vs stateC, I might get a log2FoldChange of something like 0.3.
I think I understand what is going on here: DESeq2 adds very low pseudocounts because the same clusters might have (and indeed, do) much higher counts in other samples (such as stateB) and thus can be properly compared (stateA vs stateB, stateC vs stateB), avoiding infinite fold changes.
However, I am wondering how best to deal with this in those specific comparisons where all counts are exactly zero.
The result seems counterintuitive to me and difficult to defend: I have clusters with zero expression in both states of a comparison, yet I get different baseMean values and even nonzero fold changes.
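To make the baseMean part concrete, here is a minimal Python sketch with made-up numbers (not my real data, and not DESeq2 code). It assumes that the reported baseMean is the mean of normalized counts over all samples in the dataset rather than only the two states being compared, and it treats all size factors as 1 for simplicity. Only three of the twelve states are shown.

```python
# Hypothetical normalized counts for one cluster (illustrative values only;
# size factors are assumed to be 1, so raw counts equal normalized counts).
counts = {
    "stateA": [0, 0, 0],
    "stateB": [12, 9, 15],
    "stateC": [0, 0, 0],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Mean over every sample in the table, which is how I understand baseMean.
all_values = [v for replicates in counts.values() for v in replicates]
base_mean_all = mean(all_values)

# Mean over only the six samples in the stateA vs stateC comparison.
pair_values = counts["stateA"] + counts["stateC"]
base_mean_pair = mean(pair_values)

print(base_mean_all)   # 4.0
print(base_mean_pair)  # 0.0
```

If that assumption holds, it would explain why the baseMean differs between clusters even though the six samples in the comparison are all zero. It does not by itself explain the nonzero fold changes.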
I do not think I can simply remove those clusters from the original input because, depending on the comparison in question, they can have substantial counts in other samples. Also, I am running DESeq2 on the entire table, which is how I understand the recommendation in the vignette (https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups).