Search
Question: DESeq2 1.16.1 : Zero count values re-occurrence of issue 56191?
0
gravatar for greg.deakin
5 months ago by
greg.deakin0 wrote:

Hello - I have an issue with genes (or OTUs in my case) which have zero expression across a condition.

Seems to be the same issue as

A: DESeq2; FoldChange values with only 0's

Works as expected (well as I'd expect) in DESeq2 1.10.1

Below are a few of the details

model = ~ Block + Treatment

contrast = c("Treatment","D","Control")

Treatment is a factor with 5 levels (Block has multiple levels, but I suspect that's irrelevant anyway)

Example of dodgy "gene"

aggregate(t(counts(dds["XXX"],normalize=T)),list(dds$Treatment),sum)

       Group.1   XXX
1      Control   0.0000
2       A           101.9302
3       B           130.5273
4       C           240.4031
5       D            0.0000

 

DESeq 1.16.1 (R 3.4) 

res["XXX",]

log2 fold change (MAP): Treatment D vs Control
Wald test p-value: Treatment D vs Control
DataFrame with 1 row and 6 columns

 

     baseMean log2FoldChange     lfcSE      stat       pvalue         padj
       <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
XXX 3.152404       -20.7867  4.592323 -4.526401 5.999657e-06 0.0003431804

 

I get the same results if I set betaPrior to True

-------------------------------------------------------------------------

DESeq2 1.10.1 (R 3.2)

res["XXX",]

log2 fold change (MAP): Treatment D vs Control
Wald test p-value: Treatment ND vs Control
DataFrame with 1 row and 6 columns

        baseMean log2FoldChange     lfcSE      stat    pvalue      padj
       <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
XXX  3.152404              0    1.3954         0         1         1

 

 

ADD COMMENTlink modified 5 months ago by Michael Love18k • written 5 months ago by greg.deakin0
2
gravatar for Michael Love
5 months ago by
Michael Love18k
United States
Michael Love18k wrote:

hi Greg,

I have this change documented in the NEWS file:

CHANGES IN VERSION 1.15.28
--------------------------

    o Remove some code that would "zero out" LFCs
      when both groups involved in a contrast had zero counts.
      This lead to inconsistency when similarly contrasts
      were performed by refactoring.

It presented more of a problem than it was worth, so I took out this code. The problems that we do see, occur when the data isn't well approximated by a NB, often it's not RNA-seq, and the Wald is more sensitive to this non-NB distribution than the LRT. I would recommend to use likelihood ratio tests, where you can provide a full matrix which is:

full <- model.matrix(design(dds), colData(dds))

And reduced can be constructed by removing a column (e.g. the D vs control difference).

You can do:

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

Then:

dds <- nbinomLRT(dds, full=full, reduced=reduced)
res <- results(dds)

I've seen this to provide robustness to e.g. large count outliers that are inconsistent with the NB model, and give expected results.

ADD COMMENTlink written 5 months ago by Michael Love18k

Ah, o.k. thanks Mike.

I guess it's pretty much an edge case, especially in RNA-seq, and not too difficult for me to correct for.

I'll have a look at using LRT. As you guess my data (mostly soil microbiom) might violate a few of the expectations of the DESeq model - it's a bit of an open question how to deal with data consisting mostly of zeros. Having said that, we've had pretty good results using the DESeq Wald test previously, but I can't remember having more than a two factor condition before.

Thanks for your help.

 

ADD REPLYlink written 5 months ago by greg.deakin0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 149 users visited in the last hour