Search
Question: How does DESeq2 handle zero counts in one condition?
0
gravatar for gavrielmatt
2.9 years ago by
United States
gavrielmatt0 wrote:

I asked this question on SeqAnswers.

Does anyone know how DESeq2 handles genes that have zero counts in one condition and >0 counts in another?

My results output shows that these genes have a positive log2 fold-change value, but I do not understand how DESeq2 arrives at this number if it is taking the log of a ratio, in which the numerator is divided by zero.

Count data (letters are conditions; numbers are replicates):
 

  A1 A2 B1 B2
Gene 1 0 0 692 1350

 

> dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ condition)
> dds$condition <- factor(dds$condition, levels=c("A","B"))
> dds <- DESeq(dds)
> res<-results(dds,independentFiltering = F)

 

Results table:

 

  baseMean  log2FoldChange  lfcSE        stat  pvalue  padj
Gene1 585.1394 11.78521 1.458913 8.078076 6.579644e-16 3.884571e-15


I know that DESeq1 gave an 'Inf' value in these cases, but how does DESeq2 arrive at a real number value? 

Thanks

ADD COMMENTlink modified 2.9 years ago • written 2.9 years ago by gavrielmatt0
2
gravatar for Michael Love
2.9 years ago by
Michael Love14k
United States
Michael Love14k wrote:

While the maximum likelihood estimate (MLE) of DESeq goes to Inf, the use of a prior distribution on LFCs (log fold changes) in DESeq2 gives us a finite estimate. The way to interpret this is that: zeros might indicate absolute no fragments in samples of A, or more likely that the expected counts of fragments is some positive value below 1. If we were to increase the sequencing depth by 10 or 100, etc., we might start to observe some fragments in A. The prior distribution for LFCs is estimated by looking at the distribution of MLE fold changes observed, including other genes where the sequencing depth is higher, and using this range to give a finite estimate here. (See our paper for full details http://genomebiology.com/2014/15/12/550/abstract .) So the estimate here depends on: the dispersion for this gene, how large the counts are for B, and the distribution of log fold changes for other genes which had finite MLE LFCs.

ADD COMMENTlink modified 2.9 years ago • written 2.9 years ago by Michael Love14k

"some positive value below 0" You probably meant "above 0" :)

ADD REPLYlink written 2.9 years ago by Peter Langfelder1.3k

yes. below 1. those positive values below 0 are hard to find.

ADD REPLYlink written 2.9 years ago by Michael Love14k

does the same hold when using betaprior=FALSE? thanks

ADD REPLYlink modified 18 months ago • written 18 months ago by deut1350

When betaPrior=FALSE, the MLE LFC goes to infinity, but in practice we don't wait for it to reach infinity, but we stop at a large value, a fold change of exp(30). 

ADD REPLYlink written 18 months ago by Michael Love14k

genes with for example all zeros in control condition (the denominator), don't get the same MLE FC, so I guess this depends also on the values of the numerator?

 

ADD REPLYlink written 18 months ago by deut1350

It's a stopping rule, I don't then set the values to have a natural log fold change of 30. Also, things are limited from going to Infinity because I have a bound on the expected value for mu, which acts similar to a pseudocount (but it's not a pseudocount approach).

ADD REPLYlink written 18 months ago by Michael Love14k
0
gravatar for gavrielmatt
2.9 years ago by
United States
gavrielmatt0 wrote:

Thank you for the explanation Michael.

The link you posted is dead, but I assume you mean this paper?

Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. doi:10.1186/s13059-014-0550-8

ADD COMMENTlink written 2.9 years ago by gavrielmatt0
0
gravatar for gavrielmatt
2.9 years ago by
United States
gavrielmatt0 wrote:

Ah. It's because you included the period and parantheses in the hyperlink.

ADD COMMENTlink written 2.9 years ago by gavrielmatt0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 113 users visited in the last hour