Question: DESeq2 incorrect basemean
0
21 months ago by
asd0
asd0 wrote:

I have four genes in four rows in a matrix. The results function of the DESeq2 calculated the baseMean of the four row as

37.6, 2071.5, 1.0, 491.6

But if I calculate the means with rowMeans(counts(dds,normalized=TRUE)), then I get

45.2, 2679.8, 1.0, 491.6

Why is this happen?

And when I calculate the base mean per level with

baseMeanPerLvl <- sapply( levels(dds$condition), function(lvl) rowMeans( counts(dds,normalized=TRUE)[,dds$condition == lvl] ) )

which was posted on this forum by Michael Love, and I calculate the log2 fold change, then I get a different value then it is calculated by the results function.

deseq2 mean • 747 views
modified 21 months ago by Michael Love21k • written 21 months ago by asd0
3
21 months ago by
Michael Love21k
United States
Michael Love21k wrote:

The baseMean can be different than the mean of the normalized counts accessed by counts(dds) if extreme count outliers were replaced by DESeq(). The function would have printed a message about this to the console. The counts(dds) are the original counts, but the baseMean is over the counts with replaced outliers, see the Details in ?DESeq.

The baseMeanPerLvl is just that: mean of normalized counts *per level*, so it's a vector of numbers. The baseMean is a single number, the mean of normalized counts in the dds object.

Thank you for your answer. How can I calculate the base mean per level with replaced extreme count outliers?
1

In ?DESeq we say:

"the replacement counts used for fitting are kept in assays(dds)[["replaceCounts"]]"

So you would use the same code as above but with this matrix instead:

normMat <- t(t(assays(dds)[["replaceCounts"]])/sizeFactors(dds))
baseMeanPerLvl <- sapply(levels(dds$condition), function(lvl) rowMeans(normMat[,dds$condition == lvl]))

Why does the summary show that I have no outlier?

out of 4 with nonzero total read count
LFC > 0 (up)     : 0, 0%
LFC < 0 (down)   : 3, 75%
outliers [1]     : 0, 0%
low counts [2]   : 0, 0%
(mean count < 1)
1

DESeq() if appropriate replaces outliers, and so in the final results table, there are no outliers which would lead to filtering of genes. They are flagged in other ways, documented in the man pages.