Search
Question: DESeq2 incorrect basemean
0
18 months ago by
asd0
asd0 wrote:

I have four genes in four rows in a matrix. The results function of the DESeq2 calculated the baseMean of the four row as

37.6, 2071.5, 1.0, 491.6

But if I calculate the means with rowMeans(counts(dds,normalized=TRUE)), then I get

45.2, 2679.8, 1.0, 491.6

Why is this happen?

And when I calculate the base mean per level with

baseMeanPerLvl <- sapply( levels(dds$condition), function(lvl) rowMeans( counts(dds,normalized=TRUE)[,dds$condition == lvl] ) )

which was posted on this forum by Michael Love, and I calculate the log2 fold change, then I get a different value then it is calculated by the results function.

ADD COMMENTlink
modified 18 months ago by Michael Love19k • written 18 months ago by asd0
3
18 months ago by
Michael Love19k
United States
Michael Love19k wrote:

The baseMean can be different than the mean of the normalized counts accessed by counts(dds) if extreme count outliers were replaced by DESeq(). The function would have printed a message about this to the console. The counts(dds) are the original counts, but the baseMean is over the counts with replaced outliers, see the Details in ?DESeq.

The baseMeanPerLvl is just that: mean of normalized counts *per level*, so it's a vector of numbers. The baseMean is a single number, the mean of normalized counts in the dds object.

ADD COMMENTlink written 18 months ago by Michael Love19k
Thank you for your answer. How can I calculate the base mean per level with replaced extreme count outliers?
ADD REPLYlink written 18 months ago by asd0
1

In ?DESeq we say:

"the replacement counts used for fitting are kept in assays(dds)[["replaceCounts"]]"

So you would use the same code as above but with this matrix instead:

normMat <- t(t(assays(dds)[["replaceCounts"]])/sizeFactors(dds))
baseMeanPerLvl <- sapply(levels(dds$condition), function(lvl) rowMeans(normMat[,dds$condition == lvl]))
ADD REPLYlink written 18 months ago by Michael Love19k

Why does the summary show that I have no outlier?

out of 4 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 0, 0%
LFC < 0 (down)   : 3, 75%
outliers [1]     : 0, 0%
low counts [2]   : 0, 0%
(mean count < 1)
ADD REPLYlink written 18 months ago by asd0
1

DESeq() if appropriate replaces outliers, and so in the final results table, there are no outliers which would lead to filtering of genes. They are flagged in other ways, documented in the man pages.

ADD REPLYlink written 18 months ago by Michael Love19k
Please log in to add an answer.

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 202 users visited in the last hour