DESeq2 size factors change with fixed geometric means
Megatron • 0
@megatron-15960

RNA-Seq count size factors are defined in formula 5 of Anders & Huber (2010).

With pre-specified geometric means, are the size factors for identical samples supposed to be the same regardless of which other samples are in the count matrix?

That is, whether I calculate the size factor for a single sample on its own or extract the size factor for that sample from a larger dataset, shouldn't the two be identical if the geometric means are fixed?

For example:

library(DESeq2)

set.seed(353567)

ddsRaw <- makeExampleDESeqDataSet(n=1000, m=40)
gm <- exp(rowMeans(log(counts(ddsRaw))))

dds <- estimateSizeFactors(ddsRaw, geoMeans=gm)

ddsSubset <- estimateSizeFactors(ddsRaw[, 10:20], geoMeans=gm)

all.equal(sizeFactors(dds)[10:20], sizeFactors(ddsSubset))  # Size factors are not equal

The code below from estimateSizeFactorsForMatrix() appears to be responsible for the dataset-dependent size factors, but I do not understand how it relates to formula 5, because the result no longer depends solely on the reference geometric means.

if (incomingGeoMeans) {
  sf <- sf/exp(mean(log(sf)))
}
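To make the relation to formula 5 explicit, here is a sketch of how I read the snippet above. The symbols $g_i$ (the supplied geoMeans), $D$ (the set of samples in the dataset being normalized), and the superscript "raw" are notation introduced here for illustration and are not taken from the paper or the package.

```latex
% Formula 5 of Anders & Huber (2010), with the per-gene geometric mean
% replaced by a fixed reference value g_i:
\hat{s}_j^{\mathrm{raw}} = \operatorname*{median}_{i} \frac{k_{ij}}{g_i}

% The extra step sf <- sf / exp(mean(log(sf))) divides by the geometric
% mean of the raw factors over the samples in the current dataset D:
\hat{s}_j^{(D)} = \frac{\hat{s}_j^{\mathrm{raw}}}{c_D},
\qquad
c_D = \Bigl( \prod_{j' \in D} \hat{s}_{j'}^{\mathrm{raw}} \Bigr)^{1/|D|}

% For two datasets D and D' that both contain sample j, the ratio does
% not depend on j:
\frac{\hat{s}_j^{(D)}}{\hat{s}_j^{(D')}} = \frac{c_{D'}}{c_D}
```

Under this reading, the raw factor depends only on the sample and the reference, and the dataset enters only through the single constant $c_D$.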

Thanks!

@mikelove

The last chunk there just rescales the size factors so that they have a geometric mean of 1 within any particular dataset, regardless of their relation to geoMeans. I implemented it this way intentionally, so that the normalization of a new dataset is identical up to a single scaling factor regardless of which samples it contains. What would your desired behavior be? I don't think it makes sense to have the size factors far from 1.
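The "identical up to a single scaling factor" behavior can be checked with a small base-R sketch. This is not DESeq2's own code: raw_size_factors() and geo_mean() are hypothetical helpers written for illustration, and the simulated counts are arbitrary. It assumes the estimator is a median of ratios against the supplied geometric means followed by the rescaling step quoted in the question.

```r
# Hypothetical helper (not part of DESeq2): median-of-ratios size factors
# against fixed reference geometric means, without any rescaling.
raw_size_factors <- function(cnts, geo_means) {
  log_gm <- log(geo_means)
  apply(cnts, 2, function(col) {
    keep <- is.finite(log_gm) & col > 0   # skip genes with zero reference or zero count
    exp(median(log(col[keep]) - log_gm[keep]))
  })
}

# Hypothetical helper: geometric mean of a positive vector.
geo_mean <- function(x) exp(mean(log(x)))

set.seed(1)
cnts <- matrix(rnbinom(1000 * 40, mu = 100, size = 5), nrow = 1000)
gm <- exp(rowMeans(log(cnts)))            # fixed reference, as in the question

raw_all <- raw_size_factors(cnts, gm)
raw_sub <- raw_size_factors(cnts[, 10:20], gm)

# Rescale to geometric mean 1 within each dataset, as in the quoted snippet.
scaled_all <- raw_all / geo_mean(raw_all)
scaled_sub <- raw_sub / geo_mean(raw_sub)

# Before rescaling the factors agree; after rescaling they differ only by
# one constant shared by all samples in the subset.
print(all.equal(raw_all[10:20], raw_sub))
ratio <- scaled_all[10:20] / scaled_sub
print(range(ratio))
```

If the sketch matches what the package does, the two values printed by range(ratio) should coincide up to floating-point error, which is the single global scaling described above.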


Thanks for the quick reply!

I was under the impression that using an external geometric mean reference meant that size factors become context-independent. So you could normalize a single sample to a reference and get the same size factor.


You do get the same scaling across samples, up to a single global scaling. So the relative scaling between samples is fixed by fixing the geometric means.
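For anyone who needs a context-independent factor for a single sample, one possible workaround is to compute the median of ratios against the reference directly and skip the rescaling. The sketch below is base R only; reference_size_factor() is a hypothetical helper written for illustration, not a DESeq2 function, and the Poisson counts are made up. Note that applying the rescaling step to a one-sample dataset would return exactly 1, since a single value divided by its own geometric mean is 1.

```r
# Hypothetical helper (not part of DESeq2): size factor of one sample
# relative to fixed reference geometric means, with no rescaling.
reference_size_factor <- function(sample_counts, geo_means) {
  keep <- geo_means > 0 & sample_counts > 0   # ignore genes unusable on the log scale
  exp(median(log(sample_counts[keep]) - log(geo_means[keep])))
}

set.seed(2)
reference <- matrix(rpois(500 * 20, lambda = 50), nrow = 500)
gm <- exp(rowMeans(log(reference)))           # reference geometric means

# A new sample sequenced roughly twice as deeply as the reference samples.
new_sample <- rpois(500, lambda = 100)

# Size factor computed for the sample on its own ...
sf_alone <- reference_size_factor(new_sample, gm)

# ... and for the same sample embedded in a larger batch.
batch <- cbind(new_sample, reference[, 1:5])
sf_in_batch <- apply(batch, 2, reference_size_factor, geo_means = gm)

print(sf_alone)
print(sf_in_batch[[1]])
```

Because nothing here depends on the other columns, the two printed values agree. The trade-off is the one raised in the answer: such factors can sit far from 1 when the new samples differ strongly in depth from the reference.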


OK. Would it be possible to make this an option in a future release (i.e., optionally disable the rescaling of size factors to a geometric mean of 1) or to mention it in the documentation? It may not be apparent to users why the size factors for the same samples differ when a fixed reference is supplied.


Sure, I've added this to the documentation:

The size factors will be scaled to have a geometric mean of 1 when supplying geoMeans.


Thanks for the explanation and modification!
