Analyzing microbiome data (many zero-counts) using DESeq2
pjgalli2 • 0

I am trying to analyze microbiome data, which is count data with many zero-counts.  I receive this error when I run the DESeq function in DESeq2:

estimating size factors
Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :
  every gene contains at least one zero, cannot compute log geometric means

Is there a way to analyze this data without filtering out the features with very low counts?


DESeq2 microbiome zero-inflated

You can calculate an alternative size factor vector sf yourself and supply this like so, before calling DESeq():

stopifnot(all(sf > 0))
sf <- sf / exp(mean(log(sf)))
sizeFactors(dds) <- sf

The second line divides the size factors by their geometric mean, so that they are centered around 1 (and the mean of normalized counts is on a similar scale to the mean of raw counts).

You could define your own size factor vector using some external software, or you could take e.g. an upper quantile of the columns of the count matrix.
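As a minimal sketch of the upper-quantile idea in base R: the toy matrix below is invented, and taking the 75th percentile of the nonzero counts in each column is an assumption made for illustration (with many zeros, a quantile over all counts can itself be zero), not a recommendation for microbiome data.

```r
# Toy count matrix (made up): 6 features x 4 samples, every row has a zero.
counts <- matrix(
  c( 0, 12,  5, 30,
     8,  0, 20, 14,
     3,  7,  0,  9,
    40, 25, 60,  0,
     0,  0, 11,  6,
    15,  9,  0, 22),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:6), paste0("S", 1:4))
)

# Assumed choice: upper quartile of the nonzero counts in each column.
sf <- apply(counts, 2, function(x) quantile(x[x > 0], probs = 0.75))

# Same checks and centering as in the snippet above.
stopifnot(all(sf > 0))
sf <- sf / exp(mean(log(sf)))  # geometric mean of sf is now 1

# With a DESeqDataSet `dds` built from `counts`, one would then set:
# sizeFactors(dds) <- sf
```

The resulting vector has one positive entry per sample, which is what sizeFactors(dds) <- sf expects.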

Another option is to use type="iterate" in estimateSizeFactors(), which is an alternate size factor estimator we developed that does not require that there be rows without a zero. 
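A rough sketch of how that call might look on simulated data. The count matrix, the condition labels and the variable name sf_iter are all made up for illustration, and the block does nothing if DESeq2 is not installed. On small toy data the iterative estimator may be slow or warn about convergence.

```r
sf_iter <- NULL  # stays NULL if DESeq2 is not available

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)

  # Simulated counts: 500 features x 6 samples (hypothetical data).
  counts <- matrix(rnbinom(500 * 6, mu = 50, size = 2), ncol = 6)

  # Force one zero into every row, so the default estimator would fail.
  zero_col <- sample(ncol(counts), nrow(counts), replace = TRUE)
  counts[cbind(seq_len(nrow(counts)), zero_col)] <- 0L

  colData <- data.frame(condition = factor(rep(c("A", "B"), each = 3)))
  dds <- DESeqDataSetFromMatrix(countData = counts,
                                colData = colData,
                                design = ~ condition)

  # Iterative size factor estimator; does not need a zero-free row.
  dds <- estimateSizeFactors(dds, type = "iterate")
  sf_iter <- sizeFactors(dds)
}
```

After this, DESeq(dds) should reuse the size factors already stored in dds rather than re-estimating them.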

I don't have any advice on which size factor estimator is best for microbiome data, though. You could try the estimator implemented in metagenomeSeq.


What this error means is that no gene (here, no OTU) has at least 1 count in every sample, so every row of the count matrix contains a zero. Mike might have a workaround for DESeq2.
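To illustrate where the message comes from, here is a small base R sketch (not DESeq2's own code, and with an invented matrix): the log geometric mean of a row is the mean of its logged counts, and a single zero makes it -Inf.

```r
# Toy matrix (made up): 3 features x 3 samples, every row has a zero.
toy <- matrix(c(0, 4, 7,
                3, 0, 9,
                5, 2, 0),
              nrow = 3, byrow = TRUE)

# log(0) is -Inf, so any row containing a zero gets a -Inf mean.
log_geo_means <- rowMeans(log(toy))
all_rows_unusable <- all(is.infinite(log_geo_means))

# Adding one feature that is nonzero in every sample gives a usable row.
toy2 <- rbind(toy, c(6, 3, 8))
n_usable <- sum(is.finite(rowMeans(log(toy2))))
```

When all_rows_unusable is TRUE there is no row left to base a median-of-ratios size factor on, which is the situation the error reports.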

I was meanwhile suggesting metagenomeSeq also :)
