Analyzing microbiome data (many zero-counts) using DESeq2
pjgalli2 • 0
@pjgalli2-11068
Last seen 5.1 years ago

I am trying to analyze microbiome data, which is count data with many zero-counts.  I receive this error when I run the DESeq function in DESeq2:

estimating size factors
Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :
  every gene contains at least one zero, cannot compute log geometric means

Is there a way to analyze this data without filtering out the features with very low counts?

DESeq2 microbiome zero-inflated • 1.1k views
@mikelove
Last seen 6 hours ago
United States

You can calculate an alternative size factor vector sf yourself and supply this like so, before calling DESeq():

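stopifnot(all(sf > 0))          # size factors must be strictly positive
sf <- sf / exp(mean(log(sf)))   # divide by the geometric mean
sizeFactors(dds) <- sf          # store them on the DESeqDataSet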

The second line divides the size factors by their geometric mean so that they are centered around 1 (this keeps the mean of normalized counts on a similar scale to the mean of raw counts).

You could define your own size factor vector using some external software, or you could take e.g. an upper quantile of the columns of the count matrix.
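As a minimal sketch of the upper-quantile idea, the base R code below takes the 75th percentile of the nonzero counts in each sample as a size factor and then centers the result. The toy matrix, the choice of the 75th percentile, and the decision to ignore zeros are all assumptions made for illustration, not a recommendation for microbiome data.

```r
# Toy count matrix (hypothetical): rows are features (OTUs), columns are samples.
# Every row contains at least one zero, as in the question.
counts <- rbind(
  otu1 = c(0, 12, 30, 5),
  otu2 = c(8, 0, 14, 2),
  otu3 = c(25, 40, 0, 9),
  otu4 = c(3, 7, 11, 0),
  otu5 = c(0, 20, 50, 10)
)
colnames(counts) <- paste0("sample", 1:4)

# Upper quartile of the nonzero counts in each sample (column).
# Assumption: zeros are dropped so that a sparse sample cannot get a size factor of 0.
sf <- apply(counts, 2, function(x) quantile(x[x > 0], probs = 0.75))

stopifnot(all(sf > 0))
sf <- sf / exp(mean(log(sf)))  # center so the geometric mean is 1

print(sf)
```

The resulting vector could then be assigned with sizeFactors(dds) <- sf before calling DESeq().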

Another option is to use type="iterate" in estimateSizeFactors(), which is an alternate size factor estimator we developed that does not require that there be rows without a zero.
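A rough sketch of how that call might look on simulated data is given below. The count matrix, the sample names, and the condition factor are all made up for illustration. The call is wrapped in tryCatch() as a precaution, in case the estimator fails on a particular data set.

```r
set.seed(42)
n_features <- 200
n_samples <- 8

# Simulated counts (hypothetical data): rows are features, columns are samples.
counts <- matrix(
  rnbinom(n_features * n_samples, mu = 50, size = 1),
  nrow = n_features,
  dimnames = list(paste0("otu", seq_len(n_features)),
                  paste0("sample", seq_len(n_samples)))
)

# Plant one zero in every row so that no feature is zero-free.
zero_col <- sample(n_samples, n_features, replace = TRUE)
counts[cbind(seq_len(n_features), zero_col)] <- 0L

if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))
  coldata <- data.frame(
    condition = factor(rep(c("A", "B"), each = n_samples / 2)),
    row.names = colnames(counts)
  )
  dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                                design = ~ condition)
  # The default estimator would stop with the error from the question;
  # type = "iterate" does not need a zero-free row.
  dds <- tryCatch(
    estimateSizeFactors(dds, type = "iterate"),
    error = function(e) {
      message("iterate estimator failed: ", conditionMessage(e))
      dds
    }
  )
  print(sizeFactors(dds))
}
```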

I don't have any advice as to what is the best size factor estimator for microbiome data though. You could try the estimator implemented in metagenomeSeq:

http://www.cbcb.umd.edu/software/metagenomeSeq

@joseph-nathaniel-paulson-6442
Last seen 4.4 years ago
United States

What this error means is that there are no genes (here, OTUs) with at least 1 count in every sample. Mike might have a workaround for DESeq.
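To check whether a count matrix is in this situation before running DESeq2, something like the following base R sketch could be used. The toy matrix is hypothetical, and the log geometric mean computation only approximates what the default estimator does internally.

```r
# Toy count matrix (hypothetical): rows are OTUs, columns are samples.
counts <- rbind(
  otu1 = c(0, 4, 9),
  otu2 = c(6, 0, 2),
  otu3 = c(1, 3, 0)
)
colnames(counts) <- paste0("sample", 1:3)

# Number of features with no zero in any sample.
n_zero_free <- sum(rowSums(counts == 0) == 0)

# The per-feature log geometric mean is -Inf as soon as one count is zero,
# because log(0) is -Inf.
log_geo_means <- rowMeans(log(counts))

print(n_zero_free)
print(all(is.infinite(log_geo_means)))
```

If n_zero_free is 0, no feature has a finite log geometric mean, so the default median-of-ratios size factors cannot be computed.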


I was meanwhile suggesting metagenomeSeq also :)