Question: Analyzing microbiome data (many zero-counts) using DESeq2
3.0 years ago, pjgalli20 wrote:

I am trying to analyze microbiome data, which is count data with many zero-counts.  I receive this error when I run the DESeq function in DESeq2:

estimating size factors
Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :
  every gene contains at least one zero, cannot compute log geometric means

Is there a way to analyze this data without filtering out the features with very low counts?

modified 3.0 years ago by Michael Love • written 3.0 years ago by pjgalli20
Answer: Analyzing microbiome data (many zero-counts) using DESeq2
3.0 years ago, Michael Love (United States) wrote:

You can calculate an alternative size factor vector sf yourself and supply this like so, before calling DESeq():

stopifnot(all(sf > 0))
sf <- sf / exp(mean(log(sf)))
sizeFactors(dds) <- sf

The second line divides sf by its geometric mean, so that the size factors are centered around 1 (and the mean of normalized counts is similar in scale to the mean of raw counts).

You could define your own size factor vector using some external software, or you could take e.g. an upper quantile of the columns of the count matrix.
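
As a minimal sketch of the upper quantile idea, in base R only: the toy matrix `counts`, the choice of the 75% quantile, and dropping zeros before taking the quantile are all assumptions made for illustration, not a recommendation for microbiome data.

```r
# Toy count matrix (hypothetical data): rows are features (OTUs), columns are samples.
# Every row contains at least one zero, as in the question.
counts <- matrix(
  c( 0, 12,  5,  0,
     3,  0,  8, 20,
    10, 40,  0,  6,
     0,  7, 15,  0,
    25,  0, 30, 50),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("otu", 1:5), paste0("s", 1:4))
)

# Upper quartile of the non-zero counts in each sample (column).
# Assumption: zeros are dropped first, so a very sparse sample
# does not end up with a quantile of 0.
sf <- apply(counts, 2, function(x) quantile(x[x > 0], probs = 0.75))

# Same two steps as above: check positivity, then center around 1.
stopifnot(all(sf > 0))
sf <- sf / exp(mean(log(sf)))

# With a DESeqDataSet `dds` built from `counts`, you would then do:
# sizeFactors(dds) <- sf
```

After this, sf has one positive value per sample and a geometric mean of 1, so it can be assigned with sizeFactors(dds) <- sf before calling DESeq().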

Another option is to use type="iterate" in estimateSizeFactors(), which is an alternate size factor estimator we developed that does not require that there be rows without a zero.

I don't have any advice as to what is the best size factor estimator for microbiome data though. You could try the estimator implemented in metagenomeSeq:

http://www.cbcb.umd.edu/software/metagenomeSeq

Answer: Analyzing microbiome data (many zero-counts) using DESeq2
3.0 years ago, Joseph Nathaniel Paulson (United States) wrote:

What this error means is that there is no OTU (gene) with at least 1 count in every sample, so the log geometric mean cannot be computed for any row. Mike might have a workaround for DESeq2.
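
To see why this happens, here is a small base R sketch of the check that the error message describes. The toy matrix `counts` is made up, and the code only imitates the log geometric mean step; it is not the DESeq2 source.

```r
# Toy count matrix (hypothetical data): rows are OTUs, columns are samples.
# Each row has at least one zero.
counts <- matrix(
  c( 0, 12,  5,  0,
     3,  0,  8, 20,
    10, 40,  0,  6,
     0,  7, 15,  0,
    25,  0, 30, 50),
  nrow = 5, byrow = TRUE
)

# Log geometric mean of each row across samples.
# log(0) is -Inf, so any row containing a zero gets -Inf.
log_geo_means <- rowMeans(log(counts))

# Rows with a finite log geometric mean, i.e. rows with no zero in any sample.
usable <- is.finite(log_geo_means)
n_usable <- sum(usable)
print(n_usable)
```

When n_usable is 0, as it is here, there are no rows left to compute the reference from, which matches the situation reported in the error.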

I was meanwhile suggesting metagenomeSeq also :)