Question: DESeq2 Error in estimateSizeFactorsForMatrix
5.0 years ago, Guest User wrote:
Hello Michael.

I am a graduate student neuroscience researcher attempting to use the DESeq2 package to perform differential expression analysis of my sequencing data. I am following the beginner vignette but substituting my own data into the code. I managed to get to the part where it tells me to call the DESeq function, but I received the following error:

> dds <- DESeq(ddsFull)
estimating size factors
Error in estimateSizeFactorsForMatrix(counts(object), locfunc, geoMeans = geoMeans) :
  every gene contains at least one zero, cannot compute log geometric means

My data was generated from FASTQ files from the sequencer, which I quality/adapter trimmed, and then aligned to our reference genome using the programs STAR and Bowtie2. The unmapped reads from the STAR program were subsequently run through Bowtie2 and the SAM file outputs from both alignment programs were combined using Picard-Tools MergeSAM. The merged SAM files were then converted to BAM files and I began the DESeq2 beginner tutorial.

Could you please help me or direct me to a source where I might find a solution to my error problem? A "Google search" on the error did not return useful results. Thank you very much.

Best,
Caleb Bostwick

-- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.300.0 Rcpp_0.11.1 GenomicAlignments_1.0.1 BSgenome_1.32.0 Rsamtools_1.16.0 Biostrings_2.32.0 XVector_0.4.0
[9] GenomicFeatures_1.16.1 AnnotationDbi_1.26.0 Biobase_2.24.0 GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.7 BiocGenerics_0.10.0 BiocInstaller_1.14.2

loaded via a namespace (and not attached):
[1] annotate_1.42.0 BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.1 biomaRt_2.20.0 bitops_1.0-6 brew_1.0-6 codetools_0.2-8 DBI_0.2-7 digest_0.6.4
[11] fail_1.2 foreach_1.4.2 genefilter_1.46.1 geneplotter_1.42.0 grid_3.1.0 iterators_1.0.7 lattice_0.20-29 locfit_1.5-9.1 plyr_1.8.1 RColorBrewer_1.0-5
[21] RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.24.1 sendmailR_1.1-2 splines_3.1.0 stats4_3.1.0 stringr_0.6.2 survival_2.37-7 tools_3.1.0 XML_3.98-1.1
[31] xtable_1.7-3 zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.
Tags: sequencing, alignment, deseq, deseq2
Answer: DESeq2 Error in estimateSizeFactorsForMatrix
5.0 years ago, Simon Anders (Zentrum für Molekularbiologie, Universität Heidelberg) wrote:
Dear Caleb,

On 30/05/14 16:59, Caleb Bostwick [guest] wrote:
> Could you please help me or direct me to a source where I might find
> a solution to my error problem? A "Google search" on the error did
> not return useful results. Thank you very much.

For starters, check whether the claim of the error message is actually true:

> Error in estimateSizeFactorsForMatrix(counts(object), locfunc, geoMeans = geoMeans) :
> every gene contains at least one zero, cannot compute log geometric means

Does every gene contain a zero in at least one of the samples? If so, how come?

Simon
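
One way to run that check is sketched below on a small made-up matrix. The object `cts` is a hypothetical stand-in; with real data it would presumably be `counts(ddsFull)`.

```r
# Toy count matrix standing in for counts(ddsFull): 4 genes x 3 samples.
# (Hypothetical data, for illustration only.)
cts <- matrix(c(0, 5, 3,
                2, 0, 7,
                4, 1, 0,
                0, 0, 9),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Number of zero counts per gene (row).
zeros_per_gene <- rowSums(cts == 0)

# TRUE if every gene has a zero in at least one sample,
# which is the condition the error message reports.
every_gene_has_zero <- all(zeros_per_gene > 0)

# Number of genes with no zeros at all.
n_zero_free <- sum(zeros_per_gene == 0)
```

If `every_gene_has_zero` comes out TRUE on the real matrix, the error message is telling the truth, and the next question is why no gene is detected in all samples.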
Answer: DESeq2 Error in estimateSizeFactorsForMatrix
4.0 years ago, gaelgarcia (UK) wrote:

I have come across this problem as well. However, I don't understand why this is a problem... I have 96 samples, and each one of the 25,000 genes I estimated counts for is 0 in at least one of those samples. I don't see why this would cause the dispersion estimate to fail?

Thanks for your help.


Because there is this snippet of code that needs to be run in the course of size factor estimation:

loggeomeans <- rowMeans(log(counts))

And if you have a count matrix where every gene has a 0 in at least one sample:

counts <- matrix(1:100, 10)
diag(counts) <- 0
loggeomeans <- rowMeans(log(counts))

Then every entry for loggeomeans will be infinite:

loggeomeans
[1] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf

... and you're hosed.
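
To see why the infinite values are fatal, here is a simplified base-R sketch of a median-of-ratios size factor estimate. It is written for illustration and is not DESeq2's actual source; the function name `median_ratio_sf` is made up. The idea is that each sample's size factor is the median of its counts relative to the per-gene geometric means, taken over genes whose log geometric mean is finite. If no such gene exists, there is nothing to take a median over.

```r
# Simplified median-of-ratios size factors (illustrative, not DESeq2's code).
median_ratio_sf <- function(counts) {
  loggeomeans <- rowMeans(log(counts))   # -Inf for any gene with a zero
  usable <- is.finite(loggeomeans)       # genes with no zero in any sample
  if (!any(usable)) {
    stop("every gene contains at least one zero, cannot compute log geometric means")
  }
  apply(counts, 2, function(cnts) {
    # Median log ratio over usable genes, back-transformed to the count scale.
    exp(median((log(cnts) - loggeomeans)[usable & cnts > 0]))
  })
}

# The example matrix from above: the zeros on the diagonal hit every gene.
counts <- matrix(1:100, 10)
diag(counts) <- 0
res <- tryCatch(median_ratio_sf(counts), error = function(e) conditionMessage(e))

# Restoring a single zero-free gene is enough for the estimate to go through.
counts[1, 1] <- 1
sf <- median_ratio_sf(counts)
```

Here `res` holds the error message, while `sf` holds one size factor per sample. A single usable gene makes for a poor estimate, but it shows where the hard failure comes from.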

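If you just want to do "something" and move on, you could supply a custom set of size factors instead of using DESeq2's default. A reasonable alternative is edgeR's TMM method. Note that calcNormFactors returns factors that rescale the library sizes rather than size factors as such, so multiply them by the library sizes first, i.e.:

library(edgeR)  # provides calcNormFactors
sizeFactors(dds) <- calcNormFactors(counts(dds)) * colSums(counts(dds))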
Reply written 4.0 years ago by Steve Lianoglou
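
Another route, suggested by the `geoMeans` argument visible in the error message, is to supply per-gene geometric means that tolerate zeros. The sketch below is a base-R illustration under stated assumptions, not something proposed in the thread: `gm_pos` and `toy` are hypothetical names, and averaging the logs of the positive counts only is one convention among several. With a real object the resulting vector would be passed along the lines of `estimateSizeFactors(dds, geoMeans = geo_means)`, assuming the installed DESeq2 version accepts that argument.

```r
# exp( (sum of logs of the positive entries) / (number of all entries) ).
# Hypothetical helper: zeros are skipped in the sum but still counted in the
# denominator, so genes with many zeros get a smaller geometric mean.
gm_pos <- function(x) {
  if (all(x == 0)) return(0)
  exp(sum(log(x[x > 0])) / length(x))
}

# Same kind of matrix as in the reply above: every gene has one zero.
toy <- matrix(1:100, 10)
diag(toy) <- 0

geo_means <- apply(toy, 1, gm_pos)   # finite and positive for every gene here

# Median-of-ratios size factors using the custom geometric means,
# skipping zero counts within each sample.
size_factors <- apply(toy, 2, function(cnts) {
  keep <- cnts > 0 & geo_means > 0
  exp(median(log(cnts[keep]) - log(geo_means[keep])))
})
```

Whether this is appropriate depends on why the zeros are there. If they reflect a problem upstream (for example in counting or annotation), fixing that is likely the better use of time.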