DESeq2 Error in varianceStabilizingTransformation
2
1
Entering edit mode
mfarhi ▴ 10
@mfarhi-6913
Last seen 9.5 years ago
United States

Dear Michael

This is a kind of a follow up of a previous post (https://support.bioconductor.org/p/59899/).

I am trying to use DESeq2 for counts data transformation. My data is coming from mapping to a de-novo assembled transcriptome so we therefore use eXpress for abundance estimation. When I try to run varianceStabilizingTransformation() or DESeq()received the error:

“Error in estimateSizeFactorsForMatrix(counts(object), locfunc, geoMeans = geoMeans) : every gene contains at least one zero, cannot compute log geometric means”

I can understand that a log transformation would be problematic if there are zeros but it makes sense that some genes, in some conditions, have zero counts estimated (i.e. are not expressed). Am I missing something or is there a workaround like the y=log2(y+ 1)?

 

Sincerely,

 

Moran   

 

R code

data <- read.table("matrix", header = T, row.names = 1, com= '')

data <- round(data)

data <- data[rowSums(data) >= 13,]

conditions = data.frame(conditions=factor(c(rep("inter ", 5),

                                                                             rep("late ", 6),

                                                                             rep("early ", 10),

                                                                             rep("flo", 4),

                                                                             rep("se", 4), )))

rownames(conditions) <- colnames(data)

ddsTable <- DESeqDataSetFromMatrix(countData = data,

                                   colData = conditions,

                                   design = ~ conditions)

vsd <- varianceStabilizingTransformation(ddsTable)

 

> as.data.frame(colData(ddsTable))

                                                conditions

X10DP            inter

X10DP           inter

X10DP            inter

X10DP            inter

X10DP            inter

X21DP             late

X21DP             late

X21DP             late

X21DP             late

X21DP             late

X21DP             late

X4DP             early

X4DP             early

X4DP             early

X4DP             early

X4DP             early

X4DP             early

X4DP             early

flo_1           flo

flo_2           flo

flo_3           flo

flo_4           flo

se_1               se

se_2               se

se_3               se

se_4               se

 

> sessionInfo()

R version 3.1.1 (2014-07-10)

Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:

[1] parallel  stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:

 [1] BiocParallel_0.6.1        BiocInstaller_1.14.3      DESeq2_1.4.5            

 [4] RcppArmadillo_0.4.450.1.0 Rcpp_0.11.3               GenomicRanges_1.16.4     

 [7] GenomeInfoDb_1.0.2        IRanges_1.22.10           BiocGenerics_0.10.0     

[10] ggplot2_1.0.0             edgeR_3.6.8               limma_3.20.9            

loaded via a namespace (and not attached):

 [1] annotate_1.42.1      AnnotationDbi_1.26.1 base64enc_0.1-2      BatchJobs_1.4      

 [5] BBmisc_1.7           Biobase_2.24.0       brew_1.0-6           checkmate_1.5.0    

 [9] codetools_0.2-9      colorspace_1.2-4     DBI_0.3.1            digest_0.6.4       

[13] evaluate_0.5.5       fail_1.2             foreach_1.4.2        formatR_1.0        

[17] genefilter_1.46.1    geneplotter_1.42.0   grid_3.1.1           gtable_0.1.2       

[21] iterators_1.0.7      knitr_1.7            lattice_0.20-29      locfit_1.5-9.1     

[25] MASS_7.3-35          munsell_0.4.2        plyr_1.8.1           proto_0.3-10       

[29] RColorBrewer_1.0-5   reshape2_1.4         RSQLite_0.11.4       scales_0.2.4       

[33] sendmailR_1.2-1      splines_3.1.1        stats4_3.1.1         stringr_0.6.2      

[37] survival_2.37-7      tools_3.1.1          XML_3.98-1.1         xtable_1.7-4       

[41] XVector_0.4.0

deseq2 • 8.6k views
ADD COMMENT
0
Entering edit mode

hi Moran, I forgot to exponentiate the mean of log counts to get the geometric mean. I've now fixed the code in the answer below.

ADD REPLY
0
Entering edit mode

Hi,

I'm very new to DESeq, so I'm not sure if my comments will be of any use, but I got this problem


Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :

  every gene contains at least one zero, cannot compute log geometric means


when I had samples in rows and reads (amplicons, as I work on microbial communities) in columns. After transposing the data matrix the Error was gone.

Cheers,

Kasia

ADD REPLY
0
Entering edit mode

Try using:

dds <- estimateSizeFactors(dds, type="poscounts")

And if you have an error, post a new question, and supply all of your R code and sessionInfo()

ADD REPLY
2
Entering edit mode
@mikelove
Last seen 2 minutes ago
United States

hi Moran,

The median ratio method defined in the first DESeq paper requires calculation of the geometric mean of a row, which is 0 if a single count has a 0, and then taking the ratio of a sample to the geometric mean.

I would first filter the dataset to remove rows with very few counts (with x set to 5 say):

dds <- dds[ rowSums(counts(dds)) > x, ]

Then one alternative estimator I have used when there are no rows without a zero is:

cts <- counts(dds)
geoMeans <- apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))
dds <- estimateSizeFactors(dds, geoMeans=geoMeans)

 

ADD COMMENT
1
Entering edit mode

Better is to use exp(sum(log(row[row != 0]))/length(row)). This came up through evolving use cases in the phyloseq package by Paul (Joey) McMurdie. I'm now adding this alternate estimator as an official "type" to the estimateSizeFactors function, so that it can be documented.

ADD REPLY
0
Entering edit mode
mfarhi ▴ 10
@mfarhi-6913
Last seen 9.5 years ago
United States

Hi Michael

Thank you very much for the reply, and fix. I do not have access to my data now but will try give a try next week when I'm back

Moran 

ADD COMMENT

Login before adding your answer.

Traffic: 752 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6