DESeq2 error in evaluating the argument 'args' in selecting a method for function 'do.call'
1
0
Entering edit mode
frankS • 0
@franks-8744
Last seen 8.7 years ago
United Kingdom

I have just upgraded my R to 3.2, bioconductor to 3.1 and then installed the latest DESeq2 package and now a DESeq analysis that used to work keeps failing with the same error. Here is what I have been doing:

> register(MulticoreParam(12))
> countData<-read.csv("counts.csv",header=T,row.names=1)
> colData<-read.csv("sample_meta_data.csv", header=T, row.names=1)
> dds<-DESeqDataSetFromMatrix(countData = countData, colData=colData, design = ~ line + treatment  )
# remove some genes with low counts
> dds <- estimateSizeFactors(dds)
> nc<-counts(dds, normalized=TRUE)
> filter<-rowSums(nc >= 15) >= 3
> ddsf<-dds[filter,]
# run DESeq2
>  ddsf<-DESeq(ddsf, parallel=T)
using pre-existing size factors
estimating dispersions
gene-wise dispersion estimates: 12 workers
 
Error in do.call(rbind, bplapply(levels(idx), function(l) { :
  error in evaluating the argument 'args' in selecting a method for function 'do.call':
> Terminated

It takes quite a while for the error to appear (10 minutes or so?) after the message "gene-wise dispersion estimates" appears.

I have re-installed all previously installed CRAN packages now and also re-installed the BioConductor packages and nothing is showing any errors there.

Does anybody else experience problems with this latest release of DESeq2?

here is my sessionInfo:

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] org.Mm.eg.db_3.1.2        RSQLite_1.0.0
 [3] DBI_0.3.1                 AnnotationDbi_1.30.1
 [5] Biobase_2.28.0            plyr_1.8.3
 [7] BiocParallel_1.2.20       gplots_2.17.0
 [9] DESeq2_1.8.1              RcppArmadillo_0.5.400.2.0
[11] Rcpp_0.12.0               GenomicRanges_1.20.6
[13] GenomeInfoDb_1.4.2        IRanges_2.2.7
[15] S4Vectors_0.6.5           BiocGenerics_0.14.0
[17] RColorBrewer_1.1-2        BiocInstaller_1.18.4

loaded via a namespace (and not attached):
 [1] futile.logger_1.4.1  XVector_0.8.0        bitops_1.0-6
 [4] futile.options_1.0.0 tools_3.2.2          rpart_4.1-10
 [7] digest_0.6.8         annotate_1.46.1      gtable_0.1.2
[10] lattice_0.20-33      proto_0.3-10         gridExtra_2.0.0
[13] genefilter_1.50.0    stringr_1.0.0        cluster_2.0.3
[16] caTools_1.17.1       gtools_3.5.0         locfit_1.5-9.1
[19] nnet_7.3-11          grid_3.2.2           XML_3.98-1.3
[22] survival_2.38-3      foreign_0.8-66       gdata_2.17.0
[25] latticeExtra_0.6-26  Formula_1.2-1        geneplotter_1.46.0
[28] ggplot2_1.0.1        reshape2_1.4.1       lambda.r_1.1.7
[31] magrittr_1.5         scales_0.3.0         Hmisc_3.16-0
[34] MASS_7.3-44          splines_3.2.2        xtable_1.7-4
[37] colorspace_1.2-6     KernSmooth_2.23-15   stringi_0.5-5
[40] acepack_1.3-3.3      munsell_0.4.2

 

> source("http://bioconductor.org/biocLite.R")
Bioconductor version 3.1 (BiocInstaller 1.18.4), ?biocLite for help

My BioConductor packages seem to be the right versions:

> BiocInstaller::biocValid()
[1] TRUE

 

Any hints welcome! Thanks!

 

 

deseq2 error • 5.4k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 7 hours ago
United States

hi Frank,

Thanks for reporting.

Could you send me the dds object, right before DESeq() and I can try to debug?

you can email to maintainer("DESeq2")

ADD COMMENT
0
Entering edit mode

Thanks for your help. I have now sent you the file via email.

ADD REPLY
0
Entering edit mode

I was able to run this using the devel branch (I don't have R-3.2 easily accessible at the moment). The updateObject() line is for updating a DESeq2 v1.8 DESeqDataSet object to version >= 1.9.

> library("BiocParallel")
> register(MulticoreParam(12))
> dds <- ddsf2
> dds <- updateObject(dds)
> system.time({ dds <- DESeq(dds, parallel=TRUE) })
using pre-existing size factors
estimating dispersions
gene-wise dispersion estimates: 12 workers
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates, MLE betas: 12 workers
fitting model and testing: 12 workers
    user   system  elapsed
1877.280    9.707  197.602

 

So I'm not sure what might be causing the error. Given so many samples (n=300) with lots of biological replication (20-30 per level of the factors), I often switch to using limma + voom, which is much faster for fitting.

ADD REPLY
0
Entering edit mode

ok, thanks for running that. I will install the devel branch then and give it a go. I have never used limma. Are you suggestion to do the whole DE gene analysis in limma or just the model fitting?

ADD REPLY
0
Entering edit mode

well, there aren't any relevant DESeq2 changes between release and devel. I'm not sure then what happened. It might just be some cores failing in the background for some (local?) reason, and then do.call() cannot rbind() the results together. I'm just guessing though, because i couldn't reproduce the bug with devel branch on my cluster (and I don't have 12 cores on my laptop to try with release branch).

I was suggesting to just switch to limma+voom for the whole DE gene analysis, if you encounter more problems, as it will be much faster. We find that DESeq2, edgeR and limma often have very high overlap in the DEG lists.

ADD REPLY
0
Entering edit mode

ok, maybe I should try without panellising then just to see if that makes a difference. Will take ages though. I also just tried to repeat the analysis with a data set that I have analysed previously with an older version of R/Bioconductor/DESeq2 and it also fails now, so it's not due to the data.

ADD REPLY
0
Entering edit mode

I can now confirm that the issue appears to be with using BiocParallel. Running the same analysis on the same data on a single core does not cause the problem. Probably also explains why the failure occurs sort of randomly at some point and not always at the same time in the analysis.

ADD REPLY
0
Entering edit mode

Looks like this might be a bug in BiocParallel in release. If you (or Mike) could send me the dds (valerie.obenchain@roswellpark.org) I can take a look.

Valerie

ADD REPLY
0
Entering edit mode

Is there anything I can do to test this further on my end? I tried using the DEBUG output of BiocParallel but it doesn't seem to point me to the problem. Did you get the email with the data frame that caused the problem for me? Thanks for looking into this.
 

ADD REPLY
0
Entering edit mode

Thanks for sending the ddsf2 object. I could not reproduce this error. I tried with both BiocParallel 1.2.20 and the more recent release version, 1.2.21. Both completed successfully.

register(MulticoreParam(4))
dds <- DESeq(ddsf2, parallel=TRUE)
using pre-existing size factors
estimating dispersions
gene-wise dispersion estimates: 4 workers
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates, MLE betas: 4 workers
fitting model and testing: 4 workers

dds
class: DESeqDataSet 
dim: 3571 300 
exptData(0):
assays(4): counts mu cooks originalCounts
rownames(3571): ENSMUSG00000000001 ENSMUSG00000000058 ...
  ENSMUSG00000102070 ENSMUSG00000102117
rowRanges metadata column names(137): baseMean baseVar ... maxCooks
  replace
colnames(300): Atp2b4.CpG.8 Atp2b4.CpG.9 ... traf_2.TriDap.1
  traf_2.UNS_4hr.1
colData names(7): cell_line stimulant ... sizeFactor replaceable

 

I did see the error you report when I manually killed the job before it finished. This may point to something disrupting the job, similar to what Mike was saying about cores failing locally. Has anything changed on your system? Load, number of users?

Valerie

> sessionInfo()
R version 3.2.1 Patched (2015-07-20 r68706)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Fedora 22 (Twenty Two)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] DESeq2_1.8.1              RcppArmadillo_0.5.500.2.0
[3] Rcpp_0.12.0               GenomicRanges_1.20.6     
[5] GenomeInfoDb_1.4.2        IRanges_2.2.7            
[7] S4Vectors_0.6.5           BiocGenerics_0.14.0      
[9] BiocParallel_1.2.20      

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3          
 [4] XVector_0.8.0        futile.options_1.0.0 tools_3.2.1         
 [7] rpart_4.1-10         digest_0.6.8         RSQLite_1.0.0       
[10] annotate_1.46.1      gtable_0.1.2         lattice_0.20-33     
[13] DBI_0.3.1            proto_0.3-10         gridExtra_2.0.0     
[16] genefilter_1.50.0    stringr_1.0.0        cluster_2.0.3       
[19] locfit_1.5-9.1       nnet_7.3-11          grid_3.2.1          
[22] Biobase_2.28.0       AnnotationDbi_1.30.1 XML_3.98-1.3        
[25] survival_2.38-3      foreign_0.8-66       latticeExtra_0.6-26 
[28] Formula_1.2-1        geneplotter_1.46.0   ggplot2_1.0.1       
[31] reshape2_1.4.1       lambda.r_1.1.7       magrittr_1.5        
[34] scales_0.3.0         Hmisc_3.16-0         MASS_7.3-44         
[37] splines_3.2.1        xtable_1.7-4         colorspace_1.2-6    
[40] stringi_0.5-5        acepack_1.3-3.3      munsell_0.4.2       

 

ADD REPLY
0
Entering edit mode

Thanks Valerie for looking into this! I don't think anything should have changed on the computing cluster here. I will get our IT support team to have a look too, now that I know what to look out for. Thanks a lot!

ADD REPLY

Login before adding your answer.

Traffic: 375 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6