[GSVM] bootstrapping error
mforde84 ▴ 20

Hi,

I'm trying to run a GSVA analysis with bootstrapping, but it appears that I'm missing some dependency that I haven't been able to find on Google. Sorry for the rudimentary question, but would someone be so kind as to help me find the appropriate package?

> enrichment.scores <- gsva(logCPMrbe.flt, gene.sets, method = "gsva", mx.diff = TRUE, bootstrap.percent=.632, no.bootstraps=2, verbose=TRUE, rnaseq=TRUE, parallel.sz=1)$es.obs
Estimating GSVA scores for 2999 gene sets.
Computing observed enrichment scores
Estimating ECDFs in rnaseq data with Poisson kernels
Using parallel with 1 cores
  |======================================================================| 100%
Computing bootstrap enrichment scores
Parallel bootstrap...
bootstrap cycle  1
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  one node produced an error: could not find function "compute.geneset.es"
In addition: Warning message:
closing unused connection 3 (<-localhost:11439)
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] GSVAdata_1.10.0      hgu95a.db_3.2.3      org.Hs.eg.db_3.4.0  
 [4] GSEABase_1.36.0      graph_1.52.0         annotate_1.52.1     
 [7] XML_3.98-1.5         AnnotationDbi_1.36.0 IRanges_2.8.1       
[10] S4Vectors_0.12.1     Biobase_2.34.0       BiocGenerics_0.20.0
[13] GSVA_1.22.0          rlecuyer_0.3-4       snow_0.4-2          
[16] limma_3.30.7         BiocInstaller_1.24.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.8    xtable_1.8-2   tools_3.3.2    DBI_0.5-1      digest_0.6.11
[6] bitops_1.0-6   RCurl_1.95-4.8 memoise_1.0.0  RSQLite_1.1-2

GSVM bootstrap snow • 1.7k views
Robert Castelo ★ 3.4k
Barcelona/Universitat Pompeu Fabra

hi,

i think there's a problem in the way GSVA handles the calculation of bootstrapped scores in parallel. i've submitted a bugfix that should be available in GSVA version 1.22.2 in the next 24/48 hours. please let me know if it works once you can try it out.

let me add that part of the problem is that, although you indicate that you do not want to do calculations in parallel by setting 'parallel.sz=1', GSVA tries to do them anyway. this is one of the things that i've fixed in version 1.22.2.
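
for anyone reading this later, here is a minimal sketch of the general mechanism that can produce a 'could not find function' error on cluster nodes. it is not GSVA's actual code: it uses the base 'parallel' package, and 'square_on_worker' is a hypothetical function made up for illustration. worker processes start with an empty workspace, so a function that exists only in the master session has to be exported before the workers can call it.

```r
library(parallel)

# Hypothetical helper that exists only in the master session.
square_on_worker <- function(x) x^2

cl <- makeCluster(1)  # a single worker process

# Ask the worker whether the helper exists in its own global environment.
worker_has_helper <- function() {
  exists("square_on_worker", envir = globalenv(), inherits = FALSE)
}
seen_before <- clusterCall(cl, worker_has_helper)[[1]]  # FALSE: nothing exported yet

# Copy the helper from this session into the worker's global environment.
clusterExport(cl, "square_on_worker", envir = environment())
seen_after <- clusterCall(cl, worker_has_helper)[[1]]   # TRUE

# The worker can now call the helper.
squares <- unlist(clusterApply(cl, 1:3, function(i) square_on_worker(i)))

stopCluster(cl)
```

the same kind of export (or loading the package on each node) is what a function running on the nodes needs in order to find its helpers.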

thanks for bringing the problem to our attention,

robert.


thanks robert.

i set parallel.sz to 1 for testing purposes. i think i tried with multiple cores as well, but got a similar error. i might be mistaken though (memory comes and goes). i'll give it a go later in the day with multiple cores, see if that helps.

either way, thanks for looking into this and helping to fix the problem.

martin


hi robert,

unfortunately, the fix didn't work, still getting the same error:

> enrichment.scores <- gsva(logCPMrbe.flt, gene.sets, method = "gsva", mx.diff = TRUE, verbose=TRUE, rnaseq=TRUE, no.bootstraps=1000, bootstrap.percent = .632, parallel.sz=8)$es.obs
Estimating GSVA scores for 2999 gene sets.
Computing observed enrichment scores
Estimating ECDFs in rnaseq data with Poisson kernels
Using parallel with 8 cores
  |======================================================================| 100%
Computing bootstrap enrichment scores
Parallel bootstrap...
bootstrap cycle  1
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  8 nodes produced errors; first error: could not find function "compute.geneset.es"
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] BiocInstaller_1.24.0 rlecuyer_0.3-4       snow_0.4-2          
 [4] GSVAdata_1.10.0      hgu95a.db_3.2.3      org.Hs.eg.db_3.4.0  
 [7] GSEABase_1.36.0      graph_1.52.0         annotate_1.52.1     
[10] XML_3.98-1.5         AnnotationDbi_1.36.1 IRanges_2.8.1       
[13] S4Vectors_0.12.1     Biobase_2.34.0       BiocGenerics_0.20.0
[16] GSVA_1.22.2          limma_3.30.8        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9        splines_3.3.2      xtable_1.8-2       lattice_0.20-34   
 [5] DESeq_1.26.0       tools_3.3.2        grid_3.3.2         DBI_0.5-1         
 [9] genefilter_1.56.0  survival_2.40-1    digest_0.6.11      Matrix_1.2-7.1    
[13] geneplotter_1.52.0 RColorBrewer_1.1-2 bitops_1.0-6       RCurl_1.95-4.8    
[17] memoise_1.0.0      RSQLite_1.1-2


hi Martin,

i'm sorry about that, it was my fault. i thought i understood exactly what the problem was and didn't check whether the modification i pushed to version 1.22.2 really worked under the conditions you were using. after checking it again, i realized the problem was more convoluted, but i've now submitted a fix under version 1.22.3 that works at least on my linux computer. this should become available in the next 24/48 hours, but if you need it earlier and are able to build R packages, you can check out the 1.22.3 version with SVN by doing:

svn co https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_4/madman/Rpacks/GSVA GSVA
R CMD build --no-build-vignettes GSVA ## --no-build-vignettes option builds fast w/o vignettes
R CMD INSTALL GSVA_1.22.3.tar.gz

let me know if this new version works for you.

a couple of further remarks about your code. if you are using the bootstrap arguments, i think you may want to keep the full list result and not just the 'es.obs' element of the resulting list, since all the additional information from the bootstrap calculations is in the elements 'bootstrap' and 'p.vals.sign'.

i also see that you set 'rnaseq=TRUE' and the input expression matrix is called 'logCPM', which makes me think you are providing 'logCPM' values to the 'gsva()' function. if this is the case, then you should set 'rnaseq=FALSE' (the default) because 'logCPM' values are continuous, while 'rnaseq=TRUE' is only necessary when the input matrix is a matrix of integer counts from an RNA-seq experiment.
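
as a rough illustration of that distinction, here is a small base R sketch that checks whether a matrix looks like integer counts or like continuous values. 'looks_like_counts' is a hypothetical helper, not part of GSVA, and the toy matrices are made up purely for the example.

```r
# Returns TRUE when every value is a finite, non-negative whole number,
# i.e. the matrix looks like raw counts rather than continuous values.
looks_like_counts <- function(expr) {
  values <- as.numeric(expr)
  all(is.finite(values)) && all(values >= 0) && all(values == round(values))
}

# Toy count matrix standing in for real data (values are made up).
raw_counts <- matrix(c(0, 12, 7, 153, 4, 0), nrow = 3,
                     dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

# Crude logCPM-like transform, used only to produce continuous values.
log_cpm <- log2(t(t(raw_counts + 0.5) / (colSums(raw_counts) + 1)) * 1e6)

looks_like_counts(raw_counts)  # TRUE
looks_like_counts(log_cpm)     # FALSE
```

a result of FALSE on your own matrix would suggest leaving 'rnaseq' at its default of FALSE.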

cheers,

robert.


thanks for the clarifications. the svn repo is asking for authentication, so i'll just wait until the new version gets pushed to biocLite.


Thank you. Everything works now. Appreciate the help.


Great! Next time, please remember to add 'GSVA' as a tag to the question to make it easier for me to spot it.
