GSVA (1.18.0) segfaulting on small dataset
Rajarshi Guha ▴ 120
@rajarshi-guha-3531

I'm using GSVA (v1.18.0) on R 3.2.2 to perform a gene set analysis on a small data set (75 genes, 6 samples). However, whenever I set the number of bootstrap rounds above 10, I get a segfault. As far as I can tell, there is no issue with the data itself (e.g., no genes with sd = 0). The details of my R session are provided below. Any pointers would be appreciated. (I see the same problem with R 3.2.3 on OS X 10.9.5.)
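For anyone wanting to repeat the kind of data check mentioned above, here is a minimal sketch in base R. It runs on a made-up toy matrix (`gdat_toy` and its values are hypothetical, not the real `gdat`) and flags genes with zero standard deviation or missing values.

```r
# Toy stand-in for the expression matrix (hypothetical values):
# rows are genes (Entrez-style IDs as row names), columns are samples.
set.seed(42)
gdat_toy <- matrix(rnorm(30), nrow = 5,
                   dimnames = list(c("780", "1956", "7157", "672", "2064"),
                                   paste0("S", 1:6)))
gdat_toy["672", ] <- 1.5    # deliberately constant gene, so sd = 0
gdat_toy["2064", 2] <- NA   # deliberately missing value

# Per-gene standard deviation, ignoring missing values
gene_sd <- apply(gdat_toy, 1, sd, na.rm = TRUE)

# Genes with zero variability across samples
zero_sd <- names(gene_sd)[!is.na(gene_sd) & gene_sd == 0]

# Genes with at least one missing value
has_na <- rownames(gdat_toy)[rowSums(is.na(gdat_toy)) > 0]

print(zero_sd)
print(has_na)
```

Replacing `gdat_toy` with the real matrix gives the same two lists for the actual data.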

The invocation I'm using is given below, and the input data (gdat, with row names being the Entrez Gene IDs) can be obtained from 

 

library(org.Hs.eg.db)
library(GSVA)
library(GSEABase)
library(GSVAdata)
data(c2BroadSets)
gse <- gsva(gdat,c2BroadSets,rnaseq=FALSE,min.sz=4,no.bootstraps=1000)

After a few gene sets it segfaults with the following error:

 *** caught segfault ***
address 0x7f01da2843a0, cause 'memory not mapped'
Traceback:
 1: .C("matrix_density_R", as.double(t(expr[, sample.idxs, drop = FALSE])),     as.double(t(expr)), R = double(n.test.samples * n.genes),     n.density.samples, n.test.samples, n.genes, as.integer(rnaseq))
 2: compute.gene.density(expr, sample.idxs, rnaseq, kernel)
 3: compute.geneset.es(expr, gset.idx.list, sample(n.samples, bootstrap.nsamples,     replace = T), rnaseq = rnaseq, abs.ranking = abs.ranking,     mx.diff = mx.diff, tau = tau, kernel = kernel, verbose = verbose)
 4: .gsva(expr, mapped.gset.idx.list, method, rnaseq, abs.ranking,     no.bootstraps, bootstrap.percent, parallel.sz, parallel.type,     mx.diff, tau, kernel, ssgsea.norm, verbose)
 5: .local(expr, gset.idx.list, ...)
 6: gsva(gdat, c2BroadSets, rnaseq = FALSE, min.sz = 4, no.bootstraps = 1000)
 7: gsva(gdat, c2BroadSets, rnaseq = FALSE, min.sz = 4, no.bootstraps = 1000)
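Frame 3 of the traceback shows each bootstrap round drawing sample indices with `sample(n.samples, bootstrap.nsamples, replace = T)`. The sketch below only illustrates what such draws look like for 6 samples. It assumes `bootstrap.nsamples` is `floor(n.samples * bootstrap.percent)` and that `bootstrap.percent` is 0.632; both are assumptions to check against `?gsva`, not facts taken from this thread. Whether draws with few distinct samples have anything to do with the segfault is not established here.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
n.samples <- 6              # number of samples in the question
no.bootstraps <- 1000       # as in the gsva() call above
bootstrap.percent <- 0.632  # ASSUMED value; verify with ?gsva

# ASSUMED formula for the number of samples drawn per bootstrap round
bootstrap.nsamples <- floor(n.samples * bootstrap.percent)

# For each round, draw indices with replacement (as in the traceback)
# and count how many distinct samples ended up in the draw.
distinct <- replicate(no.bootstraps,
                      length(unique(sample(n.samples, bootstrap.nsamples,
                                           replace = TRUE))))

# How many rounds contain 1, 2, ... distinct samples
print(table(distinct))
```

With so few samples, some rounds may consist of a single sample repeated several times, which is one thing to keep in mind when bootstrapping a data set this small.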

The output of sessionInfo() is given below

> sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux release 6.7 (Carbon)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GSVAdata_1.6.0       hgu95a.db_3.2.2      GSEABase_1.32.0      graph_1.48.0         annotate_1.48.0      XML_3.98-1.3         org.Hs.eg.db_3.2.3   RSQLite_1.0.0        DBI_0.3.1           
[10] AnnotationDbi_1.32.0 IRanges_1.22.10      Biobase_2.24.0       BiocGenerics_0.16.1  GSVA_1.18.0         

loaded via a namespace (and not attached):
[1] xtable_1.8-2     S4Vectors_0.8.11 tools_3.2.2     

 

Robert Castelo ★ 3.3k
@rcastelo
Barcelona/Universitat Pompeu Fabra

Hi,

The expression data set you provide cannot be used because it has duplicated gene identifiers (e.g., gene '780'). However, using an example data set (leukemia) from the GSVAdata package, I cannot reproduce the problem in a setup similar to yours on Linux:

library(org.Hs.eg.db)
library(GSVA)
library(GSEABase)
library(GSVAdata)
data(c2BroadSets)
data(leukemia) ## loads the 'leukemia_eset' ExpressionSet object
gse <- gsva(leukemia_eset,c2BroadSets,rnaseq=FALSE,min.sz=4,no.bootstraps=10) ## this runs fine

sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora release 12 (Constantine)

locale:
 [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C              LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8    
 [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8    LC_PAPER=en_US.UTF8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C      

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GSVAdata_1.6.0       hgu95a.db_3.2.2      GSEABase_1.32.0      graph_1.48.0         annotate_1.48.0     
 [6] XML_3.98-1.3         GSVA_1.18.0          org.Hs.eg.db_3.2.3   RSQLite_1.0.0        DBI_0.3.1           
[11] AnnotationDbi_1.32.3 IRanges_2.4.7        S4Vectors_0.8.11     Biobase_2.30.0       BiocGenerics_0.16.1
[16] vimcom_1.2-3         setwidth_1.0-4       colorout_1.1-0      

loaded via a namespace (and not attached):
[1] xtable_1.8-2 tools_3.2.2
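Duplicated gene identifiers like the one mentioned at the top of this answer can be found and collapsed with base R. The sketch below uses a made-up matrix (`expr_toy` is hypothetical) and keeps, for each duplicated ID, the row with the highest variance. That rule is just one common convention chosen for illustration, not something GSVA prescribes.

```r
# Toy matrix with a duplicated row name (hypothetical values)
set.seed(7)
ids <- c("780", "1956", "780", "7157")
expr_toy <- matrix(rnorm(4 * 6), nrow = 4,
                   dimnames = list(ids, paste0("S", 1:6)))

# Which identifiers occur more than once?
dup_ids <- unique(rownames(expr_toy)[duplicated(rownames(expr_toy))])
print(dup_ids)

# Sort rows by decreasing variance, then keep the first row seen per ID,
# i.e. the most variable row for each identifier.
row_var <- apply(expr_toy, 1, var)
expr_sorted <- expr_toy[order(row_var, decreasing = TRUE), , drop = FALSE]
expr_unique <- expr_sorted[!duplicated(rownames(expr_sorted)), , drop = FALSE]

print(dim(expr_unique))
```

Averaging the duplicated rows instead would be an equally reasonable choice, depending on what the duplicates represent.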

In any case, it is recommended that you upgrade to the latest BioC release 3.3 that just came out yesterday; see http://www.bioconductor.org/news/bioc_3_3_release.
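To see what differs between the two R 3.2.2 sessions listed in this thread, the package strings from both sessionInfo() outputs can be compared directly. This is a minimal base R sketch; `parse_pkgs` is a hypothetical helper written for this post, and the strings are copied from the listings above.

```r
# Package strings copied from the question's sessionInfo()
asker <- c("GSVAdata_1.6.0", "hgu95a.db_3.2.2", "GSEABase_1.32.0",
           "graph_1.48.0", "annotate_1.48.0", "XML_3.98-1.3",
           "org.Hs.eg.db_3.2.3", "RSQLite_1.0.0", "DBI_0.3.1",
           "AnnotationDbi_1.32.0", "IRanges_1.22.10", "Biobase_2.24.0",
           "BiocGenerics_0.16.1", "GSVA_1.18.0", "S4Vectors_0.8.11")

# Package strings copied from the answer's sessionInfo()
responder <- c("GSVAdata_1.6.0", "hgu95a.db_3.2.2", "GSEABase_1.32.0",
               "graph_1.48.0", "annotate_1.48.0", "XML_3.98-1.3",
               "GSVA_1.18.0", "org.Hs.eg.db_3.2.3", "RSQLite_1.0.0",
               "DBI_0.3.1", "AnnotationDbi_1.32.3", "IRanges_2.4.7",
               "S4Vectors_0.8.11", "Biobase_2.30.0", "BiocGenerics_0.16.1")

# Hypothetical helper: turn "name_version" strings into a named vector
parse_pkgs <- function(x) {
  parts <- strsplit(x, "_", fixed = TRUE)
  setNames(vapply(parts, `[`, character(1), 2),
           vapply(parts, `[`, character(1), 1))
}

a <- parse_pkgs(asker)
b <- parse_pkgs(responder)

# Packages present in both sessions but at different versions
shared <- intersect(names(a), names(b))
differs <- shared[a[shared] != b[shared]]
print(data.frame(package = differs, asker = a[differs],
                 responder = b[differs], row.names = NULL))
```

On these listings the comparison reports AnnotationDbi, IRanges, and Biobase as differing. Whether any of those differences is related to the segfault is not established in this thread, but it shows concretely what an upgrade would bring into line.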

cheers,

robert.


Thanks for taking a look. Unfortunately, even after upgrading to R 3.3 and the latest BioC release, I get the same segfault. I've put up a proper data file as a GitHub gist. Would you mind running this to see if it fails for you?


Hi, at the moment I have the latest BioC 3.3 installation on a Mac OS X "El Capitan" laptop, and it also runs fine with this file. Here's my session information:

R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] GSVAdata_1.8.0       hgu95a.db_3.2.2      GSVA_1.20.0         
 [4] BiocInstaller_1.22.1 GSEABase_1.34.0      graph_1.50.0        
 [7] annotate_1.50.0      XML_3.98-1.4         org.Hs.eg.db_3.3.0  
[10] AnnotationDbi_1.34.0 IRanges_2.6.0        S4Vectors_0.10.0    
[13] Biobase_2.32.0       BiocGenerics_0.18.0  setwidth_1.0-4      
[16] colorout_1.1-2      

loaded via a namespace (and not attached):
[1] xtable_1.8-2  DBI_0.4-1     RSQLite_1.0.0 tools_3.3.0  


cheers,

robert.
