GSVA Problem
@4698c505
Last seen 2.9 years ago
Hong Kong

When I run the gsva() function, the following error occurs:

> gsvascore <- gsva(data, geneset, method="gsva", parallel.sz = 2)   
Setting parallel calculations through a MulticoreParam back-end
with workers=2 and tasks=100.
Estimating GSVA scores for 186 gene sets.
Estimating ECDFs with Gaussian kernels
Estimating ECDFs in parallel
iteration: Error in serialize(data, node$con, xdr = FALSE) : 
  error writing to connection
In addition: Warning messages:
1: In .filterFeatures(expr, method) :
  3068 genes with constant expression values throuhgout the samples.
2: In .filterFeatures(expr, method) :
  Since argument method!="ssgsea", genes with constant expression values are discarded.

Error in serialize(data, node$con, xdr = FALSE) : 
  error writing to connection
> sessionInfo( )
R version 4.0.4 (2021-02-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS/LAPACK: /usr/local/lib/libopenblas.so.0

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8       
 [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8       LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8          
 [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] future_1.21.0       BiocParallel_1.24.1 stringr_1.4.0       msigdbr_7.4.1       limma_3.46.0        GSVA_1.38.2        
[7] dplyr_1.0.6         SeuratObject_4.0.1  Seurat_4.0.1       

loaded via a namespace (and not attached):
  [1] Rtsne_0.15                  colorspace_2.0-1            deldir_0.2-10               ellipsis_0.3.2             
  [5] ggridges_0.5.3              XVector_0.30.0              GenomicRanges_1.42.0        rstudioapi_0.13            
  [9] spatstat.data_2.1-0         leiden_0.3.7                listenv_0.8.0               ggrepel_0.9.1              
 [13] bit64_4.0.5                 AnnotationDbi_1.52.0        fansi_0.4.2                 codetools_0.2-18           
 [17] splines_4.0.4               cachem_1.0.4                polyclip_1.10-0             jsonlite_1.7.2             
 [21] rJava_1.0-4                 annotate_1.68.0             ica_1.0-2                   cluster_2.1.2              
 [25] png_0.1-7                   graph_1.68.0                uwot_0.1.10                 shiny_1.6.0                
 [29] sctransform_0.3.2           spatstat.sparse_2.0-0       compiler_4.0.4              httr_1.4.2                 
 [33] Matrix_1.3-3                fastmap_1.1.0               lazyeval_0.2.2              cli_2.5.0                  
 [37] later_1.2.0                 htmltools_0.5.1.1           tools_4.0.4                 igraph_1.2.6               
 [41] GenomeInfoDbData_1.2.4      gtable_0.3.0                glue_1.4.2                  RANN_2.6.1                 
 [45] reshape2_1.4.4              Rcpp_1.0.6                  scattermore_0.7             Biobase_2.50.0             
 [49] vctrs_0.3.8                 babelgene_21.4              nlme_3.1-152                lmtest_0.9-38              
 [53] globals_0.14.0              xlsxjars_0.6.1              mime_0.10                   miniUI_0.1.1.1             
 [57] lifecycle_1.0.0             irlba_2.3.3                 XML_3.99-0.6                xlsx_0.6.5                 
 [61] goftest_1.2-2               zlibbioc_1.36.0             MASS_7.3-54                 zoo_1.8-9                  
 [65] scales_1.1.1                spatstat.core_2.1-2         MatrixGenerics_1.2.1        promises_1.2.0.1           
 [69] spatstat.utils_2.1-0        SummarizedExperiment_1.20.0 parallel_4.0.4              RColorBrewer_1.1-2         
 [73] memoise_2.0.0               reticulate_1.20             pbapply_1.4-3               gridExtra_2.3              
 [77] ggplot2_3.3.3               rpart_4.1-15                stringi_1.6.1               RSQLite_2.2.7              
 [81] S4Vectors_0.28.1            BiocGenerics_0.36.1         GenomeInfoDb_1.26.7         bitops_1.0-7               
 [85] rlang_0.4.11                pkgconfig_2.0.3             matrixStats_0.58.0          lattice_0.20-44            
 [89] ROCR_1.0-11                 purrr_0.3.4                 tensor_1.5                  patchwork_1.1.1            
 [93] htmlwidgets_1.5.3           bit_4.0.4                   cowplot_1.1.1               tidyselect_1.1.1           
 [97] GSEABase_1.52.1             parallelly_1.25.0           RcppAnnoy_0.0.18            plyr_1.8.6                 
[101] magrittr_2.0.1              R6_2.5.0                    IRanges_2.24.1              generics_0.1.0             
[105] DelayedArray_0.16.3         DBI_1.1.1                   pillar_1.6.0                mgcv_1.8-35                
[109] fitdistrplus_1.1-3          RCurl_1.98-1.3              survival_3.2-11             abind_1.4-5                
[113] tibble_3.1.1                future.apply_1.7.0          crayon_1.4.1                KernSmooth_2.23-20         
[117] utf8_1.2.1                  spatstat.geom_2.1-0         plotly_4.9.3                grid_4.0.4                 
[121] data.table_1.14.0           blob_1.2.1                  digest_0.6.27               xtable_1.8-4               
[125] tidyr_1.1.3                 httpuv_1.6.1                stats4_4.0.4                munsell_0.5.0              
[129] viridisLite_0.4.0
Robert Castelo ★ 3.3k
@rcastelo
Last seen 1 day ago
Barcelona/Universitat Pompeu Fabra

hi, i think the error must be caused by something specific to your data because the following simulation works without problems:

p <- 10000                                                                                                                     
n <- 1000                                                                                                                      
sizeGeneSets <- sample(5:500, size=186, replace=TRUE)
geneSets <- lapply(as.list(sizeGeneSets), sample,
                   x=paste0("g", 1:p), replace=FALSE)
names(geneSets) <- paste0("gs", 1:length(geneSets))
y <- matrix(rnorm(n*p), nrow=p, ncol=n,
            dimnames=list(paste("g", 1:p, sep="") , paste("s", 1:n, sep="")))
dim(y)
[1] 10000  1000

library(GSVA)
gsva_es <- gsva(y, geneSets, parallel.sz=2)
Setting parallel calculations through a MulticoreParam back-end
with workers=2 and tasks=100.
Estimating GSVA scores for 186 gene sets.
Estimating ECDFs with Gaussian kernels
Estimating ECDFs in parallel
iteration: 100
  |======================================================================| 100%
dim(gsva_es)
[1]  186 1000
gsva_es[1:5, 1:5]
              s1          s2          s3          s4            s5
gs1  0.058409970 -0.03890251  0.02101381  0.08804973  0.0128590243
gs2  0.060465645  0.01435706  0.15677575 -0.10163160 -0.0451630627
gs3 -0.044192197 -0.04761677 -0.02704921  0.05394600  0.0007633298
gs4  0.006757128 -0.01926865 -0.06437585 -0.01344462 -0.0595236762
gs5 -0.016487255 -0.38972381 -0.11386866  0.33648400 -0.2745478370
sessionInfo()                                                                                                                  
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GSVA_1.38.2    nvimcom_0.9-28 colorout_1.2-2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  compiler_4.0.3              GenomeInfoDb_1.26.7        
 [4] XVector_0.30.0              MatrixGenerics_1.2.1        bitops_1.0-7               
 [7] tools_4.0.3                 zlibbioc_1.36.0             bit_4.0.4                  
[10] lattice_0.20-41             annotate_1.68.0             RSQLite_2.2.7              
[13] memoise_2.0.0               rlang_0.4.11                Matrix_1.3-2               
[16] graph_1.68.0                DelayedArray_0.16.3         DBI_1.1.1                  
[19] parallel_4.0.3              fastmap_1.1.0               GenomeInfoDbData_1.2.4     
[22] httr_1.4.2                  S4Vectors_0.28.1            vctrs_0.3.8                
[25] IRanges_2.24.1              grid_4.0.3                  stats4_4.0.3               
[28] bit64_4.0.5                 GSEABase_1.52.1             Biobase_2.50.0             
[31] R6_2.5.0                    AnnotationDbi_1.52.0        BiocParallel_1.24.1        
[34] XML_3.99-0.6                blob_1.2.1                  matrixStats_0.58.0         
[37] BiocGenerics_0.36.1         GenomicRanges_1.42.0        SummarizedExperiment_1.20.0
[40] xtable_1.8-4                RCurl_1.98-1.3              cachem_1.0.4

i see that the GSVA software is giving the following warnings:

1: In .filterFeatures(expr, method) :
  3068 genes with constant expression values throuhgout the samples.
2: In .filterFeatures(expr, method) :
  Since argument method!="ssgsea", genes with constant expression values are discarded.

which means that 3068 genes with constant expression values are being discarded from the input expression data matrix. could you report the original dimensions of the input expression data matrix? additionally, could you report the output of traceback() called right after the error?
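A minimal sketch of how the requested information could be gathered, using base R only. The object name `expr` and the toy matrix are assumptions for illustration; in practice `expr` would be the matrix passed to gsva() (called `data` in the question). The constant-gene check below is one straightforward way to count such genes and may not match exactly how GSVA detects them internally.

```r
# Toy stand-in for the real expression matrix (genes in rows, samples in columns).
# `expr` is a hypothetical name; substitute the object given to gsva().
set.seed(1)
expr <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6,
               dimnames = list(paste0("g", 1:20), paste0("s", 1:6)))
expr[1:3, ] <- 0   # make three genes constant across all samples

# Original dimensions of the input matrix (genes x samples)
dim(expr)

# Flag genes whose value is identical in every sample
is_constant <- apply(expr, 1, function(x) all(x == x[1]))
n_constant <- sum(is_constant)
n_constant

# Dimensions that remain once the constant genes are dropped
filtered <- expr[!is_constant, , drop = FALSE]
dim(filtered)

# For the call stack: type traceback() at the prompt immediately after
# the failing gsva() call, before running any other command.
```

Reporting `dim(expr)` together with the number of constant genes shows how much of the matrix is left for the ECDF estimation step where the error appears.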
