GSVA Error in .mapGeneSetsToFeatures(mapped.gset.idx.list, rownames(expr))
Siyuan • 0
@siyuan-24740

Hi,

I have a problem with GSVA. I tried to run GSVA with the MSigDB KEGG gene sets on microarray data. This code worked several days ago, but when I ran it again today I got the error shown below:


 #head(exp,5)
              sample1       samples2   sample3  sample4    sample5      sample6
A1CF     2.589551   2.656472   2.524491   2.748733   2.423472   2.618552
A2M     10.299896   9.196994   8.912481   9.664004   9.301919   9.829284
A2ML1    2.870450   3.084727   3.044007   3.166133   3.211959   3.292066
A4GALT   4.173940   5.132295   4.348393   4.229899   4.569535   4.087214


library(GSEABase)
library(GSVA)

msigdb_GMTs <- "msigdb_v7.2_GMTs"
msigdb <- "c2.cp.kegg.v7.2.symbols.gmt"

geneset <- getGmt(file.path(msigdb_GMTs, msigdb))  

es.max <- gsva(exp, geneset, 
               mx.diff=FALSE, verbose=FALSE, 
               parallel.sz=1)

# Error in .mapGeneSetsToFeatures(mapped.gset.idx.list, rownames(expr)) : 
#  No identifiers in the gene sets could be matched to the identifiers in the expression data. 

sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] forcats_0.5.1        stringr_1.4.0        dplyr_1.0.4         
 [4] purrr_0.3.4          readr_1.4.0          tidyr_1.1.2         
 [7] tibble_3.0.6         ggplot2_3.3.3        tidyverse_1.3.0     
[10] devtools_2.3.2       usethis_2.0.0        GSVA_1.39.16        
[13] GSEABase_1.52.1      graph_1.68.0         annotate_1.68.0     
[16] XML_3.99-0.5         AnnotationDbi_1.52.0 IRanges_2.24.1      
[19] S4Vectors_0.28.1     Biobase_2.50.0       BiocGenerics_0.36.0 

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.58.0         
 [3] fs_1.5.0                    lubridate_1.7.9.2          
 [5] bit64_4.0.5                 httr_1.4.2                 
 [7] rprojroot_2.0.2             GenomeInfoDb_1.26.2        
 [9] tools_4.0.3                 backports_1.2.1            
[11] R6_2.5.0                    DBI_1.1.1                  
[13] colorspace_2.0-0            withr_2.4.1                
[15] tidyselect_1.1.0            prettyunits_1.1.1          
[17] processx_3.4.5              bit_4.0.4                  
[19] curl_4.3                    compiler_4.0.3             
[21] rvest_0.3.6                 cli_2.3.0                  
[23] xml2_1.3.2                  desc_1.2.0                 
[25] DelayedArray_0.16.1         scales_1.1.1               
[27] callr_3.5.1                 XVector_0.30.0             
[29] pkgconfig_2.0.3             sessioninfo_1.1.1          
[31] MatrixGenerics_1.2.1        dbplyr_2.1.0               
[33] fastmap_1.1.0               readxl_1.3.1               
[35] rlang_0.4.10                rstudioapi_0.13            
[37] RSQLite_2.2.3               generics_0.1.0             
[39] jsonlite_1.7.2              BiocParallel_1.24.1        
[41] RCurl_1.98-1.2              magrittr_2.0.1             
[43] GenomeInfoDbData_1.2.4      Matrix_1.3-2               
[45] Rcpp_1.0.6                  munsell_0.5.0              
[47] lifecycle_0.2.0             stringi_1.5.3              
[49] SummarizedExperiment_1.20.0 zlibbioc_1.36.0            
[51] pkgbuild_1.2.0              grid_4.0.3                 
[53] blob_1.2.1                  crayon_1.4.1               
[55] lattice_0.20-41             haven_2.3.1                
[57] hms_1.0.0                   ps_1.5.0                   
[59] pillar_1.4.7                GenomicRanges_1.42.0       
[61] pkgload_1.1.0               reprex_1.0.0               
[63] glue_1.4.2                  remotes_2.2.0              
[65] BiocManager_1.30.10         modelr_0.1.8               
[67] vctrs_0.3.6                 cellranger_1.1.0           
[69] testthat_3.0.1              gtable_0.3.0               
[71] assertthat_0.2.1            cachem_1.0.3               
[73] xtable_1.8-4                broom_0.7.4                
[75] memoise_2.0.0               ellipsis_0.3.1             
>

I would appreciate it if you could help me fix this problem. Thank you!

Siyuan

Robert Castelo ★ 2.7k
@rcastelo
Barcelona/Universitat Pompeu Fabra

hi,

thanks for reporting this problem; it has been fixed in the release version, GSVA 1.38.2, and in the devel version, 1.39.17. By the way, your session information shows that you are currently using the development version of GSVA (1.39.16). Beware that, in general, the development version of a Bioconductor package may not work as expected, because developers use it to add new features or refactor code, which may lead to unexpected behavior. So, unless end users want to beta-test new features, they should use the release version.
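
In case it helps to tell the two apart: Bioconductor package versions follow an x.y.z scheme in which an even y marks a release version and an odd y marks a devel version. A minimal sketch of that check in base R follows; the helper name `is_devel_version` is hypothetical and not part of GSVA or BiocManager.

```r
# Hypothetical helper (not a GSVA/BiocManager function): apply the
# Bioconductor x.y.z convention, where an odd y indicates a devel version.
is_devel_version <- function(version) {
  # Accepts a string or a package_version object
  parts <- as.integer(strsplit(as.character(version), ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) >= 2, !anyNA(parts))
  parts[2] %% 2L == 1L
}

is_devel_version("1.39.16")  # version in the question's sessionInfo(): TRUE
is_devel_version("1.38.2")   # release version with the fix: FALSE

# In a live session one could check the installed package instead:
# is_devel_version(packageVersion("GSVA"))
```

If the check reports a devel version, reinstalling with `BiocManager::install("GSVA")` under a release installation of Bioconductor should normally bring the package back to the release line, and `BiocManager::valid()` can flag packages that do not match the installed Bioconductor version.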

a comment on your code, when you use the function getGmt() to read a GMT file of gene sets defined using gene symbols, I'd set the argument geneIdType=SymbolIdentifier() so that the resulting GeneSetCollection object has the additional bit of metadata that tells the type of identifier being used:

geneset <- getGmt(file.path(msigdb_GMTs, msigdb), geneIdType=SymbolIdentifier())

this becomes useful if you need to map those identifiers to another type of identifier.

cheers,

robert.


Dear Robert,

Thanks for your reply! Unfortunately, I still get this "No identifiers" error after updating the package to GSVA 1.38.2. Could something be wrong with my computer?

I appreciate your help!

Best wishes


hi,

the following code, which was reproducing the bug before, now runs fine:

library(GSEABase)
library(GSVA)
library(GSVAdata)
library(hgu95a.db)
library(annotate)

geneset <- getGmt("c2.cp.kegg.v7.2.symbols.gmt", geneIdType=SymbolIdentifier())

data(leukemia)
syms <- getSYMBOL(featureNames(leukemia_eset), "hgu95a.db")
exps <- exprs(leukemia_eset)
rownames(exps) <- syms
exps[1:5, 1:5]
        CL2001011101AA.CEL CL2001011102AA.CEL CL2001011104AA.CEL
MAPK3            11.354426          10.932543          11.185906
TIE1              9.185470           8.823661           8.687186
CYP2C19           7.806993           8.127591           7.842353
CXCR5            10.164370          10.048514          10.006014
CXCR5             9.642389           9.834265           9.750938
        CL2001011105AA.CEL CL2001011109AA.CEL
MAPK3            11.251631          11.540745
TIE1              8.958305           9.762877
CYP2C19           8.319227           8.334177
CXCR5            10.474046          10.115543
CXCR5            10.430205          10.066628
es <- gsva(exps, geneset)
Estimating GSVA scores for 186 gene sets.
Estimating ECDFs with Gaussian kernels
  |======================================================================| 100%
sessionInfo()                                                                           
R version 4.0.3 (2020-10-10)                                                              
Platform: x86_64-apple-darwin17.0 (64-bit)                                                
Running under: macOS Catalina 10.15.7                                                     

Matrix products: default                                                                  
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib         
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib       

locale:                                                                                   
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8                         

attached base packages:                                                                   
[1] stats4    parallel  stats     graphics  grDevices utils     datasets                  
[8] methods   base                                                                        

other attached packages:                                                                  
 [1] GSVAdata_1.26.0      hgu95a.db_3.2.3      org.Hs.eg.db_3.12.0 
 [4] GSVA_1.38.2          GSEABase_1.52.1      graph_1.68.0        
 [7] annotate_1.68.0      XML_3.99-0.5         AnnotationDbi_1.52.0
[10] IRanges_2.24.1       S4Vectors_0.28.1     Biobase_2.50.0      
[13] BiocGenerics_0.36.0  nvimcom_0.9-28       colorout_1.2-2      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  compiler_4.0.3             
 [3] GenomeInfoDb_1.26.2         XVector_0.30.0             
 [5] MatrixGenerics_1.2.1        bitops_1.0-6               
 [7] tools_4.0.3                 zlibbioc_1.36.0            
 [9] bit_4.0.4                   lattice_0.20-41            
[11] RSQLite_2.2.3               memoise_2.0.0              
[13] pkgconfig_2.0.3             rlang_0.4.10               
[15] Matrix_1.3-2                DelayedArray_0.16.1        
[17] DBI_1.1.1                   fastmap_1.1.0              
[19] GenomeInfoDbData_1.2.4      httr_1.4.2                 
[21] vctrs_0.3.6                 grid_4.0.3                 
[23] bit64_4.0.5                 R6_2.5.0                   
[25] BiocParallel_1.24.1         blob_1.2.1                 
[27] matrixStats_0.58.0          GenomicRanges_1.42.0       
[29] SummarizedExperiment_1.20.0 xtable_1.8-4               
[31] RCurl_1.98-1.2              cachem_1.0.3

so, for me to fix the problem, you would have to provide code and data reproducing it.
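
When putting such a reproducible example together, a quick look at how many gene-set identifiers actually occur among the row names of the expression matrix may help narrow things down. What follows is only a sketch in base R: the helper `check_id_overlap` is hypothetical (it is not a GSVA function), and the toy matrix and gene sets are made up to mimic the layout of the data in the question.

```r
# Hypothetical helper (not part of GSVA): count how many gene-set identifiers
# are present among the row names of the expression matrix.
check_id_overlap <- function(expr, gene_sets) {
  stopifnot(is.matrix(expr), !is.null(rownames(expr)), is.list(gene_sets))
  ids <- unique(unlist(gene_sets, use.names = FALSE))
  matched <- intersect(ids, rownames(expr))
  # Number of matched identifiers in each individual gene set
  per_set <- vapply(gene_sets,
                    function(g) sum(unique(g) %in% rownames(expr)),
                    integer(1))
  list(n_ids = length(ids), n_matched = length(matched), per_set = per_set)
}

# Toy data (made up): symbols as row names, samples as columns
expr <- matrix(c(2.59, 10.30, 2.87, 4.17, 2.66, 9.20, 3.08, 5.13),
               nrow = 4,
               dimnames = list(c("A1CF", "A2M", "A2ML1", "A4GALT"),
                               c("sample1", "sample2")))
gene_sets <- list(SET_A = c("A1CF", "A2M", "TP53"),
                  SET_B = c("BRCA1", "BRCA2"))

overlap <- check_id_overlap(expr, gene_sets)
print(overlap$n_matched)  # identifiers found in the expression data
print(overlap$per_set)    # matches per gene set
```

With a GeneSetCollection read by getGmt(), the list of identifiers can be obtained with `geneIds(geneset)`. If `n_matched` is zero, or the `is.matrix()` check fails, the output of `class(exp)` and `head(rownames(exp))` would be useful to include in the report.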

cheers,

robert.


Dear Robert,

Thank you for your help! I ran it again, and this time it worked!

Best wishes
