Only positive enrichment score in GSEA
enee ▴ 10

Hi everyone, I am new to enrichment analysis and I have a problem with the gseGO function of the clusterProfiler package in R. I am analyzing mouse expression data, and when I run the analysis with the code below, only pathways with a positive NES seem to be enriched. What I don't understand is that if I use the human annotation org.Hs.eg.db instead, my results and plots come out as I expect. The first plot comes from the following code with the mouse annotation (org.Mm.eg.db), and the second plot comes from the same code with the human annotation.

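library(clusterProfiler)
library(enrichplot)    # dotplot()
library(ggplot2)       # facet_grid()
library(org.Mm.eg.db)

# Run a GO (Biological Process) GSEA on one results table.
# Column 2 is used as the ranking values and column 7 as the gene names,
# which are passed to gseGO as keyType = "SYMBOL".
GO_gsea <- function(df) {
  geneList <- df[, 2]
  names(geneList) <- as.character(df[, 7])
  geneList <- sort(geneList, decreasing = TRUE)  # gseGO expects a decreasingly sorted vector
  gseGO(geneList = geneList, OrgDb = org.Mm.eg.db, ont = "BP",
        minGSSize = 3, maxGSSize = 800, pvalueCutoff = 0.05, verbose = TRUE,
        keyType = "SYMBOL", pAdjustMethod = "BH", eps = 0)
}

CeO2vsDiff1_gsea <- GO_gsea(CeO2vsDiff1)
dotplot(CeO2vsDiff1_gsea, showCategory = 10, split = ".sign") + facet_grid(. ~ .sign)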

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Rome
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocManager_1.30.22   org.Hs.eg.db_3.17.0   lubridate_1.9.3       forcats_1.0.0        
 [5] stringr_1.5.1         dplyr_1.1.4           purrr_1.0.2           readr_2.1.5          
 [9] tidyr_1.3.1           tibble_3.2.1          tidyverse_2.0.0       org.Mm.eg.db_3.17.0  
[13] AnnotationDbi_1.62.2  IRanges_2.34.1        S4Vectors_0.38.2      Biobase_2.60.0       
[17] BiocGenerics_0.46.0   ggplot2_3.5.0         enrichplot_1.20.3     clusterProfiler_4.8.3

loaded via a namespace (and not attached):
 [1] DBI_1.2.2               bitops_1.0-7            gson_0.1.0             
 [4] shadowtext_0.1.3        gridExtra_2.3           rlang_1.1.3            
 [7] magrittr_2.0.3          DOSE_3.26.2             compiler_4.3.1         
[10] RSQLite_2.3.5           png_0.1-8               vctrs_0.6.5            
[13] reshape2_1.4.4          pkgconfig_2.0.3         crayon_1.5.2           
[16] fastmap_1.1.1           XVector_0.40.0          labeling_0.4.3         
[19] ggraph_2.2.1            utf8_1.2.4              HDO.db_0.99.1          
[22] tzdb_0.4.0              bit_4.0.5               zlibbioc_1.46.0        
[25] cachem_1.0.8            aplot_0.2.2             jsonlite_1.8.8         
[28] GenomeInfoDb_1.36.4     blob_1.2.4              BiocParallel_1.34.2    
[31] tweenr_2.0.3            parallel_4.3.1          R6_2.5.1               
[34] stringi_1.8.3           RColorBrewer_1.1-3      GOSemSim_2.26.1        
[37] Rcpp_1.0.12             downloader_0.4          timechange_0.3.0       
[40] Matrix_1.6-5            splines_4.3.1           igraph_2.0.3           
[43] tidyselect_1.2.1        qvalue_2.32.0           rstudioapi_0.15.0      
[46] viridis_0.6.5           codetools_0.2-19        lattice_0.22-6         
[49] plyr_1.8.9              treeio_1.24.3           withr_3.0.0            
[52] KEGGREST_1.40.1         gridGraphics_0.5-1      scatterpie_0.2.1       
[55] polyclip_1.10-6         Biostrings_2.68.1       pillar_1.9.0           
[58] ggtree_3.8.2            ggfun_0.1.4             generics_0.1.3         
[61] RCurl_1.98-1.14         hms_1.1.3               munsell_0.5.0          
[64] scales_1.3.0            tidytree_0.4.6          glue_1.7.0             
[67] lazyeval_0.2.2          tools_4.3.1             data.table_1.15.2      
[70] fgsea_1.26.0            fs_1.6.3                graphlayouts_1.1.1     
[73] fastmatch_1.1-4         tidygraph_1.3.1         cowplot_1.1.3          
[76] grid_4.3.1              ape_5.7-1               colorspace_2.1-0       
[79] nlme_3.1-164            GenomeInfoDbData_1.2.10 patchwork_1.2.0        
[82] ggforce_0.4.2           cli_3.6.2               fansi_1.0.6            
[85] viridisLite_0.4.2       gtable_0.3.4            yulab.utils_0.1.4      
[88] digest_0.6.35           ggrepel_0.9.5           ggplotify_0.1.2        
[91] farver_2.1.1            memoise_2.0.1           lifecycle_1.0.4        
[94] httr_1.4.7              GO.db_3.17.0            bit64_4.0.5            
[97] MASS_7.3-60.0.1        
Error in exists(cacheKey, where = .rs.WorkingDataEnv, inherits = FALSE) : 
  invalid first argument

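[Plot 1: dotplot of the gseGO result with the mouse annotation (org.Mm.eg.db)]
[Plot 2: dotplot of the gseGO result with the human annotation (org.Hs.eg.db)]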


Note the very low number of genes in the gene sets of the mouse analysis (see the Count circles)!

Based on this, it seems that only very few gene symbols could be mapped to GO categories. In other words, most of your mouse input may be getting ignored. So double-check that your mouse input really consists of SYMBOLs and not, for example, ALIASes. To do so, check manually that all of your input maps to an ENTREZID (this is the central identifier type used in org.Mm.eg.db, which in turn is used under the hood by gseGO).
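A minimal sketch of such a check, in case it helps. The helper `check_symbol_mapping()` and the toy vectors are hypothetical names made up for this illustration and are not part of clusterProfiler or AnnotationDbi. It only counts how many input names are found among the valid SYMBOL keys of the annotation package. As an extra assumption on my side, it also counts names that would match if letter case were ignored, since that is one way symbols can fail to map.

```r
# Hypothetical helper (not part of any package): report how many input gene
# names are found among the valid keys of an annotation package.
check_symbol_mapping <- function(input_symbols, valid_symbols) {
  input_symbols <- unique(as.character(input_symbols))
  input_symbols <- input_symbols[!is.na(input_symbols) & nzchar(input_symbols)]

  exact <- input_symbols %in% valid_symbols
  # Names that match only when letter case is ignored
  case_insensitive <- tolower(input_symbols) %in% tolower(valid_symbols)

  list(
    n_input         = length(input_symbols),
    n_mapped        = sum(exact),
    fraction_mapped = if (length(input_symbols) > 0) mean(exact) else NA_real_,
    n_case_only     = sum(case_insensitive & !exact),
    unmapped        = input_symbols[!exact]
  )
}

# Toy data, made up for illustration only
toy_valid <- c("Trp53", "Actb", "Gapdh", "Myc")
toy_input <- c("Trp53", "ACTB", "Gapdh", "p53", NA)
toy_check <- check_symbol_mapping(toy_input, toy_valid)
print(toy_check$fraction_mapped)
print(toy_check$unmapped)

# With the real annotation (runs only if the packages are installed)
if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  mouse_symbols <- AnnotationDbi::keys(org.Mm.eg.db::org.Mm.eg.db,
                                       keytype = "SYMBOL")
  # Replace 'toy_input' with the names of your own geneList, e.g.
  # check_symbol_mapping(as.character(CeO2vsDiff1[, 7]), mouse_symbols)
  print(check_symbol_mapping(toy_input, mouse_symbols)$n_mapped)
}
```

If `fraction_mapped` is low for your real input, then most genes never enter the analysis, which would fit the small Count values in the mouse plot. A large `n_case_only` would suggest that the names differ from the official symbols only in capitalization.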
