gsameth GSA: probe annotation to multiple genes
nurkenber • 0

I have performed gene set analysis (GSA) using gsameth from the missMethyl package on EPICv2 array data. I noticed that probes annotated to more than one gene get completely excluded from the analysis. I found the cause and corrected it locally, but I am curious whether this is an isolated case. The problem lies in one of the subfunctions of gsameth.

In gsameth => getMappedEntrezIDs => .getFlatAnnotation

# This is the way I tested it
Anno <- getAnnotation(IlluminaHumanMethylationEPICv2anno.20a1.hg38)
flat_test <- .getFlatAnnotation(array.type = "EPIC_V2", anno = Anno) 
> head(rownames(flat_test))
[1] "cg25324105_BC111" "cg25383568_TC111" "cg25623721_TC111" "cg25898577_BC11"  "cg25908985_BC11"  "cg25910443_TC111"

# And this is the line in .getFlatAnnotation where the problem is located
flat <- data.frame(symbol = unlist(geneslist), group = unlist(grouplist))

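This converts the list into a data frame incorrectly for probes with multiple genes. When a list element holds more than one value, unlist() appends a sequence number to that element's name, and those altered names become the row names. A probe such as cg25324105_BC11 therefore turns into cg25324105_BC111 (and cg25324105_BC112, and so on), which no longer matches the original probe ID.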
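
The renaming can be reproduced in base R without missMethyl. This is only a minimal sketch: the probe IDs and gene symbols below are made up for illustration and are not taken from the EPICv2 annotation.

```r
# Toy stand-in for the per-probe gene lists (hypothetical IDs and symbols)
geneslist <- list(
  cg0001_BC11 = "GENEA",              # probe annotated to one gene
  cg0002_BC11 = c("GENEB", "GENEC")   # probe annotated to two genes
)

# unlist() keeps the name of single-element entries, but appends
# 1, 2, ... to the name of entries with more than one element
flat_names <- names(unlist(geneslist))
print(flat_names)

# Which flattened names still match an original probe ID?
still_valid <- flat_names %in% names(geneslist)
print(still_valid)
```

Only the single-gene probe keeps a name that matches its original ID, which is consistent with multi-gene probes dropping out when the row names are later matched against probe IDs.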

Then I decided to change it to

  flat <- data.frame(
    rowname = rep(names(geneslist), lengths(geneslist)), 
    symbol = unlist(geneslist), 
    group = unlist(grouplist))

> head(flat$rowname)
[1] "cg00381604_BC11" "cg00381604_BC11" "cg00381604_BC11" "cg00381604_BC11" "cg00381604_BC11" "cg21870274_BC21"
# And in subsequent lines I changed rownames(flat) to flat$rowname
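
For anyone who wants to check the idea without loading the annotation, here is a small sketch of the same fix on toy data. The probe IDs, gene symbols and group labels are hypothetical. The use.names = FALSE arguments are my own addition to keep the altered names out of the data frame's row names; they are not part of the change described above.

```r
# Toy inputs shaped like geneslist / grouplist (hypothetical values)
geneslist <- list(
  cg0001_BC11 = "GENEA",
  cg0002_BC11 = c("GENEB", "GENEC")
)
grouplist <- list(
  cg0001_BC11 = "TSS200",
  cg0002_BC11 = c("Body", "TSS1500")
)

# Repeat each probe ID once per annotated gene, so every row carries
# the unmodified probe ID in an explicit column
flat <- data.frame(
  rowname = rep(names(geneslist), lengths(geneslist)),
  symbol  = unlist(geneslist, use.names = FALSE),
  group   = unlist(grouplist, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(flat)

# Every row can now be matched back to an original probe ID
all_matched <- all(flat$rowname %in% names(geneslist))
print(all_matched)
```

Keeping the probe ID in its own column also avoids relying on data frame row names, which cannot hold the same ID more than once.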
# sessionInfo()
R Under development (unstable) (2024-12-20 r87452 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tibble_3.2.1                                        ggplot2_3.5.1                                      
 [3] stringr_1.5.1                                       rtracklayer_1.67.0                                 
 [5] org.Hs.eg.db_3.20.0                                 AnnotationDbi_1.69.0                               
 [7] qusage_2.41.0                                       missMethyl_1.41.0                                  
 [9] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0 IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1 
[11] DMRcatedata_2.25.0                                  ExperimentHub_2.15.0                               
[13] AnnotationHub_3.15.0                                BiocFileCache_2.15.0                               
[15] dbplyr_2.5.0                                        DMRcate_3.3.1                                      
[17] limma_3.63.2                                        readxl_1.4.3                                       
[19] readr_2.1.5                                         dplyr_1.1.4                                        
[21] data.table_1.16.4                                   IlluminaHumanMethylationEPICv2anno.20a1.hg38_1.0.0 
[23] IlluminaHumanMethylationEPICv2manifest_1.0.0        minfi_1.53.1                                       
[25] bumphunter_1.49.0                                   locfit_1.5-9.10                                    
[27] iterators_1.0.14                                    foreach_1.5.2                                      
[29] Biostrings_2.75.3                                   XVector_0.47.0                                     
[31] SummarizedExperiment_1.37.0                         Biobase_2.67.0                                     
[33] MatrixGenerics_1.19.0                               matrixStats_1.4.1                                  
[35] GenomicRanges_1.59.1                                GenomeInfoDb_1.43.2                                
[37] IRanges_2.41.2                                      S4Vectors_0.45.2                                   
[39] BiocGenerics_0.53.3                                 generics_0.1.3                                     

loaded via a namespace (and not attached):
  [1] ProtGenerics_1.39.1       bitops_1.0-9              httr_1.4.7                RColorBrewer_1.1-3        tools_4.5.0              
  [6] doRNG_1.8.6               backports_1.5.0           R6_2.5.1                  HDF5Array_1.35.2          lazyeval_0.2.2           
 [11] Gviz_1.51.0               rhdf5filters_1.19.0       permute_0.9-7             withr_3.0.2               prettyunits_1.2.0        
 [16] gridExtra_2.3             base64_2.0.2              preprocessCore_1.69.0     cli_3.6.3                 labeling_0.4.3           
 [21] mvtnorm_1.3-2             genefilter_1.89.0         tidytable_0.11.2          askpass_1.2.1             Rsamtools_2.23.1         
 [26] foreign_0.8-87            siggenes_1.81.0           illuminaio_0.49.0         R.utils_2.12.3            rentrez_1.2.3            
 [31] dichromat_2.0-0.1         scrime_1.3.5              BSgenome_1.75.0           rstudioapi_0.17.1         RSQLite_2.3.9            
 [36] BiocIO_1.17.1             gtools_3.9.5              Matrix_1.7-1              interp_1.1-6              abind_1.4-8              
 [41] R.methodsS3_1.8.2         lifecycle_1.0.4           yaml_2.3.10               edgeR_4.5.1               rhdf5_2.51.1             
 [46] SparseArray_1.7.2         grid_4.5.0                blob_1.2.4                crayon_1.5.3              lattice_0.22-6           
 [51] beachmat_2.23.5           GenomicFeatures_1.59.1    annotate_1.85.0           KEGGREST_1.47.0           pillar_1.10.0            
 [56] knitr_1.49                beanplot_1.3.1            rjson_0.2.23              fftw_1.0-9                estimability_1.5.1       
 [61] codetools_0.2-20          glue_1.8.0                remotes_2.5.0             vctrs_0.6.5               png_0.1-8                
 [66] cellranger_1.1.0          gtable_0.3.6              cachem_1.1.0              xfun_0.49                 S4Arrays_1.7.1           
 [71] mime_0.12                 survival_3.8-3            statmod_1.5.0             nlme_3.1-166              bit64_4.5.2              
 [76] bsseq_1.43.1              progress_1.2.3            filelock_1.0.3            nor1mix_1.3-3             rpart_4.1.23             
 [81] colorspace_2.1-1          DBI_1.2.3                 Hmisc_5.2-1               nnet_7.3-19               tidyselect_1.2.1         
 [86] emmeans_1.10.6            bit_4.5.0.1               compiler_4.5.0            curl_6.0.1                httr2_1.0.7              
 [91] htmlTable_2.4.3           BiasedUrn_2.0.12          xml2_1.3.6                DelayedArray_0.33.3       checkmate_2.3.2          
 [96] scales_1.3.0              quadprog_1.5-8            rappdirs_0.3.3            digest_0.6.37             rmarkdown_2.29           
[101] GEOquery_2.75.0           htmltools_0.5.8.1         pkgconfig_2.0.3           jpeg_0.1-10               base64enc_0.1-3          
[106] sparseMatrixStats_1.19.0  fastmap_1.2.0             ensembldb_2.31.0          rlang_1.1.4               htmlwidgets_1.6.4        
[111] UCSC.utils_1.3.0          DelayedMatrixStats_1.29.0 farver_2.1.2              jsonlite_1.8.9            BiocParallel_1.41.0      
[116] mclust_6.1.1              R.oo_1.27.0               VariantAnnotation_1.53.0  RCurl_1.98-1.16           magrittr_2.0.3           
[121] Formula_1.2-5             GenomeInfoDbData_1.2.13   Rhdf5lib_1.29.0           munsell_0.5.1             Rcpp_1.0.13-1            
[126] stringi_1.8.4             zlibbioc_1.53.0           MASS_7.3-61               plyr_1.8.9                deldir_2.0-4             
[131] splines_4.5.0             multtest_2.63.0           hms_1.1.3                 rngtools_1.5.2            biomaRt_2.63.0           
[136] BiocVersion_3.21.1        XML_3.99-0.17             evaluate_1.0.1            latticeExtra_0.6-30       biovizBase_1.55.0        
[141] BiocManager_1.30.25       tzdb_0.4.0                tidyr_1.3.1               openssl_2.3.0             purrr_1.0.2              
[146] reshape_0.8.9             xtable_1.8-4              restfulr_0.0.15           AnnotationFilter_1.31.0   memoise_2.0.1            
[151] GenomicAlignments_1.43.0  cluster_2.1.8
missMethyl gsameth • 295 views