Looking for equivalent EPIC chip v1 vs. v2 annotation variables
Courtney ▴ 10
@e556e97d
Last seen 23 months ago
United States

Hello,

I am conducting an EWAS of methylation data from human adult whole blood samples. The data are from the Illumina EPIC chip v2, and I'm using the IlluminaHumanMethylationEPICv2anno.20a1.hg38 package from Zuguang Gu (Thanks, again!) to annotate the results.

There are some variables from the EPIC v1 annotation (IlluminaHumanMethylationEPICanno.ilm10b4.hg19) that I would like to pull for downstream analysis but that are not included in the v2 annotation, including UCSC_RefGene_Group and GencodeBasicV12_Group.

I'm wondering if there are equivalent/similar variables in the v2 annotation or, if not, a way to obtain these?

Thank you very much!

Courtney


sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
 [2] sandwich_3.0-2                                     
 [3] sfsmisc_1.1-16                                     
 [4] lmtest_0.9-40                                      
 [5] zoo_1.8-12                                         
 [6] MASS_7.3-60                                        
 [7] janitor_2.2.0                                      
 [8] shinyMethyl_1.36.1                                 
 [9] methylclock_1.6.0                                  
[10] quadprog_1.5-8                                     
[11] devtools_2.4.5                                     
[12] usethis_2.2.2                                      
[13] methylclockData_1.8.1                              
[14] futile.logger_1.4.3                                
[15] wateRmelon_2.6.0                                   
[16] illuminaio_0.42.0                                  
[17] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1 
[18] ROC_1.76.0                                         
[19] lumi_2.52.0                                        
[20] methylumi_2.46.0                                   
[21] FDb.InfiniumMethylation.hg19_2.2.0                 
[22] org.Hs.eg.db_3.17.0                                
[23] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2            
[24] GenomicFeatures_1.52.2                             
[25] AnnotationDbi_1.62.2                               
[26] reshape2_1.4.4                                     
[27] scales_1.2.1                                       
[28] limma_3.56.2                                       
[29] sesame_1.18.4                                      
[30] sesameData_1.18.0                                  
[31] FlowSorted.BloodExtended.EPIC_1.1.1                
[32] FlowSorted.Blood.EPIC_2.4.2                        
[33] IlluminaHumanMethylationEPICv2anno.20a1.hg38_0.99.0
[34] IlluminaHumanMethylationEPICv2manifest_0.99.1      
[35] ExperimentHub_2.8.1                                
[36] AnnotationHub_3.8.0                                
[37] BiocFileCache_2.8.0                                
[38] dbplyr_2.3.3                                       
[39] data.table_1.14.8                                  
[40] minfi_1.46.0                                       
[41] bumphunter_1.42.0                                  
[42] locfit_1.5-9.8                                     
[43] iterators_1.0.14                                   
[44] foreach_1.5.2                                      
[45] Biostrings_2.68.1                                  
[46] XVector_0.40.0                                     
[47] SummarizedExperiment_1.30.2                        
[48] Biobase_2.60.0                                     
[49] MatrixGenerics_1.12.3                              
[50] matrixStats_1.0.0                                  
[51] GenomicRanges_1.52.0                               
[52] GenomeInfoDb_1.36.4                                
[53] IRanges_2.34.1                                     
[54] S4Vectors_0.38.1                                   
[55] BiocGenerics_0.46.0                                
[56] pwr_1.3-0                                          
[57] readxl_1.4.3                                       
[58] lubridate_1.9.2                                    
[59] forcats_1.0.0                                      
[60] stringr_1.5.0                                      
[61] dplyr_1.1.2                                        
[62] purrr_1.0.2                                        
[63] readr_2.1.4                                        
[64] tidyr_1.3.0                                        
[65] tibble_3.2.1                                       
[66] ggplot2_3.4.4                                      
[67] tidyverse_2.0.0                                    

loaded via a namespace (and not attached):
  [1] fs_1.6.3                     
  [2] bitops_1.0-7                 
  [3] httr_1.4.7                   
  [4] RColorBrewer_1.1-3           
  [5] dynamicTreeCut_1.63-1        
  [6] backports_1.4.1              
  [7] profvis_0.3.8                
  [8] tools_4.3.1                  
  [9] doRNG_1.8.6                  
 [10] utf8_1.2.3                   
 [11] R6_2.5.1                     
 [12] HDF5Array_1.28.1             
 [13] mgcv_1.9-0                   
 [14] rhdf5filters_1.12.1          
 [15] urlchecker_1.0.1             
 [16] withr_2.5.0                  
 [17] gridExtra_2.3                
 [18] prettyunits_1.1.1            
 [19] base64_2.0.1                 
 [20] preprocessCore_1.62.1        
 [21] quantreg_5.97                
 [22] cli_3.6.1                    
 [23] pacman_0.5.1                 
 [24] formatR_1.14                 
 [25] AnnotationHubData_1.30.0     
 [26] genefilter_1.82.1            
 [27] askpass_1.2.0                
 [28] Rsamtools_2.16.0             
 [29] siggenes_1.74.0              
 [30] stringdist_0.9.10            
 [31] AnnotationForge_1.42.2       
 [32] sessioninfo_1.2.2            
 [33] scrime_1.3.5                 
 [34] impute_1.74.1                
 [35] rstudioapi_0.15.0            
 [36] RSQLite_2.3.1                
 [37] generics_0.1.3               
 [38] BiocIO_1.10.0                
 [39] car_3.1-2                    
 [40] Matrix_1.6-1                 
 [41] fansi_1.0.4                  
 [42] abind_1.4-5                  
 [43] lifecycle_1.0.3              
 [44] yaml_2.3.7                   
 [45] snakecase_0.11.1             
 [46] ggpmisc_0.5.4-1              
 [47] carData_3.0-5                
 [48] rhdf5_2.44.0                 
 [49] biocViews_1.68.2             
 [50] grid_4.3.1                   
 [51] blob_1.2.4                   
 [52] promises_1.2.1               
 [53] crayon_1.5.2                 
 [54] miniUI_0.1.1.1               
 [55] lattice_0.21-8               
 [56] annotate_1.78.0              
 [57] KEGGREST_1.40.1              
 [58] pillar_1.9.0                 
 [59] knitr_1.45                   
 [60] beanplot_1.3.1               
 [61] rjson_0.2.21                 
 [62] codetools_0.2-19             
 [63] glue_1.6.2                   
 [64] remotes_2.4.2.1              
 [65] ExperimentHubData_1.26.1     
 [66] vctrs_0.6.3                  
 [67] png_0.1-8                    
 [68] cellranger_1.1.0             
 [69] gtable_0.3.4                 
 [70] ggpp_0.5.4                   
 [71] cachem_1.0.8                 
 [72] xfun_0.40                    
 [73] S4Arrays_1.0.6               
 [74] mime_0.12                    
 [75] survival_3.5-7               
 [76] interactiveDisplayBase_1.38.0
 [77] ellipsis_0.3.2               
 [78] nlme_3.1-163                 
 [79] xts_0.13.1                   
 [80] bit64_4.0.5                  
 [81] progress_1.2.2               
 [82] filelock_1.0.2               
 [83] nor1mix_1.3-0                
 [84] affyio_1.70.0                
 [85] KernSmooth_2.23-22           
 [86] colorspace_2.1-0             
 [87] DBI_1.1.3                    
 [88] tidyselect_1.2.0             
 [89] processx_3.8.2               
 [90] bit_4.0.5                    
 [91] compiler_4.3.1               
 [92] curl_5.0.2                   
 [93] graph_1.78.0                 
 [94] BiocCheck_1.36.1             
 [95] SparseM_1.81                 
 [96] xml2_1.3.5                   
 [97] RPMM_1.25                    
 [98] DelayedArray_0.26.7          
 [99] rtracklayer_1.60.1           
[100] affy_1.78.2                  
[101] RBGL_1.76.0                  
[102] callr_3.7.3                  
[103] rappdirs_0.3.3               
[104] digest_0.6.33                
[105] rmarkdown_2.24               
[106] GEOquery_2.68.0              
[107] htmltools_0.5.6              
[108] pkgconfig_2.0.3              
[109] sparseMatrixStats_1.12.2     
[110] fastmap_1.1.1                
[111] htmlwidgets_1.6.2            
[112] rlang_1.1.1                  
[113] shiny_1.7.5                  
[114] DelayedMatrixStats_1.22.6    
[115] jsonlite_1.8.7               
[116] BiocParallel_1.34.2          
[117] mclust_6.0.0                 
[118] wheatmap_0.2.0               
[119] RCurl_1.98-1.12              
[120] magrittr_2.0.3               
[121] polynom_1.4-1                
[122] GenomeInfoDbData_1.2.10      
[123] Rhdf5lib_1.22.1              
[124] munsell_0.5.0                
[125] Rcpp_1.0.11                  
[126] stringi_1.7.12               
[127] nleqslv_3.3.4                
[128] PerformanceAnalytics_2.0.4   
[129] zlibbioc_1.46.0              
[130] plyr_1.8.8                   
[131] pkgbuild_1.4.2               
[132] splines_4.3.1                
[133] multtest_2.56.0              
[134] hms_1.1.3                    
[135] ps_1.7.5                     
[136] ggpubr_0.6.0                 
[137] RUnit_0.4.32                 
[138] ggsignif_0.6.4               
[139] rngtools_1.5.2               
[140] pkgload_1.3.2.1              
[141] biomaRt_2.56.1               
[142] futile.options_1.0.1         
[143] BiocVersion_3.17.1           
[144] XML_3.99-0.14                
[145] evaluate_0.21                
[146] lambda.r_1.2.4               
[147] BiocManager_1.30.22          
[148] tzdb_0.4.0                   
[149] httpuv_1.6.11                
[150] MatrixModels_0.5-2           
[151] openssl_2.1.0                
[152] reshape_0.8.9                
[153] broom_1.0.5                  
[154] xtable_1.8-4                 
[155] restfulr_0.0.15              
[156] rstatix_0.7.2                
[157] later_1.3.1                  
[158] OrganismDbi_1.42.0           
[159] memoise_2.0.1                
[160] GenomicAlignments_1.36.0     
[161] cluster_2.1.4                
[162] writexl_1.4.2                
[163] timechange_0.2.0
sesame IlluminaHumanMethylationEPICv2anno.20a1.hg38 methylationEPICv2.0 EpicV2 • 2.2k views

Hello,

I have a question related to this post: how can I combine EPIC v1 and v2 arrays? I generated separate mVals matrices for my EPIC v1 and v2 data sets, but I am not sure how to proceed with matching the different probe names.

Thanks a lot!


For what it's worth, sesame has a liftOver function that maps EPICv1 to v2 or backward: https://github.com/zwdzwd/sesame/blob/2d5c2ab371430a8ecb1b5f09792457505a59d192/R/impute.R#L44 Hope it's helpful.
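
If you only need the probes shared by both arrays, a plain base-R alternative is sketched below. This is not the sesame function linked above. It assumes that EPIC v2 row names carry a suffix after the first underscore (as in cg25324105_BC11), that v1 row names have no suffix, and that averaging replicate v2 probes is acceptable for your analysis. The function names and the toy probe IDs are hypothetical, and batch effects between array versions are not addressed here.

```r
# Hypothetical helper: strip the EPICv2 suffix (e.g. "_BC11") to get the base probe name.
base_id <- function(ids) sub("_.*$", "", ids)

# Hypothetical helper: collapse replicate v2 probes by averaging, then keep
# only the probes present on both arrays and bind the samples side by side.
combine_v1_v2 <- function(m_v1, m_v2) {
  grp <- base_id(rownames(m_v2))
  sums <- rowsum(m_v2, group = grp, reorder = FALSE)   # per-probe sums
  counts <- as.vector(table(factor(grp, levels = rownames(sums))))
  m_v2_collapsed <- sums / counts                       # per-probe means
  shared <- intersect(rownames(m_v1), rownames(m_v2_collapsed))
  cbind(m_v1[shared, , drop = FALSE],
        m_v2_collapsed[shared, , drop = FALSE])
}

# Toy M-value matrices (made-up IDs and values, for illustration only).
m_v1 <- matrix(c(1, 2, 3, 4), nrow = 2,
               dimnames = list(c("cg00000001", "cg00000002"),
                               c("v1_s1", "v1_s2")))
m_v2 <- matrix(c(1, 3, 5, 2, 4, 6), nrow = 3,
               dimnames = list(c("cg00000001_TC11", "cg00000001_TC12",
                                 "cg00000003_BC11"),
                               c("v2_s1", "v2_s2")))

combined <- combine_v1_v2(m_v1, m_v2)
print(combined)
```

Any missing values in a replicate will propagate to the averaged row, so you may want to handle those before collapsing.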


Thanks a lot! I'll take a look at it!


Hello again! A somewhat related question: I read the sesame manual but could not find a way to plot the beta value index. I am referring to the numbers from 1 to 0 and the colors associated with them. Thanks a lot, and sorry for the naive question!

@james-w-macdonald-5106
Last seen 20 hours ago
United States

I believe all of that information comes from the EPIC manifest file from Illumina.

> z <- read.csv("EPIC-8v2-0_A1.csv", skip = 7)
> names(z)
 [1] "IlmnID"                       "Name"                         "AddressA_ID"                 
 [4] "AlleleA_ProbeSeq"             "AddressB_ID"                  "AlleleB_ProbeSeq"            
 [7] "Next_Base"                    "Color_Channel"                "col"                         
[10] "Probe_Type"                   "Strand_FR"                    "Strand_TB"                   
[13] "Strand_CO"                    "Infinium_Design"              "Infinium_Design_Type"        
[16] "CHR"                          "MAPINFO"                      "Species"                     
[19] "Genome_Build"                 "Source_Seq"                   "Forward_Sequence"            
[22] "Top_Sequence"                 "Rep_Num"                      "UCSC_RefGene_Group"          
[25] "UCSC_RefGene_Name"            "UCSC_RefGene_Accession"       "UCSC_CpG_Islands_Name"       
[28] "Relation_to_UCSC_CpG_Island"  "GencodeV41_Group"             "GencodeV41_Name"             
[31] "GencodeV41_Accession"         "Phantom5_Enhancers"           "HMM_Island"                  
[34] "Regulatory_Feature_Name"      "Regulatory_Feature_Group"     "X450k_Enhancer"              
[37] "DMR"                          "DNase_Hypersensitivity_NAME"  "Encode_CisReg_Site"          
[40] "Encode_CisReg_Site_Evid"      "OpenChromatin_NAME"           "OpenChromatin_Evidence_Count"
[43] "Methyl450_Loci"               "Methyl27_Loci"                "EPICv1_Loci"                 
[46] "Manifest_probe_match"         "SNP_ID"                       "SNP_DISTANCE"                
[49] "SNP_MinorAlleleFrequency"    
> head(z[,c(1,24)])
           IlmnID          UCSC_RefGene_Group
1 cg25324105_BC11 TSS200;TSS200;TSS200;TSS200
2 cg25383568_TC11             exon_18;exon_18
3 cg25455143_BC11                            
4 cg25459778_BC11                            
5 cg25487775_BC11                            
6 cg25595446_BC11

From there, it should be a simple matter of matching the IlmnID column from that CSV to the probe IDs in your data.
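
A minimal sketch of that matching step, assuming your results are a data.frame whose row names are the suffixed EPIC v2 probe IDs (the same format as IlmnID). The helper name annotate_results, the toy results table, and the probe ID cg00000000_XX11 are hypothetical. The toy manifest only reuses the first few IlmnID and UCSC_RefGene_Group values shown above; in practice you would pass the full table read from the CSV.

```r
# Hypothetical helper: attach manifest annotation columns to a results table,
# matching row names of `results` against the manifest's probe ID column.
annotate_results <- function(results, manifest,
                             cols = "UCSC_RefGene_Group",
                             id_col = "IlmnID") {
  idx <- match(rownames(results), manifest[[id_col]])  # NA if probe not found
  ann <- manifest[idx, cols, drop = FALSE]
  rownames(ann) <- rownames(results)
  cbind(results, ann)
}

# In practice (assumes the manifest CSV is in the working directory):
# manifest <- read.csv("EPIC-8v2-0_A1.csv", skip = 7)

# Toy stand-in for the manifest, using values from the output above.
manifest <- data.frame(
  IlmnID = c("cg25324105_BC11", "cg25383568_TC11", "cg25455143_BC11"),
  UCSC_RefGene_Group = c("TSS200;TSS200;TSS200;TSS200", "exon_18;exon_18", ""),
  stringsAsFactors = FALSE
)

# Toy EWAS results (made-up effect sizes; the last probe ID is not in the manifest).
results <- data.frame(
  logFC = c(0.12, -0.30, 0.05),
  row.names = c("cg25383568_TC11", "cg25324105_BC11", "cg00000000_XX11")
)

annotated <- annotate_results(results, manifest)
print(annotated)
```

Other manifest columns, such as GencodeV41_Group, can be requested through the cols argument. Probes that are not found in the manifest get NA in the annotation columns, so it is worth checking for those afterwards.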
