Question

Find genes in pathways enrichment

0

Entering edit mode

yoursbassanio ▴ 10

@yoursbassanio-12717

Last seen 5.4 years ago

Below is the command I used to run Dmrcate and enrichment analysis.

dmrCate<-cpg.annotate("array", BetaValue, what="Beta", arraytype ="EPIC", analysis.type="differential",fdr=0.01) 
dmrcoutput <- dmrcate(dmrCate,lambda=500, C=5, pcutoff="fdr",min.cpgs=1,mc.cores=1)
wgbs.ranges_all <- extractRanges(dmrcoutput, genome = "hg19")

enrichment_GO <- goregion(wgbs.ranges_all,collection = "GO", array.type = "EPIC") enrichment_GO <-
enrichment_GO[order(enrichment_GO$P.DE),]

I have got the following results too

> "ONTOLOGY"    "TERM"  "N" "DE"    "P.DE"  "FDR"
> "GO:0005654"  "CC"    "nucleoplasm"   3501    1911.66666666667    5.78302322878527e-28    1.31685221942669e-23
> "GO:0043231"  "CC"    "intracellular membrane-bounded
> organelle"    11367   5316.54285714286    8.60779986601549e-27    9.80041053745193e-23
> "GO:0043229"  "CC"    "intracellular
> organelle"    13162   6092.37619047619    8.98325926857439e-24    6.25634156984302e-20

Is there any ways I could fetch the genes that are used/involved to identify in each of these pathways?

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /scratch/mv83/Software/DMRcate/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] splines   stats4    parallel  stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
 [2] missMethyl_1.20.3                                  
 [3] celltypes450_0.1                                   
 [4] limma_3.36.5                                       
 [5] DMRcate_1.18.0                                     
 [6] DMRcatedata_1.16.0                                 
 [7] DSS_2.30.0                                         
 [8] bsseq_1.18.0                                       
 [9] minfi_1.28.0                                       
[10] bumphunter_1.24.5                                  
[11] locfit_1.5-9.1                                     
[12] iterators_1.0.10                                   
[13] foreach_1.4.4                                      
[14] Biostrings_2.50.2                                  
[15] XVector_0.22.0                                     
[16] SummarizedExperiment_1.12.0                        
[17] DelayedArray_0.8.0                                 
[18] BiocParallel_1.16.2                                
[19] matrixStats_0.54.0                                 
[20] Biobase_2.42.0                                     
[21] GenomicRanges_1.34.0                               
[22] GenomeInfoDb_1.18.1                                
[23] IRanges_2.16.0                                     
[24] S4Vectors_0.20.1                                   
[25] BiocGenerics_0.28.0                                

loaded via a namespace (and not attached):
  [1] backports_1.1.3                                   
  [2] Hmisc_4.1-1                                       
  [3] plyr_1.8.4                                        
  [4] lazyeval_0.2.2                                    
  [5] ggplot2_3.1.1                                     
  [6] digest_0.6.18                                     
  [7] ensembldb_2.6.3                                   
  [8] htmltools_0.3.6                                   
  [9] GO.db_3.7.0                                       
 [10] magrittr_1.5                                      
 [11] checkmate_1.9.1                                   
 [12] memoise_1.1.0                                     
 [13] BSgenome_1.50.0                                   
 [14] cluster_2.0.8                                     
 [15] readr_1.3.1                                       
 [16] annotate_1.60.0                                   
 [17] R.utils_2.8.0                                     
 [18] askpass_1.1                                       
 [19] siggenes_1.56.0                                   
 [20] prettyunits_1.0.2                                 
 [21] colorspace_1.4-1                                  
 [22] blob_1.1.1                                        
 [23] BiasedUrn_1.07                                    
 [24] xfun_0.6                                          
 [25] dplyr_0.8.0.1                                     
 [26] crayon_1.3.4                                      
 [27] RCurl_1.95-4.12                                   
 [28] genefilter_1.64.0                                 
 [29] GEOquery_2.50.0                                   
 [30] IlluminaHumanMethylationEPICmanifest_0.3.0        
 [31] survival_2.44-1.1                                 
 [32] VariantAnnotation_1.28.3                          
 [33] glue_1.3.1                                        
 [34] ruv_0.9.6                                         
 [35] registry_0.5                                      
 [36] gtable_0.3.0                                      
 [37] zlibbioc_1.28.0                                   
 [38] Rhdf5lib_1.4.3                                    
 [39] HDF5Array_1.10.1                                  
 [40] scales_1.0.0                                      
 [41] DBI_1.0.0                                         
 [42] rngtools_1.3.1                                    
 [43] bibtex_0.4.2                                      
 [44] Rcpp_1.0.1                                        
 [45] xtable_1.8-3                                      
 [46] progress_1.2.0                                    
 [47] htmlTable_1.13.1                                  
 [48] foreign_0.8-71                                    
 [49] bit_1.1-14                                        
 [50] mclust_5.4.3                                      
 [51] preprocessCore_1.44.0                             
 [52] Formula_1.2-3                                     
 [53] htmlwidgets_1.3                                   
 [54] httr_1.4.0                                        
 [55] RColorBrewer_1.1-2                                
 [56] acepack_1.4.1                                     
 [57] pkgconfig_2.0.2                                   
 [58] reshape_0.8.8                                     
 [59] XML_3.98-1.16                                     
 [60] R.methodsS3_1.7.1                                 
 [61] Gviz_1.26.4                                       
 [62] nnet_7.3-12                                       
 [63] tidyselect_0.2.5                                  
 [64] rlang_0.3.3                                       
 [65] AnnotationDbi_1.44.0                              
 [66] munsell_0.5.0                                     
 [67] tools_3.5.1                                       
 [68] RSQLite_2.1.1                                     
 [69] stringr_1.4.0                                     
 [70] org.Hs.eg.db_3.7.0                                
 [71] knitr_1.22                                        
 [72] bit64_0.9-7                                       
 [73] beanplot_1.2                                      
 [74] methylumi_2.28.0                                  
 [75] purrr_0.3.2                                       
 [76] AnnotationFilter_1.6.0                            
 [77] nlme_3.1-137                                      
 [78] doRNG_1.7.1                                       
 [79] nor1mix_1.2-3                                     
 [80] R.oo_1.22.0                                       
 [81] xml2_1.2.0                                        
 [82] biomaRt_2.38.0                                    
 [83] compiler_3.5.1                                    
 [84] rstudioapi_0.10                                   
 [85] curl_3.3                                          
 [86] statmod_1.4.30                                    
 [87] tibble_2.1.1                                      
 [88] stringi_1.4.3                                     
 [89] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
 [90] GenomicFeatures_1.34.1                            
 [91] lattice_0.20-38                                   
 [92] ProtGenerics_1.14.0                               
 [93] Matrix_1.2-17                                     
 [94] IlluminaHumanMethylation450kmanifest_0.4.0        
 [95] permute_0.9-5                                     
 [96] multtest_2.38.0                                   
 [97] pillar_1.3.1                                      
 [98] data.table_1.12.2                                 
 [99] bitops_1.0-6                                      
[100] rtracklayer_1.42.2                                
[101] R6_2.4.0                                          
[102] latticeExtra_0.6-28                               
[103] gridExtra_2.3                                     
[104] codetools_0.2-16                                  
[105] dichromat_2.0-0                                   
[106] MASS_7.3-51.3                                     
[107] gtools_3.8.1                                      
[108] assertthat_0.2.1                                  
[109] rhdf5_2.26.2                                      
[110] openssl_1.3                                       
[111] pkgmaker_0.27                                     
[112] withr_2.1.2                                       
[113] GenomicAlignments_1.18.1                          
[114] Rsamtools_1.34.0                                  
[115] GenomeInfoDbData_1.1.0                            
[116] hms_0.4.2                                         
[117] quadprog_1.5-7                                    
[118] grid_3.5.1                                        
[119] rpart_4.1-13                                      
[120] tidyr_0.8.3                                       
[121] base64_2.0                                        
[122] DelayedMatrixStats_1.4.0                          
[123] illuminaio_0.24.0                                 
[124] biovizBase_1.30.1                                 
[125] base64enc_0.1-3

dmrcate enrichment analysis epic array goregion missmethyl • 2.1k views

ADD COMMENT • link updated 4.5 years ago by Jovana Maksimovic ▴ 20 • written 5.5 years ago by yoursbassanio ▴ 10

0

Entering edit mode

Hi,

Can one please help me on this please. I could provide more information if needed

ADD REPLY • link 5.5 years ago yoursbassanio ▴ 10

0

Entering edit mode

Hi Bassanio,

I'm noticing that your object is called wgbs_ranges_all, yet you specify arraytype="EPIC" in your call to goregion(). Your enrichment terms won't be correct because goregion() assumes a background of CpGs from the EPIC or 450K arrays, not from the entire human genome. To my knowledge there isn't a routine that corrects for varying numbers of CpGs mapped to genes for WGBS, but if you're happy to retain that bias and simply assume that each gene in the genome has an equal chance of being differentially methylated, then perhaps something like the RITAN package to find term enrichment may be sufficient:

library(RITAN)
library(RITANdata)
dmgenes <- unique(unlist(lapply(wgbs_ranges_all$overlapping.genes, function (x) strsplit(x, ", "))))
enrichment <- term_enrichment(dmgenes, resources=names(geneset_list), report_resources_separately = TRUE)

Note that RITAN will, as you requested, report the genes involved in the results.

Cheers, Tim

ADD REPLY • link 5.5 years ago Tim Peters ▴ 230

0

Entering edit mode

Hi,

My data is from Epic array and not bi-sulphate sequencing.

ADD REPLY • link 5.4 years ago yoursbassanio ▴ 10

0

Entering edit mode

Hi Bassanio,

Looking at your code properly I can see that now. Your naming convention threw me though - why would you name the GRanges object wgbs.ranges_all if it came from EPIC? Nevermind.

You should be able to get the gene names this way:

1) Get your subset of CpGs from your DMRs using Jim's method from your previous post: https://support.bioconductor.org/p/125350/ - let's call this character vector sigCpGs.

2) Use missMethyl::getMappedEntrezIDs() to get your Entrez IDs for the relevant genes:

entrezids <- getMappedEntrezIDs(sig.cpg = sigCpGs, all.cpg = rownames(BetaValue), array.type = "EPIC")$sig.eg

3) Use biomaRt to convert to hgnc:

library(biomaRt)
ensembl <- useMart("ensembl",dataset="hsapiens_gene_ensembl")
bm <- getBM(attributes=c('entrezgene_id', 'hgnc_symbol'), filters = 'entrezgene_id', values = entrezids, mart = ensembl)
mygenes <- bm$hgnc_symbol

Presto.

Good luck, Tim

ADD REPLY • link 5.4 years ago Tim Peters ▴ 230

0

Entering edit mode

Hi,

My data is from Epic array and not bi-sulphate sequencing.

ADD REPLY • link 5.4 years ago yoursbassanio ▴ 10

score 0 · Answer 1 · 2021-07-12

Hi,

We recently implemented a parameter in gometh, gsameth, goregion and gsaregion, sig.genes, allowing you to output an extra column in the results table showing the genes in the gene set that overlap the DM CpGs or CpGs underlying your regions. It is set to FALSE by default.

Ensure that you have the latest version of missMethyl installed to be able to use it.

Cheers, Jovana