I am new to methylation analysis and am currently using sesame to look at differential methylation in a human population with and without a treatment. The experiment was run on the Illumina EPICv2 array.
summary = DML(se, ~Treatment, BPPARAM = BiocParallel::MulticoreParam(4))
test_result = summaryExtractTest(summary)
I was following Illumina's YouTube series, but some things have changed. In particular, I am looking at the top differentially methylated sites and would like to know which chromosome, region, etc. they are in. If I look at the top probes from test_result:
Probe_ID    Est_X.Intercept.  Est_TreatmentNegative  Pval_X.Intercept.  Pval_TreatmentNegative  FPval_Treatment  Eff_Treatment
cg06025456  0.8529257         -0.6830048             6.708460e-04       0.002604867             0.002604867      0.6830048
cg13934406  0.7866247         -0.6451398             5.992517e-05       0.000404208             0.000404208      0.6451398
cg26861374  0.9101138         -0.6367773             2.802338e-02       0.097658895             0.097658895      0.6367773
Are there tools in sesame to assign these probes to their corresponding chromosomes and genes? There seem to be some tools for visualization, but I don't know how to look up chromosomes or genes.
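One way to approach this lookup, as a minimal sketch rather than sesame's documented workflow, is to match the probe IDs in test_result against a probe annotation table. The toy `anno` table here is hypothetical: its IDs only mimic EPICv2-style suffixed names, and its coordinates and gene labels are placeholders, not real annotation. The commented-out `sesameData_getManifestGRanges("EPICv2")` call is an assumption about how a real table might be obtained and should be checked against the installed sesameData version.

```r
# Stand-in for the DML result; IDs and estimates copied from the post.
test_result <- data.frame(
  Probe_ID = c("cg06025456", "cg13934406", "cg26861374"),
  Est_TreatmentNegative = c(-0.6830048, -0.6451398, -0.6367773),
  stringsAsFactors = FALSE
)

# Hypothetical annotation table. seqnames/start/gene are PLACEHOLDERS,
# not real coordinates or gene names.
anno <- data.frame(
  Manifest_ID = c("cg06025456_BC11", "cg13934406_TC21",
                  "cg26861374_BC21", "cg00000000_TC11"),
  seqnames = c("chrA", "chrB", "chrC", "chrD"),
  start = c(100L, 200L, 300L, 400L),
  gene = c("GENE_A", "GENE_B", NA, "GENE_D"),
  stringsAsFactors = FALSE
)
# Assumption: with sesameData installed, a real table might be built as
#   gr   <- sesameData::sesameData_getManifestGRanges("EPICv2")
#   anno <- data.frame(Manifest_ID = names(gr),
#                      as.data.frame(gr, row.names = NULL))
# Gene names would likely need a separate gene annotation source.

# Strip any "_suffix" so manifest IDs match the collapsed IDs in test_result.
anno$Probe_pfx <- sub("_.*$", "", anno$Manifest_ID)

# Keep one annotation row per collapsed ID (the first replicate probe).
anno <- anno[!duplicated(anno$Probe_pfx), ]

# Look up each tested probe, preserving the order of test_result.
idx <- match(test_result$Probe_ID, anno$Probe_pfx)
annotated <- data.frame(
  test_result,
  anno[idx, c("seqnames", "start", "gene")],
  row.names = NULL
)

print(annotated)
```

Using `match()` instead of `merge()` keeps the rows in the same order as test_result, so the top probes stay at the top. Probes missing from the annotation table come back with NA in the added columns.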
sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] parallel stats4 stats graphics grDevices
[6] utils datasets methods base

other attached packages:
[1] pals_1.9
[2] broom_1.0.6
[3] RColorBrewer_1.1-3
[4] limma_3.60.4
[5] shiny_1.9.1.9000
[6] IlluminaHumanMethylationEPICv2manifest_1.0.0
[7] IlluminaHumanMethylationEPICv2anno.20a1.hg38_1.0.0
[8] FlowSorted.Blood.EPIC_2.8.0
[9] knitr_1.47
[10] sesame_1.22.2
[11] sesameData_1.22.0
[12] ExperimentHub_2.12.0
[13] AnnotationHub_3.12.0
[14] BiocFileCache_2.12.0
[15] dbplyr_2.5.0
[16] grateful_0.2.4
[17] lubridate_1.9.3
[18] forcats_1.0.0
[19] stringr_1.5.1
[20] dplyr_1.1.4
[21] purrr_1.0.2
[22] readr_2.1.5
[23] tidyr_1.3.1
[24] tibble_3.2.1
[25] ggplot2_3.5.1
[26] tidyverse_2.0.0
[27] shinyMethyl_1.41.1
[28] minfi_1.50.0
[29] bumphunter_1.46.0
[30] locfit_1.5-9.10
[31] iterators_1.0.14
[32] foreach_1.5.2
[33] Biostrings_2.72.1
[34] XVector_0.44.0
[35] SummarizedExperiment_1.34.0
[36] Biobase_2.64.0
[37] MatrixGenerics_1.16.0
[38] matrixStats_1.3.0
[39] GenomicRanges_1.56.1
[40] GenomeInfoDb_1.40.1
[41] IRanges_2.38.1
[42] S4Vectors_0.42.1
[43] BiocGenerics_0.50.0

loaded via a namespace (and not attached):
[1] splines_4.4.1 later_1.3.2
[3] BiocIO_1.14.0 bitops_1.0-8
[5] filelock_1.0.3 preprocessCore_1.66.0
[7] XML_3.99-0.17 lifecycle_1.0.4
[9] lattice_0.22-6 MASS_7.3-61
[11] base64_2.0.1 scrime_1.3.5
[13] backports_1.5.0 magrittr_2.0.3
[15] rmarkdown_2.28 yaml_2.3.10
[17] httpuv_1.6.15 doRNG_1.8.6
[19] askpass_1.2.0 mapproj_1.2.11
[21] DBI_1.2.3 maps_3.4.2
[23] abind_1.4-5 zlibbioc_1.50.0
[25] quadprog_1.5-8 RCurl_1.98-1.16
[27] rappdirs_0.3.3 GenomeInfoDbData_1.2.12
[29] genefilter_1.86.0 annotate_1.82.0
[31] DelayedMatrixStats_1.26.0 codetools_0.2-20
[33] DelayedArray_0.30.1 xml2_1.3.6
[35] tidyselect_1.2.1 UCSC.utils_1.0.0
[37] beanplot_1.3.1 illuminaio_0.46.0
[39] GenomicAlignments_1.40.0 jsonlite_1.8.8
[41] wheatmap_0.2.0 multtest_2.60.0
[43] survival_3.7-0 tools_4.4.1
[45] Rcpp_1.0.13 glue_1.7.0
[47] SparseArray_1.4.8 xfun_0.45
[49] HDF5Array_1.32.1 withr_3.0.1
[51] BiocManager_1.30.24 fastmap_1.2.0
[53] rhdf5filters_1.16.0 fansi_1.0.6
[55] openssl_2.2.1 digest_0.6.37
[57] timechange_0.3.0 R6_2.5.1
[59] mime_0.12 colorspace_2.1-1
[61] dichromat_2.0-0.1 RSQLite_2.3.7
[63] utf8_1.2.4 generics_0.1.3
[65] data.table_1.16.0 rtracklayer_1.64.0
[67] httr_1.4.7 S4Arrays_1.4.1
[69] pkgconfig_2.0.3 gtable_0.3.5
[71] blob_1.2.4 siggenes_1.78.0
[73] htmltools_0.5.8.1 scales_1.3.0
[75] png_0.1-8 rstudioapi_0.16.0
[77] tzdb_0.4.0 reshape2_1.4.4
[79] rjson_0.2.22 nlme_3.1-165
[81] curl_5.2.2 cachem_1.1.0
[83] rhdf5_2.48.0 BiocVersion_3.19.1
[85] AnnotationDbi_1.66.0 restfulr_0.0.15
[87] GEOquery_2.72.0 pillar_1.9.0
[89] grid_4.4.1 reshape_0.8.9
[91] vctrs_0.6.5 promises_1.3.0
[93] xtable_1.8-4 evaluate_0.24.0
[95] GenomicFeatures_1.56.0 cli_3.6.3
[97] compiler_4.4.1 Rsamtools_2.20.0
[99] rlang_1.1.4 crayon_1.5.3
[101] rngtools_1.5.2 nor1mix_1.3-3
[103] mclust_6.1.1 plyr_1.8.9
[105] stringi_1.8.4 BiocParallel_1.38.0
[107] munsell_0.5.1 Matrix_1.7-0
[109] hms_1.1.3 sparseMatrixStats_1.16.0
[111] bit64_4.0.5 Rhdf5lib_1.26.0
[113] KEGGREST_1.44.1 statmod_1.5.0
[115] memoise_2.0.1 bit_4.0.5
zhouwanding, I have a question regarding DML analysis. I understand that the Est_ columns describe the difference in methylation between a treatment group and the reference (in this case, the difference between the Negative treatment and the reference). In my analysis, I would like to know the actual methylation levels at the sites that are differentially methylated in the treatment groups (low and high in my case) versus controls. Is there an easy way to get the methylation levels for each differentially methylated site? Or do I need to go back to the initial normalized betas (the getBetas output after the appropriate preprocessing) and look up the significant CpG sites that came up in the DML analysis?
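Going back to the betas is probably the most direct route. As a minimal sketch on toy data: subset the beta matrix to the significant probes and average within each group. The probe names, sample names, group labels, and `sig_probes` here are all hypothetical. With a SummarizedExperiment, the matrix and grouping would presumably come from `assay(se)` and `colData(se)$Treatment`, assuming the assay holds the betas. The last step illustrates that, for an ordinary linear model with a single categorical covariate, the intercept plus a group's estimate equals that group's mean; whether this carries over exactly to DML output depends on DML fitting the same kind of model, which is an assumption here.

```r
# Toy beta matrix: probes in rows, samples in columns (hypothetical values).
betas <- rbind(
  cgA = c(0.80, 0.90, 0.20, 0.10),
  cgB = c(0.50, 0.52, 0.49, 0.51),
  cgC = c(0.10, 0.20, 0.70, 0.80)
)
colnames(betas) <- paste0("S", 1:4)

# Hypothetical grouping; "Control" is the reference level (first alphabetically).
treatment <- factor(c("Control", "Control", "Low", "Low"))

# Hypothetical list of significant probes, e.g. filtered from test_result
# by a p-value or effect-size threshold.
sig_probes <- c("cgA", "cgC")

# Keep only the significant probes; drop = FALSE keeps a matrix for one probe.
sig_betas <- betas[sig_probes, , drop = FALSE]

# Mean beta per group for every significant probe (probes x groups).
group_means <- t(apply(sig_betas, 1, function(b) {
  tapply(b, treatment, mean, na.rm = TRUE)
}))
print(group_means)

# For one probe, an ordinary linear model reproduces the same numbers:
# intercept = reference-group mean, intercept + estimate = other group's mean.
fit <- lm(betas["cgA", ] ~ treatment)
print(coef(fit))
print(sum(coef(fit)))  # fitted mean methylation in the "Low" group
```

Applied to the first probe in the original post, the same arithmetic would give 0.8529257 + (-0.6830048) = 0.1699209 as the fitted level in the Negative group, under the assumption stated above.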