Question

Methylation EPIC annotation - the logFC values represented as boxplots

0

Entering edit mode

jonellevillar • 0

@jonellevillar-15405

Last seen 7 months ago

Bergen

Dear Friends,

I am working with microarray data looking at the effect of three antipsychotic drugs on methylation patterns. I ran the results of my model from limma in IlluminaHumanMethylationEPICanno.ilm10b2.hg19 to get the genes associated with the probes.

results$gene = anno.epic$UCSC_RefGene_Name

My question pertains to the logFC values of the probes. These values were then visualized in box plots comparing the logFC of the three different drugs. The values range from -20 to 28, and yet the box plots are all quite similar. I have inserted a jpg of the screen shot.

Second question about the use of topTreat instead of topTable if that makes any difference?

contrast.matrix = makeContrasts(Olan = Ola, levels = mod)

fitContrasts = contrasts.fit(fit,contrast.matrix)
eb = treat(fitContrasts)
results = topTreat(eb,adjust="fdr",number=length(eb$coefficients))

Thanks for your help.

Jonelle Villar, RN

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] limma_3.34.9                                        IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
 [3] minfi_1.24.0                                        bumphunter_1.20.0                                  
 [5] locfit_1.5-9.1                                      iterators_1.0.9                                    
 [7] foreach_1.4.4                                       Biostrings_2.46.0                                  
 [9] XVector_0.18.0                                      SummarizedExperiment_1.8.1                         
[11] DelayedArray_0.4.1                                  matrixStats_0.53.1                                 
[13] Biobase_2.38.0                                      GenomicRanges_1.30.3                               
[15] GenomeInfoDb_1.14.0                                 IRanges_2.12.0                                     
[17] S4Vectors_0.16.0                                    BiocGenerics_0.24.0                                
[19] reshape_0.8.7                                       ggplot2_2.2.1                                      

loaded via a namespace (and not attached):
 [1] httr_1.3.1               tidyr_0.8.0              nor1mix_1.2-3            RMySQL_0.10.14          
 [5] bit64_0.9-7              splines_3.4.3            assertthat_0.2.0         doRNG_1.6.6             
 [9] blob_1.1.1               GenomeInfoDbData_1.0.0   Rsamtools_1.30.0         yaml_2.1.18             
[13] progress_1.1.2           pillar_1.2.1             RSQLite_2.1.0            lattice_0.20-35         
[17] glue_1.2.0               quadprog_1.5-5           digest_0.6.15            RColorBrewer_1.1-2      
[21] colorspace_1.3-2         preprocessCore_1.40.0    Matrix_1.2-12            plyr_1.8.4              
[25] GEOquery_2.46.15         pkgconfig_2.0.1          siggenes_1.52.0          XML_3.98-1.10           
[29] biomaRt_2.34.2           genefilter_1.60.0        zlibbioc_1.24.0          purrr_0.2.4             
[33] xtable_1.8-2             scales_0.5.0             BiocParallel_1.12.0      annotate_1.56.2         
[37] tibble_1.4.2             openssl_1.0.1            beanplot_1.2             pkgmaker_0.22           
[41] GenomicFeatures_1.30.3   lazyeval_0.2.1           survival_2.41-3          magrittr_1.5            
[45] mclust_5.4               memoise_1.1.0            nlme_3.1-131.1           MASS_7.3-49             
[49] xml2_1.2.0               data.table_1.10.4-3      tools_3.4.3              registry_0.5            
[53] prettyunits_1.0.2        hms_0.4.2                stringr_1.3.0            munsell_0.4.3           
[57] rngtools_1.2.4           bindrcpp_0.2             AnnotationDbi_1.40.0     base64_2.0              
[61] compiler_3.4.3           rlang_0.2.0              grid_3.4.3               RCurl_1.95-4.10         
[65] bitops_1.0-6             gtable_0.2.0             codetools_0.2-15         multtest_2.34.0         
[69] DBI_0.8                  R6_2.2.2                 illuminaio_0.20.0        GenomicAlignments_1.14.2
[73] dplyr_0.7.4              rtracklayer_1.38.3       bit_1.1-12               bindr_0.1.1             
[77] readr_1.1.1              stringi_1.1.7            Rcpp_0.12.16

>

probe to gene id logFC illuminahumanmethylationepicanno.ilm10b2.hg19 limma contrast matrix • 1.7k views

ADD COMMENT • link updated 6.9 years ago by James W. MacDonald 68k • written 6.9 years ago by jonellevillar • 0

score 0 · Answer 1 · 2018-04-03

0

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 1 day ago

United States

There are both statistical and biological rationales for using something like minfi or DSS to analyze methylation data, so you might consider looking at the vignettes for those packages.

If you really want to use limma to do the analysis, you should ensure that you are using M-values (logit transform of the methylation proportions) rather than the methylation proportions themselves. You don't show code, so it's impossible for anybody to know what you are doing, but log fold changes as large as you report aren't really consistent with what one would normally expect.

ADD COMMENT • link 6.9 years ago James W. MacDonald 68k

0

Entering edit mode

As far as I know, DSS is only for sequencing data. It doesn't work on methylation microarrays.

The missMethyl package is designed for Illumina methylation microarrays. I think minfi and missMethyl both call out to limma.

ADD REPLY • link 6.9 years ago Gordon Smyth 52k

0

Entering edit mode

Thank you for the suggestion about minfi and missMethyl. Before proceeding, I do have a limma question. I am looking at differential methylation patterns in subjects taking three different antipsychotic drugs. Today I tried a sample of 160 cases and while the logFC values were better than from yesterday's post, the p-values are seemingly absurd. I am posting the R code and results below. (AP1 pertains to the drugs). Any suggestions or ideas about source of possible error?

##model####
mod = model.matrix(~0 + AP1 + AgeAtSamplingDate + factor(Gender) + factor(smoker..) +
Array + Platte, data = target)

##contrast matrix####
contrast.matrix <- makeContrasts(Olanzapine = Ola, Quetiapine = Que, Aripiprazole = Ari, levels = mod)

## fit model####
fit = lmFit(data, mod)

fitContrasts <- contrasts.fit(fit,contrasts=contrast.matrix)
eb = treat(fitContrasts)
results = topTreat(eb,adjust="fdr",number=length(eb$coefficients))
head(results)

> head(results)
               logFC   AveExpr         t       P.Value     adj.P.Val
cg23526353 -4.980876 -4.908205 -77.46641 3.672367e-118 2.849841e-112
cg25732282 -5.071258 -5.112430 -76.87088 1.072504e-117 4.161439e-112
cg27533872 -5.295036 -5.223324 -75.75520 8.160570e-117 1.931234e-111
cg27527736 -4.967852 -5.013292 -75.64678 9.954519e-117 1.931234e-111
cg03235761 -4.857537 -4.830610 -74.13935 1.622920e-115 2.518847e-110
cg05219517 -5.180994 -5.137122 -73.49487 5.441894e-115 7.038391e-110

Kind regards.

Jonelle Villar, RN

ADD REPLY • link 6.9 years ago jonellevillar • 0

1

Entering edit mode

A contrast compares two groups and tests for a difference in methylation between the groups. What you are doing is testing to see if the average methylation for each of your different groups is different from zero, which in most cases it will be.

The limma User's Guide covers linear models and contrasts pretty thoroughly. If you are planning to use that package, it's in your best interest to read the guide that comes with it.

ADD REPLY • link 6.9 years ago James W. MacDonald 68k