Methylation data analysis

aldea85 • 0

Hi, I am new to this field, so some basic questions follow. I want to analyze published methylation data together with my own data. I downloaded beta values from a 450K array; no IDAT files were available, only the normalized beta values. My own data come from an EPIC array, which I normalized and processed into normalized beta values. I was able to intersect the probes shared by the two array types and merge them into a data.frame containing the CpG identifier followed by 18 sample columns (10 normal + 8 pathological). I don't know the best way to analyze this matrix. I tried limma but got stuck making the contrast matrix. Any advice? The code is below. More importantly, is this the best way to analyze my data, or should I take another approach? I'm worried that the normalization is not correct, or that intersecting the two datasets by CpG identifier is not the right way to do it. Thanks a lot!

normal_cols <- colnames(your_data)[2:11]

patho_cols <- colnames(your_data)[12:19]

design <- model.matrix(~ 0 + factor(rep(c("Normal", "Patho"), each = length(normal_cols))), data = your_data)

colnames(design) <- make.names(colnames(design))

contrast.matrix <- makeContrasts(Patho - Normal, levels = design)

## Error in eval(ej, envir = levelsenv) : object 'Patho' not found

MethylationArrayData methylationArrayAnalysis • 665 views

ADD COMMENT • link 5 months ago aldea85 • 0

score 1 · Answer 1 · 2023-11-14

James W. MacDonald 65k

The colnames of your design are not 'Patho' and 'Normal'. You'll need to change that if you want makeContrasts to work correctly.

colnames(design) <- c("Normal","Patho")
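As a hedged aside on why the names did not match: model.matrix labels each column with the term in the formula followed by the level, so with the whole factor(rep(...)) expression in the formula, make.names yields a long mangled string rather than 'Patho'. Note also that rep(..., each = length(normal_cols)) produces 20 labels for 18 samples. A minimal base-R sketch, assuming the 10 Normal columns come first and the 8 Patho columns follow (the object name `group` is hypothetical, not from the original code):

```r
# Assumption: 10 Normal samples followed by 8 Patho samples, in column order.
group <- factor(rep(c("Normal", "Patho"), times = c(10, 8)),
                levels = c("Normal", "Patho"))

# No-intercept design: one indicator column per group.
design <- model.matrix(~ 0 + group)
colnames(design)   # "groupNormal" "groupPatho"

# Rename the columns so that makeContrasts(Patho - Normal, ...) can find them.
colnames(design) <- levels(group)
dim(design)        # one row per sample, one column per group
```

Building the factor first keeps the column names short and makes the group sizes explicit, which avoids a design with more rows than there are samples.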

You don't say, but if each dataset is either all Normal or all Patho, then status and batch are completely confounded, and the analysis will not be able to distinguish technical differences from biological differences. In my experience the technical differences will dominate, and you are likely wasting your time. But YMMV.
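To make the confounding concrete, here is a small base-R sketch. It assumes that all 10 Normal samples were run on the 450K array and all 8 Patho samples on the EPIC array; the variable names `status` and `platform` are hypothetical.

```r
# Assumption: group and array platform line up perfectly.
status   <- factor(rep(c("Normal", "Patho"), times = c(10, 8)))
platform <- factor(rep(c("450K", "EPIC"),    times = c(10, 8)))

# Cross-tabulation: the off-diagonal cells are empty.
table(status, platform)

# Trying to adjust for platform gives a rank-deficient design:
design <- model.matrix(~ status + platform)
ncol(design)      # 3 columns (intercept, statusPatho, platformEPIC)
qr(design)$rank   # rank 2: the last two columns are identical
```

Because the status and platform columns are identical, no linear model can estimate both effects, so any Patho vs Normal difference is also a 450K vs EPIC difference.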

ADD COMMENT • link 5 months ago James W. MacDonald 65k

Thanks for your help!!! I've been stuck for a while and yes, I definitely feel like I'm wasting my time. I haven't found a good tutorial that really solves my problem. Let me explain a little more. We generated an EPIC (patho) dataset of 8 samples, for which we obtained beta values (after normalization, preprocessing, etc. using minfi). We have 8 matrices with CpG identifier, beta, and p-value. We want to compare this dataset with similar normal tissue profiled on the 450K array (normal). For this we have 10 matrices with CpG identifier and beta. I don't know the best way to get the DMRs. As I mentioned, I was able to generate one big table in which I merged the 18 tables on their common CpGs. Is this the best way to proceed? Also, my script is not working: I am getting zero DMR sites or errors, probably because of what you said. Thanks a lot again! If you know of a tutorial or video, I'd be happy to take a look! I'm enjoying this analysis, but at some point it gets frustrating.

normal_cols <- colnames(common_matrix)[2:11]

patho_cols <- colnames(common_matrix)[12:19]

design <- model.matrix(~ 0 + factor(rep(c("Normal", "Patho"), each = length(normal_cols))), data = common_matrix)

colnames(design) <- c("Normal", "Patho")

contrast.matrix <- makeContrasts(Patho - Normal, levels = design)

fit <- lmFit(common_matrix, design)

fit <- eBayes(fit)

results <- decideTests(fit, contrast = contrast.matrix)

DE_sites <- rownames(results)[results[, 1] == "D"]

print(DE_sites)

ADD REPLY • link 5 months ago aldea85 • 0

To reiterate, you won't be able to say whether the differences are biological or technical, and in my experience they are going to be mostly technical. But if you still want to go forward, what you are doing so far is fine, except that you probably don't want to use decideTests; just use topTable instead.

tt <- topTable(fit, 1, Inf, p.value = 0.05)
## dollars to donuts this is a large number
nrow(tt)

But do note that this will provide differentially methylated positions, not regions. You could use the DMRcate or DSS or minfi packages if you want regions.
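For reference, a minimal end-to-end sketch of the position-level (DMP) analysis with limma, run on simulated data rather than the real table. Several things here are assumptions and not part of the thread: the simulated `beta` matrix, the CpG identifiers used as row names instead of a first column, the 10 Normal then 8 Patho column order, and the conversion of beta values to M-values (a common choice before fitting linear models). One point worth checking in any real script: with a no-intercept design, the contrast has to be applied with contrasts.fit before eBayes, otherwise coefficient 1 is the Normal group mean and not the Patho vs Normal difference.

```r
library(limma)

set.seed(1)
n_cpg <- 1000

# Hypothetical stand-in for the merged table: probes in rows, samples in columns,
# CpG identifiers as row names (not as a data column).
beta <- matrix(rbeta(n_cpg * 18, 20, 20), nrow = n_cpg,
               dimnames = list(paste0("cg", seq_len(n_cpg)),
                               c(paste0("Normal", 1:10), paste0("Patho", 1:8))))

# Raise the first 50 probes in the Patho samples so there is a signal to detect.
beta[1:50, 11:18] <- pmin(beta[1:50, 11:18] + 0.3, 0.99)

# Convert beta values to M-values (logit base 2).
mvals <- log2(beta / (1 - beta))

# One design column per group, named so makeContrasts can find them.
group <- factor(rep(c("Normal", "Patho"), times = c(10, 8)),
                levels = c("Normal", "Patho"))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Fit, apply the Patho - Normal contrast, then moderate the statistics.
fit <- lmFit(mvals, design)
contrast.matrix <- makeContrasts(PathoVsNormal = Patho - Normal, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# Differentially methylated positions at an adjusted p-value of 0.05.
tt <- topTable(fit2, coef = "PathoVsNormal", number = Inf, p.value = 0.05)
nrow(tt)
```

On real data the same steps apply once the CpG column has been moved into the row names, but the confounding caveat above still holds: the resulting list cannot separate platform effects from biology.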

ADD REPLY • link 5 months ago James W. MacDonald 65k

I've been struggling to find a solution for working with beta values in DNA methylation analysis. My goal is to identify Differentially Methylated Regions (DMRs) using packages like DMRcate or minfi. However, most scripts seem to be tailored for IDAT data, and there's limited guidance on incorporating beta values directly.

In essence, I'm unsure about how to seamlessly integrate beta values into scripts designed for IDAT data. I've encountered similar concerns in forums, but no answer so far.

I would greatly appreciate any advice, scripts, or insights. Anything would be greatly appreciated!!! Thank you for your help!

ADD REPLY • link 5 months ago aldea85 • 0

You just need to pick up the tutorials at the point where the beta matrix has been obtained:

minfi, section 7: https://www.bioconductor.org/help/course-materials/2015/BioC2015/methylation450k.html

DMRcate, starting at the cpg.annotate step: https://bioconductor.org/packages/devel/bioc/vignettes/DMRcate/inst/doc/DMRcate.pdf

ADD REPLY • link 5 months ago Basti ▴ 780

Awesome! Thanks a lot! I'll take a look at it! (And sorry for the naive question!)

ADD REPLY • link 5 months ago aldea85 • 0