Search
Question: RMS question in MEDIPS
0
4 months ago by
enrique.yeh0 wrote:

Hello everyone,

I'm trying to use the tool MEDIPS to find the RMS value of each genes.

There are 6 samples (GD16003-GD16008), 2 of them are hyper-methylated (GD16005 & GD16008) and comfirmed in RPKM value, but these two samples are not having the highest value in RMS value.

Does anyone knows why the hyper-methylated samples not having the highest RMS value?

Mean value are listed below.

 CpG rpkm rms CpG GD16003 0.676719 0.2192527 CpG GD16004 0.526604 0.202631 CpG GD16005 1.492834 0.1860266 CpG GD16006 0.328434 0.1620868 CpG GD16007 0.395263 0.233723 CpG GD16008 1.40204 0.1974412 Gene rpkm rms Gene GD16003 0.949448 0.4457013 Gene GD16004 0.958066 0.4601055 Gene GD16005 1.179833 0.4229319 Gene GD16006 0.887107 0.4300184 Gene GD16007 0.803318 0.4130587 Gene GD16008 1.131659 0.428664

And here's the Rscript,


library(BSgenome)
library(MEDIPS)
library(BSgenome.Mmusculus.UCSC.mm10)
BSgenome="BSgenome.Mmusculus.UCSC.mm10"
uniq=1e-3
extend=300
shift=0
ws=300
prcBamFile <- "GD16007.bam"
Input <- MEDIPS.createSet(file=prcBamFile, BSgenome=BSgenome, extend=extend, shift=shift, uniq=uniq, window_size=ws)
CS = MEDIPS.couplingVector(pattern="CG", refObj=Input)
mr.edgeR <- MEDIPS.meth(MSet1 = Input, CSet = CS, p.adj="bonferroni", diff.method="edgeR", CNV=FALSE, MeDIP=TRUE, minRowSum=10, diffnorm="tmm")
mr.edgeR.roi <- MEDIPS.selectROIs(results=mr.edgeR, rois=ROI, columns=NULL, summarize="avg")
write.table(mr.edgeR.roi, file="GD16007_result_roiGene.tmp", quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)


modified 4 months ago by Lukas Chavez470 • written 4 months ago by enrique.yeh0
0
4 months ago by
Lukas Chavez470
USA/La Jolla/UCSD
Lukas Chavez470 wrote:

Hi enrique.yeh,

thanks for sharing these results! I agree that rms values of the hyper-methylated samples should reflect the elevated signals you see by the rpkm values. Instead of looking at the mean values only, I am wondering how the genome wide distribution/ histogram of rms values compare between the samples? If they are similar across the samples including the hyper-methytlated samples, then its really only due to the way MEDIPS shifts the data range of rms values (which are basically nothing else than the observed/ expected ratio) into the interval [0,1]. While MEDIPS will make the rpkm values of genomic regions with different CpG densities more comparable within a sample (--> relative methylation values), transformation into absolute methylation values (such as 20% or 80% methylation) is conceptually weak. Therefore, we have developed a much better R/Bioconductor tool qsea (http://bioconductor.org/packages/release/bioc/html/qsea.html), which does a much better job. It would be great, if you can apply qsea to your data and let us know, if it works on your data!

Please note, MEDIPS performs differential coverage analysis based on count data (by emplying edgeR) and does not depend on rms values.

All the best,

Lukas