Question: Should I include PCs or MDS dimensions as covariates in a methylation study?
2.0 years ago by
ZoeChing0 wrote:

Hi there. I'm analyzing HM450K array methylation data with two groups (group1 vs. group2). I want to compare the two groups to find differentially methylated probes (DMPs), so I ran a limma analysis on this dataset.

But when I discussed the pipeline with someone who does methylation data analysis at a bioinformatics company, he told me that he adds MDS1 and MDS2 from multidimensional scaling as covariates when finding DMPs.

We have GWAS data for these samples. The two groups were mixed together in the PCA plot based on the GWAS data. What puzzles me is that the MDS1 and MDS2 he included as covariates were generated from the methylation data. I thought that if we want to adjust for the potential effect of population stratification, we should use the PCs from the GWAS data. Some previous studies use an MDS or PCA plot to test whether methylation data carry a strong signature of the condition of interest. So I worry that adding MDS1 and MDS2 as covariates may reduce the difference in DNA methylation between the two groups. I ran the pipeline twice; the only difference between the two runs is the covariates. Probes were considered differentially methylated if the BH-adjusted P-value was < 0.05.

    1st: covariates: age + gender + array; differentially methylated probes (DMPs): 152,688 probes

    2nd: covariates: age + gender + array + MDS1 + MDS2 (from methylation data); DMPs: 48,046 probes

    The two results differ greatly. We got far fewer DMPs the second time, as I expected. I used the 1st pipeline in my analysis, and I'm wondering whether that is correct.

So I'm still confused about this question. Any suggestions would be greatly appreciated!
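
For readers who want to reproduce the comparison, the two runs described above might be set up roughly as follows in limma. This is only a sketch on simulated data: the objects `pheno` and `mvals`, the column names, the group sizes, and the helper `count_dmps` are all assumptions made for illustration, not the poster's actual code or data.

```r
library(limma)

set.seed(1)
n_per_group <- 10
n_probes <- 200

# Hypothetical sample sheet; all column names are assumptions
pheno <- data.frame(
  group  = factor(rep(c("group1", "group2"), each = n_per_group)),
  age    = round(runif(2 * n_per_group, 30, 70)),
  gender = factor(rep(c("F", "M", "M", "F"), length.out = 2 * n_per_group)),
  array  = factor(rep(c("a1", "a2"), times = n_per_group))
)

# Simulated M-values (probes x samples); the first 50 probes differ between groups
mvals <- matrix(rnorm(n_probes * nrow(pheno)), nrow = n_probes)
shifted <- pheno$group == "group2"
mvals[1:50, shifted] <- mvals[1:50, shifted] + 2

# MDS coordinates computed from the methylation data itself
mds <- cmdscale(dist(t(mvals)), k = 2)
pheno$MDS1 <- mds[, 1]
pheno$MDS2 <- mds[, 2]

# Hypothetical helper: fit one design and count probes with BH-adjusted P < 0.05
count_dmps <- function(formula) {
  design <- model.matrix(formula, data = pheno)
  fit <- eBayes(lmFit(mvals, design))
  tab <- topTable(fit, coef = "groupgroup2", number = Inf, adjust.method = "BH")
  sum(tab$adj.P.Val < 0.05)
}

n1 <- count_dmps(~ group + age + gender + array)                # 1st run
n2 <- count_dmps(~ group + age + gender + array + MDS1 + MDS2)  # 2nd run
```

Comparing `n1` and `n2` on your own data shows how much of the group signal the methylation-derived MDS dimensions absorb.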



limma methylation R mds covariate • 486 views
Answer: Should I include PCs or MDS dimensions as covariates in a methylation study?
2.0 years ago by
Aaron Lun23k
Cambridge, United Kingdom
Aaron Lun23k wrote:

If you have strong differences between groups in your data, the first two MDS dimensions will correspond to the differences between groups. Using them as covariates will obviously reduce power to reject the null hypothesis, because genuine differences between groups are modelled by the covariates under the null model. If you want to empirically account for population structure, you're better off using methods like RUVnormalize or sva. You could also use PCs obtained from the GWAS data, but this will only account for variation due to genotype.
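
The power loss described above can be illustrated with a small base-R simulation. It is a sketch under stated assumptions: the sample size, the number of affected probes, and the effect size are invented, and a plain `lm` fit on a single probe stands in for the limma model. When MDS1 is computed from data with a strong group effect, it is nearly collinear with the group label, which inflates the standard error of the group coefficient.

```r
set.seed(42)
n <- 20
group <- rep(c(0, 1), each = n / 2)

# Simulated M-values (probes x samples): 200 of 500 probes shifted in group 2
mvals <- matrix(rnorm(500 * n), nrow = 500)
mvals[1:200, group == 1] <- mvals[1:200, group == 1] + 2

# First MDS dimension computed from the methylation data itself
mds1 <- cmdscale(dist(t(mvals)), k = 1)[, 1]

# How strongly does MDS1 track the group label?
r <- cor(mds1, group)

# Variance inflation factor for the group coefficient once MDS1 is added
vif <- 1 / (1 - r^2)

# Standard error of the group effect for one affected probe,
# without and with MDS1 as a covariate
probe <- mvals[1, ]
se_without <- summary(lm(probe ~ group))$coefficients["group", "Std. Error"]
se_with    <- summary(lm(probe ~ group + mds1))$coefficients["group", "Std. Error"]
```

Checking `cor(MDS1, group)` on real data is a quick way to see whether the MDS dimensions are mostly capturing the contrast of interest before deciding to adjust for them.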


Hi Aaron. There is no difference in age or gender between the groups. All of our samples are from the same population, and the PCA plot of the GWAS data shows no population stratification in our samples. So I think the first two MDS dimensions may correspond to the phenotype of interest.

I'm going to remove MDS1 and MDS2 from the covariates in the analysis. Thanks for your reply. :)


written 2.0 years ago by ZoeChing0