I'm working on a dataset of 100 samples run on the Illumina Infinium HumanMethylation450 BeadChip array.
I performed PCA on all samples using all QC-passed probes, before and after normalization, and used linear regression to test each PC for association with independent experimental variables: BCD batch (2 batches), Experiment batch (3 batches), Sentrix ID (9 chips), Sentrix position, cell components, and sample groups.
Even after normalization, a large amount of the variation in my data is associated with BCD batch, Experiment batch, and Sentrix ID, and I would like to adjust for these batch effects before proceeding to differential methylation analysis.
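For reference, the PC-versus-covariate check described above can be sketched roughly as follows. This is a minimal illustration in base R, not my actual pipeline: the matrix is random toy data with an artificial batch shift, and the column names `BCD_batch` and `Exp_batch` are assumed names for the sample sheet columns.

```r
# Toy stand-in for the real data: 200 probes x 24 samples of random values.
set.seed(1)
n_samples <- 24
mat <- matrix(rnorm(200 * n_samples), nrow = 200)

# Hypothetical sample sheet; the column names are assumptions.
pheno <- data.frame(
  BCD_batch = factor(rep(c("B1", "B2"), each = 12)),
  Exp_batch = factor(rep(c("E1", "E2", "E3"), times = 8))
)

# Inject an artificial shift into BCD batch B2 so there is an effect to detect.
mat[, pheno$BCD_batch == "B2"] <- mat[, pheno$BCD_batch == "B2"] + 2

# PCA with samples as rows (probes are the variables).
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# Regress one PC on one covariate and return the overall F-test p-value.
pc_pvalue <- function(pc_scores, covariate) {
  fit <- lm(pc_scores ~ covariate)
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Matrix of p-values: one row per PC, one column per covariate.
n_pcs <- 5
pvals <- sapply(pheno, function(v) {
  sapply(seq_len(n_pcs), function(i) pc_pvalue(pca$x[, i], v))
})
rownames(pvals) <- paste0("PC", seq_len(n_pcs))
print(signif(pvals, 3))
```

Small p-values in a column indicate that the corresponding variable is associated with that PC; in the toy data only `BCD_batch` carries a real effect.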
I have a few questions about applying ComBat.mc (ENmix package) for this.
1) Can I combine the three batch variables into a single factor, as follows?
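library(minfi)   # provides read.metharray.sheet()
library(ENmix)   # provides ComBat.mc()
baseDir <- "E:/IDPP3_450K_PILOT/Pilot_Analysis_12-2-16/450KPilot_Edit"
targets <- read.metharray.sheet(baseDir)
beta <- read.csv("beta.csv", row.names = 1, check.names = FALSE)
beta <- as.matrix(beta)
# batchcom: factor combining the three batch variables (one level per combination)
beta_combat <- ComBat.mc(beta, batchcom, nCores = 8, mod = NULL)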
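One way the combined factor could be built is with base R's `interaction()`. The sketch below uses a made-up eight-sample sheet, and the column names (`BCD_batch`, `Exp_batch`, `Sentrix_ID`, `Group`) are assumptions rather than the real column names in `targets`.

```r
# Hypothetical sample sheet; in practice these columns would come from `targets`.
targets_toy <- data.frame(
  BCD_batch  = c("B1", "B1", "B1", "B1", "B2", "B2", "B2", "B2"),
  Exp_batch  = c("E1", "E1", "E2", "E2", "E2", "E2", "E3", "E3"),
  Sentrix_ID = c("S1", "S1", "S2", "S2", "S3", "S3", "S4", "S4"),
  Group      = c("case", "control", "case", "control",
                 "case", "control", "case", "control")
)

# One level per observed combination of the three batch variables;
# drop = TRUE discards combinations that never occur in the data.
batchcom <- interaction(
  targets_toy$BCD_batch,
  targets_toy$Exp_batch,
  targets_toy$Sentrix_ID,
  drop = TRUE
)

# Number of samples in each combined batch.
batch_sizes <- table(batchcom)
print(batch_sizes)

# Cross-tabulate against sample group to see whether batch and group are confounded.
print(table(batchcom, targets_toy$Group))
```

Checking the batch sizes first seems worthwhile, since combining three variables can produce many small batches, and the cross-tabulation shows whether any combined batch contains only one sample group.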
2) Or should I run ComBat.mc multiple times, adjusting for the first batch variable, then the second, and so on? If so, will the order affect the batch adjustment?
beta_combat1 <- ComBat.mc(beta, batch1, nCores = 8, mod = NULL)
beta_combat2 <- ComBat.mc(beta_combat1, batch2, nCores = 8, mod = NULL)
beta_combat3 <- ComBat.mc(beta_combat2, batch3, nCores = 8, mod = NULL)
Should I be using ComBat from the sva package instead of ComBat.mc from the ENmix package?
3) Are M-values preferred over beta values for these adjustments?
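For context, the standard logit relationship between the two scales is M = log2(beta / (1 - beta)), with inverse beta = 2^M / (2^M + 1). A minimal base R sketch of the conversion follows; the helper names `beta_to_m` and `m_to_beta` and the clamping constant `eps` are my own assumptions, not functions from ENmix or sva.

```r
# Convert beta values to M-values. Betas are clamped to [eps, 1 - eps]
# (an assumed safeguard) so that exact 0 or 1 does not give -Inf or Inf.
beta_to_m <- function(beta, eps = 1e-6) {
  beta <- pmin(pmax(beta, eps), 1 - eps)
  log2(beta / (1 - beta))
}

# Convert M-values back to beta values.
m_to_beta <- function(m) {
  2^m / (2^m + 1)
}

# Toy 2 x 2 matrix of beta values (made-up numbers).
beta_toy <- matrix(c(0.1, 0.5, 0.9, 0.25), nrow = 2)

m_toy <- beta_to_m(beta_toy)      # scale on which an adjustment would be run
beta_back <- m_to_beta(m_toy)     # back-transform after adjustment

print(m_toy)
print(beta_back)
```

If the adjustment were run on M-values, the adjusted matrix could be back-transformed with `m_to_beta()` so that the results stay within the 0 to 1 range.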
Thanks in advance for your help.