Hello
I have made a multi-factorial design for a limma experiment working on mehtylation microarray and I would like to have some feedbacks to make sure that I am doing it properly. Here is the code and the explanations:
M value from the probes:
M <- read.table("M.txt",sep="\t",as.is=T,header=T,row.names=1)
The table tmp contains the information about the samples:
tmp <- read.table(« …..txt", sep="\t", header=T,as.is=T)
#Name Group Mutation
#H1 Asymptomatic WT
#H2 Disease_sub1 WT
#H3 Disease_sub2 WT
#H4 Disease_sub3 WT
#H5 Control WT
#H6 Disease_sub1 Mut
#H7 Disease_sub2 Mut
#H8 Disease_sub3 Mut
We have three groups of samples: The disease that can carry a mutation with 3 different subgroups, the asymptomatic that never carry this mutation and the control group that do not carry the mutation.
I would like to know which probes are differentially methylated due to the mutation so not taking into account the Controls/Asymptomatic that never carry it. A bit like this:
In the disease:
(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub2.WT-Disease_sub2.Mut) - (Disease_sub3.WT-Disease_sub3.Mut)
and
By subgroup:
(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub2.WT-Disease_sub2.Mut)
(Disease_sub2.WT-Disease_sub2.Mut) - (Disease_sub3.WT-Disease_sub3.Mut)
(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub3.WT-Disease_sub3.Mut)
Questions:
1) Since my controls and my asymptomatic patients don’t have the mutation, should I remove them in the first lmFit()? Are they not going to biais the analysis?
2) To analyse the impact of the mutation in the patients with the disease can I do:
tmp <- tmp[!tmp$Group =="Control",]
tmp <- tmp[!tmp$Group =="Asymptomatic",]
TS <- paste(tmp$Group, tmp$Mutation, sep=".")
design <- model.matrix(~0+TS)
fit <- lmFit(M, design)
cont.matrix <- makeContrasts(
sub12=(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub2.WT-Disease_sub2.Mut),
sub23=(Disease_sub2.WT-Disease_sub2.Mut) - (Disease_sub3.WT-Disease_sub3.Mut),
sub13=(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub3.WT-Disease_sub3.Mut),
sub123=(Disease_sub1.WT-Disease_sub1.Mut) - (Disease_sub2.WT-Disease_sub2.Mut) - (Disease_sub3.WT-Disease_sub3.Mut),
levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
or I’m going already too far and for sub123 doing this would be enough:
tmp <- tmp[!tmp$Group =="Control",]
tmp <- tmp[!tmp$Group =="Asymptomatic",]
TS <- paste(tmp$Mutation, sep=".")
design <- model.matrix(~TS)
fit <- lmFit(M, design)
fit <- eBayes(fit)
topTable(fit)
Thanks in advance for your answers!
Are those 8 groups in the table, or are they individual patients? Are the WT/mut disease samples from the same patient? Otherwise, I don't see how you'll have enough residual d.f. if you're going to use a one-way layout.