Dear all,
We want to use the limma package to analyse methylation data from the Illumina 450K
array for continuous traits.
We have 50 samples that were processed on the 450K array. The design consists only of cases,
without any controls. Apart from age and ethnicity, we have more than 5 covariates that we want
to include in the analysis.
Could you give a broad idea (or the R commands) of how to build a model and adjust for the above covariates using limma? We will use normalized and filtered M-values for the statistical analysis.
ID | GroupAge | Ethnicity | GroupSurgery | GroupSeizures | GroupDrug | GroupTreatment | Group5 | Group6 | Group7 |
1 | 1 | CEU | 1 | 2 | 1 | 3 | .. | .. | .. |
2 | 1 | CEU | 2 | 3 | 3 | 3 | .. | .. | .. |
3 | 2 | CEU | 1 | 2 | 2 | 3 | .. | .. | .. |
4 | 1 | CEU | 1 | 2 | 1 | 3 | .. | .. | .. |
5 | 3 | AFR | 3 | 1 | 1 | 2 | .. | .. | .. |
6 | 1 | ASN | 2 | 1 | 2 | 3 | .. | .. | .. |
7 | 1 | ASN | 1 | 2 | 2 | 3 | .. | .. | .. |
8 | 1 | CEU | 3 | 3 | 2 | 2 | .. | .. | .. |
9 | 2 | MEX | 1 | 1 | 1 | 1 | .. | .. | .. |
This is how I would do it as a first attempt. I would be happy if you could have a look at it. I am very thankful for any corrections and suggestions for designing the analysis.
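# header=TRUE is needed so that row.names can refer to the "Sample_ID" column by name
samplesheet <- read.table("samplesheet.txt", header=TRUE, row.names="Sample_ID")
# index by colnames(m) so the covariates follow the sample order of the M-value matrix
age <- as.numeric(samplesheet[colnames(m),]$GroupAge)
ethnicity <- as.factor(samplesheet[colnames(m),]$Ethnicity)
# the coded groups enter as numeric scores here (one slope per covariate);
# as.factor() would instead fit one coefficient per non-reference level
group1 <- as.numeric(samplesheet[colnames(m),]$GroupSurgery)
group2 <- as.numeric(samplesheet[colnames(m),]$GroupSeizures)
group3 <- as.numeric(samplesheet[colnames(m),]$GroupDrug)
group4 <- as.numeric(samplesheet[colnames(m),]$GroupTreatment)
library(limma)
design <- model.matrix(~age+ethnicity+group1+group2+group3+group4)
fit <- lmFit(m.filtered,design)
fit <- eBayes(fit)
topTable(fit, coef="age")
# ethnicity is a factor, so the design has one column per non-reference level
# (e.g. "ethnicityCEU") and none called "ethnicity"; selecting all of them
# gives an F-test for any ethnicity effect
topTable(fit, coef=grep("^ethnicity", colnames(design)))
topTable(fit, coef="group1")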
...
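To check which coefficient names the design actually contains, here is a minimal base R sketch built from the nine example rows in the table above (only age, ethnicity and surgery group are used). The names toy, surgery_num and surgery_fac are made up for illustration and are not part of the real sample sheet. It only shows how model.matrix names its columns for numeric versus factor covariates, which matters because topTable selects coefficients by design column name or index. It is not a recommendation of one coding over the other.

```r
# Toy covariates copied from the nine example rows of the table (hypothetical object names)
toy <- data.frame(
  GroupAge     = c(1, 1, 2, 1, 3, 1, 1, 1, 2),
  Ethnicity    = c("CEU", "CEU", "CEU", "CEU", "AFR", "ASN", "ASN", "CEU", "MEX"),
  GroupSurgery = c(1, 2, 1, 1, 3, 2, 1, 3, 1)
)

age         <- as.numeric(toy$GroupAge)
ethnicity   <- factor(toy$Ethnicity)
surgery_num <- as.numeric(toy$GroupSurgery)  # treated as a numeric score: one slope
surgery_fac <- factor(toy$GroupSurgery)      # treated as categories: one column per non-reference level

design_num <- model.matrix(~ age + ethnicity + surgery_num)
design_fac <- model.matrix(~ age + ethnicity + surgery_fac)

# Column names are what coef= has to match
print(colnames(design_num))
print(colnames(design_fac))

# Indices of all ethnicity columns, usable as coef= for a joint test
eth_cols <- grep("^ethnicity", colnames(design_num))
print(eth_cols)
```

With the full sample sheet, the same colnames(design) check shows the exact names that can be passed to coef=.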
Thank you very much in advance.