Hi all,
I’m analyzing Illumina 450K data of a study of following design
Subject Group Time-point Class
1 Control Baseline 1
1 Control Follow-up 2
2 Control Baseline 1
2 Control Follow-up 3
3 Intervention Baseline 1
3 Intervention Follow-up 2
4 Intervention Baseline 1
4 Intervention Follow-up 2
Individuals of a class 1 are recruited and randomly distributed to control and intervention groups.
At follow-up, individuals in both the groups now belong to two different classes: either 2 or 3
I have several variables like BCD batch, Chip and position, experiment batch, cell count estimates, age and BMI and I would like to adjust my analysis for these covariates.
There are two objectives:
- Identify the differentially methylated CpGs from baseline to follow-up in individuals belonging to different classes independently i.e class 1 vs 2, class 1 vs 3 and class 2 vs 3
- Perform an association analysis of beta values with other phenotypic information like fasting glucose or insulin
I’m attempted to use the following scripts for my analysis
## Regression analysis using lmfit limma for dmCpGs
### Unadjusted analysis
#### Pre-post:1 vs 2 analysis
targets_t2d <- read.csv("B:/IDPP3_450K_PILOT/targets.csv",check.names = FALSE)
SibShip <- factor(targets$SibShip)
Treat <- factor(targets$Treatment, levels=c("1","2"))
design <- model.matrix(~SibShip+Treat)
cleanbeta <- read.csv("B:/450K_PILOT/cleanbeta.csv",row.names=1,check.names=FALSE)
cleanbeta<-as.matrix(cleanbeta)
fit <- lmFit(cleanbeta, design)
fit <- eBayes(fit)
fit_fdr<-topTable(fit, coef="Treat2",adjust="fdr",number=416733)
## Regression analysis using lmfit limma for dmCpGs
### Adjusted for covariates
#### Pre-post:1 vs 2 analysis
targets_t2d <- read.csv("B:/450K_PILOT/targets.csv",check.names = FALSE)
SibShip <- factor(targets$SibShip)
Treat <- factor(targets$Treatment, levels=c("1","2"))
design <-model.matrix(~SibShip+Treat+targets$age+targets$BMI+targets$CD8T+targets$CD4T+targets$NK+targets$Bcell+targets$Mono+targets$Gran)
cleanbeta <- read.csv("B:/450K_PILOT/cleanbeta.csv",row.names=1,check.names=FALSE)
cleanbeta<-as.matrix(cleanbeta)
fitcelladjusted <- lmFit(cleanbeta, design)
fit_celladjusted <- eBayes(fitcelladjusted)
fit_cell_adjusted_fdr <-topTable(fit_celladjusted, coef="Treat2",adjust="fdr",number=416733)
I have the following queries:
1) Is it the right way to do the analysis for objective 1?
P.S: Since I have less sample size (~25 baseline-followup pairs per group), I choose to ignore the effect of group at this stage
2) At a later stage, how can I include the sample group information also in the model and go ahead with the analysis? I'm assuming that including time-point information is not necessary since it is reflected in the class
3) Can I use a mixed regression model like that of lmer function in lme4 package to include sample, group and time-point information as a random variables for objective 2?
4) Which model would be the right one for the above study design:
Beta,age and cell estimates as fixed effects
Subject,group as random effects, I'm not sure of timepoint if it a fixed/random effect, I'm supposing that it is a random effect variable
full.model1=lmer(Fastingglucose~beta+age+cellestimates+(1/subject)+(1/group)+(1/timepoint),REML=false)
full.model2=lmer(Fastingglucose~beta+age+cellestimates+(1/subject*timepoint)+(1/group),REML=false)
full.model3=lmer(Fastinglucose~beta+age+cellestimates+(1+beta/subject)+(1+beta/group)+(1+beta/timepoint), REML=false)
full.model4=lmer(Fastingglucose~beta+age+cellestimates+(1+beta/subject*timepoint)+(1+beta/group),REML=false)
null.model1=lmer(Fastingglucose~age+cellestimates+(1/subject)+(1/group), REML=false)
null.model2=lmer(Fastingglucose~age+cellestimates+(1/subject*timepoint)+(1/group),REML=false)
null.model3=lmer(Fastinglucose~age+cellestimates+(1+beta/subject)+(1+beta/group)+(1+beta/timepoint), REML=false)
null.model4=lmer(Fastingglucose~age+cellestimates+(1+beta/subject*timepoint)+(1+beta/group),REML=false)
Could you also please confirm if any of these above models would account for the paired nature of the sample?
I'm new to R-environment and your assistance would be of immense help.
Thanks for your attention.
Best regards,
Priyanka
Your limma code doesn't match up to your experimental design. What is
SibShip
(Subject, maybe)? What isTreatment
- the Class, or the Group?