Hi,
I'm analysing a longitudinal gene expression qPCR array study in limma. I have 2 groups, each with baseline and week 24 measurements. I also have 2 continuous baseline confounding variables that I would like to include in the model.
I found something similar discussed here: https://stat.ethz.ch/pipermail/bioconductor/2013-October/055651.html
But I'm not sure whether their confounding variable CES is of the same nature as mine, i.e. measured only at baseline. This type of analysis is also more complex than most of the studies I've done previously, so I thought I'd check.
Nevertheless I've conducted the analysis as follows and would like to make sure that I'm on the right path.
design<-model.matrix(~0+time:treatment+confound1+confound2)
colnames(design)[3:6]<-c("t1wk0","t1wk24","t2wk0", "t2wk24")
#produces a design matrix that looks like:
   confound1 confound2 t1wk0 t1wk24 t2wk0 t2wk24
1   4.611723       5.1     1      0     0      0
2   5.752048       4.9     1      0     0      0
3   3.763428       4.9     1      0     0      0
4   5.328380       5.4     1      0     0      0
5   5.419956       5.1     1      0     0      0
6   4.970347       4.7     0      0     1      0
7   3.857935       4.7     1      0     0      0
8   4.927370       3.3     1      0     0      0
9   5.875061       4.1     1      0     0      0
10  4.580925       4.9     0      0     1      0
11  4.442480       4.3     0      0     1      0
12  4.338456       6.3     0      0     1      0
13  4.611723       5.1     0      1     0      0
14  5.752048       4.9     0      1     0      0
15  3.763428       4.9     0      1     0      0
16  5.328380       5.4     0      1     0      0
17  5.419956       5.1     0      1     0      0
18  4.970347       4.7     0      0     0      1
19  3.857935       4.7     0      1     0      0
20  4.927370       3.3     0      1     0      0
21  5.875061       4.1     0      1     0      0
22  4.580925       4.9     0      0     0      1
23  4.442480       4.3     0      0     0      1
24  4.338456       6.3     0      0     0      1
corfit <- duplicateCorrelation(eset,design,block=patient)
corfit$consensus
#where patient is a factor identifying which patient each array came from
fit_adj<-lmFit(eset,design,block=patient,correlation=corfit$consensus)
fit_adj<-eBayes(fit_adj)
#Then pull out the comparisons of interests with specific contrasts:
con.t1<-makeContrasts(t1wk24-t1wk0, levels=design)
con.t2<-makeContrasts(t2wk24-t2wk0, levels=design)
con.t1vt2<-makeContrasts(((t1wk24-t1wk0)-(t2wk24-t2wk0)), levels=design)
#etc
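The contrasts above are defined but the step that applies them to the fit is not shown. A minimal sketch of one way to do it follows, assuming simulated stand-ins for eset, patient and the two covariates (the random data, the seed and the names fit2 and top are illustrative only, not from the actual study). Here the contrasts are combined into one matrix and eBayes is run after contrasts.fit:

```r
library(limma)

set.seed(1)
# Hypothetical stand-ins for the objects in the post: 12 patients, 2 arrays each
patient   <- factor(rep(1:12, times = 2))
time      <- factor(rep(c("wk0", "wk24"), each = 12))
treatment <- factor(rep(c("t1", "t1", "t1", "t1", "t1", "t2",
                          "t1", "t1", "t1", "t2", "t2", "t2"), times = 2))
confound1 <- rep(rnorm(12, mean = 5), times = 2)  # baseline value repeated at wk24
confound2 <- rep(rnorm(12, mean = 5), times = 2)
# Simulated expression matrix: 50 genes x 24 arrays (pure noise, for illustration)
eset <- matrix(rnorm(50 * 24), nrow = 50,
               dimnames = list(paste0("gene", 1:50), NULL))

design <- model.matrix(~0 + time:treatment + confound1 + confound2)
colnames(design)[3:6] <- c("t1wk0", "t1wk24", "t2wk0", "t2wk24")

# Estimate the within-patient correlation and fit with patient as a block
corfit <- duplicateCorrelation(eset, design, block = patient)
fit <- lmFit(eset, design, block = patient,
             correlation = corfit$consensus.correlation)

# All comparisons of interest in a single contrast matrix
contrasts <- makeContrasts(
  t1    = t1wk24 - t1wk0,
  t2    = t2wk24 - t2wk0,
  t1vt2 = (t1wk24 - t1wk0) - (t2wk24 - t2wk0),
  levels = design
)

# Apply the contrasts to the fit, then moderate the statistics
fit2 <- eBayes(contrasts.fit(fit, contrasts))
top  <- topTable(fit2, coef = "t1vt2", number = 10)
```

With real data, topTable(fit2, coef = "t1") and coef = "t2" would give the within-group changes in the same way.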
I think this method is doing okay, since the results make sense and are similar to previous analyses. But I'm concerned that the baseline values of the confounding variables are repeated in the model for both the wk0 and wk24 arrays of each patient. Will this interfere with the analysis?
Thanks
What is the patient blocking factor with respect to the other terms? It's hard to know whether it'll interfere or not if it's unspecified.
It's a factor containing the patient IDs. There are 12 patients with 2 arrays each. You'll notice that the confound1 values repeat from array 13 onwards; this is the start of the wk24 arrays.
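One way to see how the patient factor relates to the other terms is to compare the rank of the design as posted with a hypothetical design that puts patient in as a fixed effect. The sketch below uses base R only and rebuilds the covariates from the design matrix in the question; design_fixed is an illustrative alternative, not something the original analysis used. Because the baseline covariates (and treatment) are constant within a patient, they would be aliased with fixed patient effects, whereas the posted design, which handles patient through the block argument, keeps full column rank.

```r
# Covariate values copied from the design matrix in the question;
# arrays 13-24 (wk24) repeat the baseline values of arrays 1-12 (wk0)
confound1 <- rep(c(4.611723, 5.752048, 3.763428, 5.328380, 5.419956, 4.970347,
                   3.857935, 4.927370, 5.875061, 4.580925, 4.442480, 4.338456),
                 times = 2)
confound2 <- rep(c(5.1, 4.9, 4.9, 5.4, 5.1, 4.7, 4.7, 3.3, 4.1, 4.9, 4.3, 6.3),
                 times = 2)
patient   <- factor(rep(1:12, times = 2))
time      <- factor(rep(c("wk0", "wk24"), each = 12))
treatment <- factor(rep(c("t1", "t1", "t1", "t1", "t1", "t2",
                          "t1", "t1", "t1", "t2", "t2", "t2"), times = 2))

# Design as posted: patient is not a column, it enters via block= in lmFit
design <- model.matrix(~0 + time:treatment + confound1 + confound2)
full_rank_posted <- qr(design)$rank == ncol(design)

# Hypothetical alternative with patient as a fixed effect
design_fixed <- model.matrix(~0 + patient + time:treatment + confound1 + confound2)
rank_deficient_fixed <- qr(design_fixed)$rank < ncol(design_fixed)

full_rank_posted       # TRUE: every coefficient in the posted design is estimable
rank_deficient_fixed   # TRUE: patient-level covariates are aliased with patient
```

This only checks estimability; it does not by itself say how well the covariate effects are estimated with 12 patients.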