I have reviewed the LIMMA manual (esp. sections 9.6.1 and 9.6.2), and need conceptual help in applying either this approach to my experimental design or suggestions to use an alternative approach.
My microarray RNA data is as follows: Two tissue types, five time points (0, 6, 24, 72, 168 hours), ~20 subjects, with each subject having up to 3 samples collected for each tissue at various time points (IRB restrictions). A wound is made and the time zero collection is the "baseline" expression profile for each tissue.
I wanted to ask the following questions but can not wrap my head around the best approach in terms of setting up the design matrix. Questions:
1) How do the tissues independently respond to wounding over time?
I could compare each tissue after wounding to it's baseline time 0hr doing various contrasts (M6-M0, M24-M0, M72-M0, etc), but the issue I find here is that the two tissues are completely different at time 0hr. This means my "control" or "baseline" that I'm comparing subsequent time points to is different between the two tissues and that I cannot really compare the results (like an intersect or overlap) between the two tissues. Would it make sense to group M0 and C0 as a control group?
Another approach is to compare changes over time using these contrasts (M6-M0, M24-M6, M72-M24, etc.) for each tissue. This leads to four comparisons for each tissue (8 total).
2) How do the two tissues differ in their response to wounding over time?
For question two I thought I could compare each tissue at each time point using contrasts (M0-C0, M6-C6, M24-C24, etc), but this is also a lot of comparisons (4) so I thought a spline based approach to model general changes over time would be more tractable. The positive aspect of the spline is I get changes in expression over time that follow different trends between tissues in one comparison. The downside is that the only way (I can think of) to group genes changing in the same manner over time is to cluster the significant results after adj.P.Val and lfc cutoff. Please see below:
pData$X <- ns(pData$time, df=3) #Setup model matrix and rename columns ":" to "." design <- model.matrix(~0+tissue+tissue:X, eset) colnames(design)  "tissueC" "tissueM" "tissueC.X1" "tissueM.X1" "tissueC.X2" "tissueM.X2"  "tissueC.X3" "tissueM.X3" # Estimate the correlation between measurements made on the same subject: corfit <- duplicateCorrelation(eset,design,block=eset$subject) corfit$consensus # Fit the model fit <- lmFit(eset,design,block=pData$subject,correlation=corfit$consensus) fit2 <- eBayes(fit) table1 <- topTable(fit2, coef=c(3:8), number=Inf, adjust.method="BH", sort.by="none")
3) I'm wondering how different is the table1 output (conceptually) than the table2 output from the code below in terms of what it is reporting?
cont.mat = makeContrasts(tissueM.X3-tissueC.X3, tissueM.X2-tissueC.X2, tissueM.X1-tissueC.X1, levels = design) fit.cont =contrasts.fit(fit, cont.mat) fit.cont=eBayes(fit.cont) table2 <- topTable(fit.cont, number = Inf, adjust.method = "BH", sort.by = "none")
4) In order to look for genes changing over time in tissue M is the code below correct?
tableM <- topTable(fit2, coef=c(4, 6, 8), number = Inf, adjust.method = "BH", sort.by = "none")
5) In order to look for genes changing over time in tissue C is the code below correct?
tableC <- topTable(fit2, coef=c(3, 5, 7), number = Inf, adjust.method = "BH", sort.by = "none")