Hello, I have performed a proteomic array and I wanted to analyze it with the limma package. The design of the experiment is a bit complicated and I wanted to make sure that my approach is correct. My dataset compromises data from 18 subjects in 3 time-points. Half of the subjects were treated with a low dose of a treatment and the other half were treated with a high dose. Therefore, my data looks like:
Patient Time Treat_group
2 Time1 Lowdose
2 Time2 Lowdose
2 Time3 Lowdose
3 Time1 Lowdose
3 Time2 Lowdose
3 Time3 Lowdose
4 Time1 Lowdose
4 Time2 Lowdose
4 Time3 Lowdose
5 Time1 Highdose
5 Time2 Highdose
5 Time3 Highdose
6 Time1 Highdose
6 Time2 Highdose
6 Time3 Highdose
7 Time1 Highdose
7 Time2 Highdose
7 Time3 Highdose
I would like to identify proteins that change in expression over the time in both treatments and also proteins that change differentially over time in the high dose in comparison to the low dose, always taking into account the samples that are from the same patient. First, I have followed the section 9.7 multi-level experiments from the limmat manual in order to do pairwise comparisons:
Treat <- factor(paste(targets$Time,targets$Treat_group,sep="."))
design <- model.matrix(~0+Treat)
corfit <- duplicateCorrelation(y,design,block=targets$Patient)
fit <- lmFit(y,design,block=targets$Patient,correlation=corfit$consensus)
cm <- makeContrasts(
HighvsLowfortime1 = Time1.Highdose-Time1.Lowdose,
HighvsLowfortime2= Time2.Highdose-Time2.Lowdose,
HighvsLowfortime3= Time3.Highdose-Time3.Lowdose,
Time1vsTime2forlow= Time1.Lowdose-Time2.Lowdose,
Time1vsTime3forlow= Time1.Lowdose-Time3.Lowdose,
Time2vsTime3forlow= Time2.Lowdose-Time3.Lowdose,
Time1vsTime2forhigh= Time1.Highdose-Time2.Highdose,
Time1vsTime3forhigh= Time1.Highdose-Time3.Highdose,
Time2vsTime3forhigh= Time2.Highdose-Time3.Highdose,
levels=design)
fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2)
topTable(fit2, coef="HighvsLowfortime1", adjust="BH") #The same with all the comparisons
In order to see which proteins changed over time in the low dose and then the high dose I have created the following contrast matrix and then calculated the F-test between the 3 times in each group separately:
#LOW dose
cm1 <- makeContrasts(
Time1vsTime2forlow= Time2.Lowdose-Time1.Lowdose,
Time2vsTime3forlow= Time3.Lowdose-Time2.Lowdose,
Time1vsTime3forlow= Time3.Lowdose-Time1.Lowdose,
levels=design)
fit2 <- contrasts.fit(fit, cm1)
fit2 <- eBayes(fit2)
topTable(fit2)
#HIGH DOSIS
cm2 <- makeContrasts(
Time1vsTime2forhigh= Time2.Highdose-Time1.Highdose,
Time2vsTime3forhigh= Time3.Highdose-Time2.Highdose,
Time1vsTime3forhigh= Time3.Highdose-Time1.Highdose,
levels=design)
fit2 <- contrasts.fit(fit, cm2)
fit2 <- eBayes(fit2)
topTable(fit2)
Finally, to evaluate the proteins that change diferentially over time between the high dose and the low dose I have tried two designs and I am not sure which is better as the results are a bit different:
# First option: Continue with a multi-level experiment approach, create a contrast matrix with the dose comparisons in each time and then calculate the F-statistics:
cm3 <- makeContrasts(
HighvsLowfortime1 = Time1.Highdose-Time1.Lowdose,
HighvsLowfortime2= Time2.Highdose-Time2.Lowdose,
HighvsLowfortime3= Time3.Highdose-Time3.Lowdose,
levels=design)
fit2 <- contrasts.fit(fit, cm3)
fit2 <- eBayes(fit2)
topTable(fit2)
#Second option: Try the approach of the section 9.6.2 Many time points from the limma user manual treating the Patient as a random effect.
library(splines)
Group <- as.factor(targets$Treat_group) #Here I have no specified the freedom degrees, I think that as a default with my data df=1, I have tried to increased the df to 3 as the limma manual recommends but I obtaine the follow error in R when I build the lm. I am not sure if this is because the limited number of time points I have:
#Coefficients not estimable: X3 GroupLowdose:X3
#Warning message:
Partial NA coefficients for 438 probe(s) )
design <- model.matrix(~Group*X)
corfit <- duplicateCorrelation(y,design,block=targets$Patient) #Estimem la correlaci? entre mesures fetes en el mateix pacient
corfit$consensus
fit <- lmFit(y,design,block=targets$Patient,correlation=corfit$consensus)
fit <- eBayes(fit)
topTable(fit, coef=4) #I am not sure if the coefficient I have to select is number 4 but is the one from interaction
I hope anyone can help me solving my doubts and specially, confirming that I am analyzing the data correctly. Alternatively, if you think another strategy would be better let me know. I hope the code is clear but if something is not let me know,
Thank you very much for your help and sorry for the long post,