Hi Limma experts, This is my first time using Limma and I am struggling to determine which contrasts to construct to answer my research questions/ if I have chosen the correct design matrix to most effectively answer my questions. The two questions I have:
- Is there a treatment effect?
- Is there a temporal effect?
My data and study design: Treatment (control, low, high), and Time (day minus 5, day 5, day 15, and day 30). The same pool of subjects are repeatedly sampled at each time point within each treatment, but they are sampled independently of each other (there is no way to know if it is the same individuals are being sampled at each time point). It's important to note that the experiment started on day 0, so day minus 5 (m5) is a temporal reference, prior to exposure.
I have decided to use model.matrix(~0 + group) after referencing Law (2020), and constructed groups: C_m5, C_5, C_15, C_30, L_m5, L_5, L_15, L_30, H_m5, H_5, H_15, H_30. However, there are a few other options I believe could be more useful for my case including model.matrix(~treatment*time) or model.matrix(~0+treatment+treatment:time), but I do not understand the workings of contrasts well enough to know which is best.
For my current model I chose (model.matrix(~0 + group)), I created a contrast matrix with all pairwise comparisons between groups... which looks pretty disturbing as you'll see below. I'm very unsure if this is the most effective way to make the contrasts, or if i would be better off making contrasts in another fashion, such as using interactions between treatment and time with one of the other models.
I'll Include the code of what I have thus far for limma, it's pretty ugly with all of the contrasts, which makes me feel it is wrong. I am hoping that someone is able to provide some feedback on what i have done, and if there is a different direction I should be taking. Further, with the results from decideTests, I am unable to make a vennDiagram, I will show the error below, but i am assuming it is because of how I made the contrasts, so it is not picking up my two factors (treatment and time).
Thank you in advanced for taking the time to help me today!
group <- interaction(treatment, time)
mm <- model.matrix(~0 + group)
fit <- lmFit(log_mucus_cL, mm)
#all pairwise comparisons...
cm <- makeContrasts(groupC._5vsgroupH._5 = groupH._5 - groupC._5,
groupC._5vsgroupL._5 = groupL._5 - groupC._5,
groupC._5vsgroupC.15 = groupC.15 - groupC._5,
groupC._5vsgroupH.15 = groupH.15 - groupC._5,
groupC._5vsgroupL.15 = groupL.15 - groupC._5,
groupC._5vsgroupC.30 = groupC.30 - groupC._5,
groupC._5vsgroupH.30 = groupH.30 - groupC._5,
groupC._5vsgroupL.30 = groupL.30 - groupC._5,
groupC._5vsgroupC.m5 = groupC._5 - groupC.m5,
groupC._5vsgroupH.m5 = groupC._5 - groupH.m5,
groupC._5vsgroupL.m5 = groupC._5 - groupL.m5,
groupH._5vsgroupL._5 = groupH._5 - groupL._5,
groupH._5vsgroupC.15 = groupH._5 - groupC.15,
groupH._5vsgroupH.15 = groupH.15 - groupH._5,
groupH._5vsgroupL.15 = groupH._5 - groupL.15,
groupH._5vsgroupC.30 = groupH._5 - groupC.30,
groupH._5vsgroupH.30 = groupH.30 - groupH._5,
groupH._5vsgroupL.30 = groupH._5 - groupL.30,
groupH._5vsgroupC.m5 = groupH._5 - groupC.m5,
groupH._5vsgroupH.m5 = groupH._5 - groupH.m5,
groupH._5vsgroupL.m5 = groupH._5 - groupL.m5,
groupL._5vsgroupC.15 = groupL._5 - groupC.15,
groupL._5vsgroupH.15 = groupH.15 - groupL._5,
groupL._5vsgroupL.15 = groupL.15 - groupL._5,
groupL._5vsgroupC.30 = groupL._5 - groupC.30,
groupL._5vsgroupH.30 = groupH.30 - groupL._5,
groupL._5vsgroupL.30 = groupL.30 - groupL._5,
groupL._5vsgroupC.m5 = groupL._5 - groupC.m5,
groupL._5vsgroupH.m5 = groupL._5 - groupH.m5,
groupL._5vsgroupL.m5 = groupL._5 - groupL.m5,
groupC.15vsgroupH.15 = groupC.15 - groupH.15,
groupC.15vsgroupL.15 = groupC.15 - groupL.15,
groupC.15vsgroupC.30 = groupC.30 - groupC.15,
groupC.15vsgroupH.30 = groupH.30 - groupC.15,
groupC.15vsgroupL.30 = groupL.30 - groupC.15,
groupC.15vsgroupC.m5 = groupC.15 - groupC.m5,
groupC.15vsgroupH.m5 = groupC.15 - groupH.m5,
groupC.15vsgroupL.m5 = groupC.15 - groupL.m5,
groupH.15vsgroupL.15 = groupH.15 - groupL.15,
groupH.15vsgroupC.30 = groupH.15 - groupC.30,
groupH.15vsgroupH.30 = groupH.30 - groupH.15,
groupH.15vsgroupL.30 = groupH.15 - groupL.30,
groupH.15vsgroupC.m5 = groupH.15 - groupC.m5,
groupH.15vsgroupH.m5 = groupH.15 - groupH.m5,
groupH.15vsgroupL.m5 = groupH.15 - groupL.m5,
groupL.15vsgroupC.30 = groupL.15 - groupC.30,
groupL.15vsgroupH.30 = groupH.30 - groupL.15,
groupL.15vsgroupL.30 = groupL.30 - groupL.15,
groupL.15vsgroupC.m5 = groupL.15 - groupC.m5,
groupL.15vsgroupH.m5 = groupL.15 - groupH.m5,
groupL.15vsgroupL.m5 = groupL.15 - groupL.m5,
groupC.30vsgroupH.30 = groupH.30 - groupC.30,
groupC.30vsgroupL.30 = groupL.30 - groupC.30,
groupC.30vsgroupC.m5 = groupC.30 - groupC.m5,
groupC.30vsgroupH.m5 = groupC.30 - groupH.m5,
groupC.30vsgroupL.m5 = groupC.30 - groupL.m5,
groupH.30vsgroupL.30 = groupH.30 - groupL.30,
groupH.30vsgroupC.m5 = groupH.30 - groupC.m5,
groupH.30vsgroupH.m5 = groupH.30 - groupH.m5,
groupH.30vsgroupL.m5 = groupH.30 - groupL.m5,
groupL.30vsgroupC.m5 = groupL.30 - groupC.m5,
groupL.30vsgroupH.m5 = groupL.30 - groupH.m5,
groupL.30vsgroupL.m5 = groupL.30 - groupL.m5,
groupC.m5vsgroupH.m5 = groupC.m5 - groupH.m5,
groupC.m5vsgroupL.m5 = groupC.m5 - groupL.m5,
groupH.m5vsgroupL.m5 = groupH.m5 - groupL.m5,
levels = mm)
# Fit the contrasts
fit2 <- contrasts.fit(fit, contrasts = cm)
# Calculate the t-statistics for the contrasts
fit2 <- eBayes(fit2)
# Summarize results
results <- decideTests(fit2)
summary(results)
library(VennDiagram)
vennDiagram(results)
error messages
#Error in matrix(0, noutcomes, ncontrasts) :
#invalid 'nrow' value (too large or NA)
sessionInfo( )
Hi Gordon, thanks for taking time to respond to my question. Yep, I'm trying to do the latter for my analysis. I know how I want my scientific analysis to look, so my question is how to construct the contrasts to answer those questions, and if I was using the right model to do it effectively. As I said, I just started using this package, the knowledge gap here is in generating the contrasts to answer my question. In my mind, testing for treatment differences at each time would look like this, which is 12 tests instead of the 4 you stated, so clearly I am missing something about contrasts:
Further, this doesn't test for temporal differences within treatments, which would be further tests. Perhaps there is an example of writing contrasts in a similar matter that you can direct me to?
Again, thank you for your time.
To test for DE at day -5 you could use:
That yields one anova test for each gene. You could choose any pair of contrasts at the same time to get the same test results.
There are many equivalent ways to setup the design matrix and contrasts for the same purpose, but this way is an easy as any and has the advantage of being explicit.
Hi Gordon,
I was hoping for clarification on your comment, I don't understand what is happening exactly in your script above.
Does this mean there is an anova test comparing the mean expression of each gene for each contrast specified (LvsC_m5 and HvsC_m5)? So when you include two or more coefficients in the topTable function, an anova is conducted for each gene between each coefficient?
I'll provide an example of what I'm trying to figure out: I am trying to understand if there is a temporal and treatment effect in the high treatment relative to the control on day 30 vs day 5. I put the two contrasts in the topTable below, but I just don't know if this format is the correct one to answer my question.
So, if my understand is correct, the results of "d30vsd5" show the top DE genes when comparing both treatment (high vs control), and time (day 30 vs day 5)?
Cheers.
No, that isn't what "anova" means. anova means that a single test is done for all possible comparisons between the three treatment groups. This is described in the limma documentation.
In my example I provided in the most recent comment (d30vsd5), is the result of this topTable showing the top DE genes from both the H30vsH_5 and C30vsC_5 contrasts? If not, what is the topTable showing?
Cheers