Hello everyone, Following reading the DESeq2 vignette and multiple posts in this forum, I still need to post my question regarding analysis design. In the experiment run by my colleagues, there are three factors:
Species (A, B)
Treat (Mock (M), Inoculation (I))
Time (T1,T2,T3,T4,T5,T6,T7,T8)
Each of the two Species has both Treats at each Time points and replicated between 1 to 4 times.
The Questions are:
What genes are differentially expressed for each of the two Species at each of the eight Time points when Inoculation compared to Mock? For example:
AIT5 vs AMT5
What genes are differentially expressed when the two Species are compared to each other at each Time points? This can be achieved in two ways:
Comparing the Inoculations only: AIT5 vs BIT5 Subtracting the Mocks (Interactions): (AIT5 - AMT5) vs (BIT5 - BMT5)
In doing so, what would be best full and reduced models? If I combine the factors first, like paste(Species, Treat, Time, sep="") giving SpeciesTreat_Time, the full l model would be as following:
dds <- DESeqDataSetFromMatrix(countData=data, colData=meta, design = ~ Species_Treat_Time)
but, how about the reduced model?
Or should I only combine the first two factors (Species, Treat) and write the full and reduced models as below?
dds <- DESeqDataSetFromMatrix(countData=data, colData=meta, design=~Species_Treat + Time + Species_Treat:Time)
dds_lrt <- DESeq(dds, test="LRT", reduced = ~ Species_Treat + Time)
In this case, resultsNames(dds_lrt) are as following:
"Intercept" "Species_Treat_B.I_vs_B.M" "Species_Treat_A.M_vs_B.M" "Species_Treat_A.I_vs_B.M"
"Time_2_vs_1" "Time_3_vs_1" "Time_4_vs_1" "Time_5_vs_1"
"Time_6_vs_1" "Time_7_vs_1" "Time_8_vs_1"
"Species_TreatB.I.Time2" "Species_TreatA.M.Time2" "Species_TreatA.I.Time2"
"Species_TreatB.I.Time3" "Species_TreatA.M.Time3" "Species_TreatA.I.Time3"
"Species_TreatB.I.Time4" "Species_TreatA.M.Time4" "Species_TreatA.I.Time4"
"Species_TreatB.I.Time5" "Species_TreatA.M.Time5" "Species_TreatA.I.Time5"
"Species_TreatB.I.Time6" "Species_TreatA.M.Time6" "Species_TreatA.I.Time6"
"Species_TreatB.I.Time7" "Species_TreatA.M.Time7" "Species_TreatA.I.Time7"
"Species_TreatB.I.Time8" "Species_TreatA.M.Time8" "Species_TreatA.I.Time8"
The above generates when Species B, Treat M and Time 1 is considered as reference.
Or, should I put all the factors separately into the model? Here, it will be confusing to define the full and reduced models.
I thank you for your help on this post.
Dear Michael, Thank you for your advice in addressing my questions. as per your recommendation, when I applied DESeq function on the whole dds object created as above.
Species Part
I will have the following coefficients for each of the two dds objects, e.g. in dds_setA:
Does contrast = list(c("Time5.TreatI")) in dds_setA compare Inoculation vs Mock at time-point 5 in species A ? if so, does it use Time1 and TreatM as the baseline? What if I want to use Time5 and TreatM as the baseline? What two coefficients will go into the list to do the following comparisons:
AIT5 vs AMT5
BIT5 vs BMT5
Interaction Part
given:
In the interaction part (dds_ALL), there are the following coefficients:
Does contrast = list(c("Time5.TreatI.SpeciesA")) in dds_ALL give the interaction that I want: (AIT5 - AMT5) vs (BIT5 - BMT5)? If not what two coefficients go into the list?
Thank you so much for your time on this post.
Please consult with a statistician for further explanation of the terms in an interaction model, beyond the sections of the vignette and workflow, I think if users are still not sure what is represented by the coefficients, they really should be collaborating to make sure they get it correct.