Question: Three-factor design with DESeq2
0
gravatar for h.valipour.k
6 months ago by
h.valipour.k0 wrote:

Hello everyone, Following reading the DESeq2 vignette and multiple posts in this forum, I still need to post my question regarding analysis design. In the experiment run by my colleagues, there are three factors:

Species (A, B)

Treat (Mock (M), Inoculation (I))

Time (T1,T2,T3,T4,T5,T6,T7,T8)

Each of the two Species has both Treats at each Time points and replicated between 1 to 4 times.

The Questions are:

What genes are differentially expressed for each of the two Species at each of the eight Time points when Inoculation compared to Mock? For example:

AIT5 vs AMT5

What genes are differentially expressed when the two Species are compared to each other at each Time points? This can be achieved in two ways:

Comparing the Inoculations only: AIT5 vs BIT5 Subtracting the Mocks (Interactions): (AIT5 - AMT5) vs (BIT5 - BMT5)

In doing so, what would be best full and reduced models? If I combine the factors first, like paste(Species, Treat, Time, sep="") giving SpeciesTreat_Time, the full l model would be as following:

    dds <- DESeqDataSetFromMatrix(countData=data, colData=meta, design = ~ Species_Treat_Time)

but, how about the reduced model?

Or should I only combine the first two factors (Species, Treat) and write the full and reduced models as below?

    dds <- DESeqDataSetFromMatrix(countData=data, colData=meta, design=~Species_Treat + Time + Species_Treat:Time)
    dds_lrt <- DESeq(dds, test="LRT", reduced = ~ Species_Treat + Time)
In this case, resultsNames(dds_lrt) are as following:

    "Intercept" "Species_Treat_B.I_vs_B.M" "Species_Treat_A.M_vs_B.M" "Species_Treat_A.I_vs_B.M"
    "Time_2_vs_1" "Time_3_vs_1" "Time_4_vs_1" "Time_5_vs_1"  
    "Time_6_vs_1" "Time_7_vs_1" "Time_8_vs_1" 
    "Species_TreatB.I.Time2" "Species_TreatA.M.Time2" "Species_TreatA.I.Time2" 
    "Species_TreatB.I.Time3" "Species_TreatA.M.Time3" "Species_TreatA.I.Time3" 
    "Species_TreatB.I.Time4" "Species_TreatA.M.Time4" "Species_TreatA.I.Time4"
    "Species_TreatB.I.Time5" "Species_TreatA.M.Time5" "Species_TreatA.I.Time5"
    "Species_TreatB.I.Time6" "Species_TreatA.M.Time6" "Species_TreatA.I.Time6"
    "Species_TreatB.I.Time7" "Species_TreatA.M.Time7" "Species_TreatA.I.Time7"
    "Species_TreatB.I.Time8" "Species_TreatA.M.Time8" "Species_TreatA.I.Time8"

The above generates when Species B, Treat M and Time 1 is considered as reference.

Or, should I put all the factors separately into the model? Here, it will be confusing to define the full and reduced models.

I thank you for your help on this post.

deseq2 time-course • 161 views
ADD COMMENTlink modified 6 months ago by Michael Love25k • written 6 months ago by h.valipour.k0
Answer: Three-factor design with DESeq2
1
gravatar for Michael Love
6 months ago by
Michael Love25k
United States
Michael Love25k wrote:

One recommendation I'd have for you is that many of your questions can be easily answered by running a model with a subset of the data. E.g to answer the first question, subset to species A and run a simple time series model with ~time + treatment + treatment:time. This will answer your question directly, and you can look up individual time points (you can just read off the coefficients for time point differences due to treatment`.

That just leaves the per-time point interactions. For this, I'd recommend using all the samples in one dataset with a design including second order interactions, e.g.

~time + treatment + species + treatment:time + species:time + species:treatment + species:treatment:time

For each time point you should have an interaction term telling you if the I vs M effect was the same across species or not.

ADD COMMENTlink written 6 months ago by Michael Love25k

Dear Michael, Thank you for your advice in addressing my questions. as per your recommendation, when I applied DESeq function on the whole dds object created as above.

dds_setA <- DESeq(dds_setA) # subset   for species A
dds_setB <- DESeq(dds_setB) # subset   for species B
dds_ALL  <- DESeq(dds_ALL)  # combined for interaction

Species Part

I will have the following coefficients for each of the two dds objects, e.g. in dds_setA:

> resultsNames(dds_setA)
 [1] "Intercept"    "Time_2_vs_1"   "Time_3_vs_1"   "Time_4_vs_1"  
 [5] "Time_5_vs_1"  "Time_6_vs_1"   "Time_7_vs_1"   "Time_8_vs_1" 
[9] "Treat_I_vs_M"  "Time2.TreatI"  "Time3.TreatI"  "Time4.TreatI" 
[13] "Time5.TreatI" "Time6.TreatI"  "Time7.TreatI"  "Time8.TreatI"

Does contrast = list(c("Time5.TreatI")) in dds_setA compare Inoculation vs Mock at time-point 5 in species A ? if so, does it use Time1 and TreatM as the baseline? What if I want to use Time5 and TreatM as the baseline? What two coefficients will go into the list to do the following comparisons:

AIT5 vs AMT5

BIT5 vs BMT5

Interaction Part

given:

  • Species (A, B); with B the Reference
  • Treat Inoculation (I)), Mock (M); with M the reference
  • Time (T1,T2,T3,T4,T5,T6,T7,T8); with T1 the reference

In the interaction part (dds_ALL), there are the following coefficients:

> resultsNames(dds_ALL)
 [1] "Intercept"    "Time_2_vs_1"   "Time_3_vs_1"     
 [4] "Time_4_vs_1"  "Time_5_vs_1"   "Time_6_vs_1" 
 [7] "Time_7_vs_1"  "Time_8_vs_1"   "Treat_I_vs_M" 
[10] "Species_A_vs_B"   "Time2.TreatI"  "Time3.TreatI" 
[13] "Time4.TreatI" "Time5.TreatI"  "Time6.TreatI"
[16] "Time7.TreatI" "Time8.TreatI"  "Time2.SpeciesA"
[19] "Time3.SpeciesA"   "Time4.SpeciesA"    "Time5.SpeciesA" 
[22] "Time6.SpeciesA"   "Time7.SpeciesA"    "Time8.SpeciesA" 
[25] "TreatI.SpeciesA"  "Time2.TreatI.SpeciesA" "Time3.TreatI.SpeciesA" 
[28] "Time4.TreatI.SpeciesA"    "Time5.TreatI.SpeciesA" "Time6.TreatI.SpeciesA"
[31] "Time7.TreatI.SpeciesA"    "Time8.TreatI.SpeciesA"

Does contrast = list(c("Time5.TreatI.SpeciesA")) in dds_ALL give the interaction that I want: (AIT5 - AMT5) vs (BIT5 - BMT5)? If not what two coefficients go into the list?

Thank you so much for your time on this post.

ADD REPLYlink modified 5 months ago • written 5 months ago by h.valipour.k0

Please consult with a statistician for further explanation of the terms in an interaction model, beyond the sections of the vignette and workflow, I think if users are still not sure what is represented by the coefficients, they really should be collaborating to make sure they get it correct.

ADD REPLYlink written 5 months ago by Michael Love25k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 471 users visited in the last hour