Question: Time-series analysis with DESeq2
0
gravatar for Assa Yeroslaviz
3 months ago by
Assa Yeroslaviz1.4k
Munich, Germany
Assa Yeroslaviz1.4k wrote:

I was wondering how one can do a deseq2 analysis of multiple time points when trying to test for a WT vs TREAT difference at any of the given times. Let's say I have three time points (1h,2h,3h) and two conditions (WT, TREAT)

In edgeR one can create a contrast matrix and pass the complete matrix to the glm() function.

con<- makeContrasts(
    TP1.WTvs.T  = T1.T - T1.WT,
    TP2.WTvs.T  = T2.T - T2.WT,
    TP3.WTvs.T  = T3.T - T3.WT, levels=design)

glmQLFTest(fit, contrast = TP1.WTvs.T) # single TP comparison
glmQLFTest(fit, contrast = con)        # compare all time points in one go.

With this contrast matrix i can call each contrast separately to detect genes differentially expressed in each of the single time-points or to compare changes in all TP.

I'm interested to understand how this can be done using the DESeq2 package.

I know i can use the interaction terms after running the results() command. something like:

 resultsNames(ddsTC)

##  [1] "Intercept"           "strain_T_vs_WT"    "time_TP1_vs_TP0"      "time_TP2_vs_TP0"     
##  [5] "..."      "..."     "..."     "strainT.timeTP1" 
##  [9] "strainT.timeTP2"  ...

By calling results(dds, name="time_TP2_vs_TP0") I can detect all genes in WT whcih ae changed between TP2 and TP0.

The call for results(dds, name="strainT.timeTP1") will test if the Treat vs WT fold change is different at TP1 than at TP0.

Where i'm not sure is by following this logic, how can I do the same I did with edger? How do I identify genes which are changed across all three time points?

thanks Assa

ADD COMMENTlink modified 3 months ago by Michael Love23k • written 3 months ago by Assa Yeroslaviz1.4k
Answer: Time-series analysis with DESeq2
1
gravatar for Michael Love
3 months ago by
Michael Love23k
United States
Michael Love23k wrote:

What you want to do is described in the time series example in the workflow. You can perform an LRT as shown there.

ADD COMMENTlink written 3 months ago by Michael Love23k

thanks Michael for the fast response.

so transferring it from the workflow, something like this should do it for me?

ddsTC <- DESeqDataSet(counts, ~ strain + time + strain:time)
ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + time)

Does this apply only for cases where I have separate WT and treated samples for each of the time points?

ADD REPLYlink written 3 months ago by Assa Yeroslaviz1.4k

Yes.

Yes, you need WT and treated samples at every time point.

ADD REPLYlink written 3 months ago by Michael Love23k

great. Is there an interaction where I can look for genes with significant changes in all time points? I mean a constant trendover all time points>

ADD REPLYlink written 3 months ago by Assa Yeroslaviz1.4k

great. Is there an interaction where I can look for genes with significant changes in all time points? I mean a constant trend over all time points>

ADD REPLYlink written 3 months ago by Assa Yeroslaviz1.4k

great. Is there an interaction where I can look for genes with significant changes in all time points? I mean a constant trend over all time points?

ADD REPLYlink written 3 months ago by Assa Yeroslaviz1.4k

There's not really a term for that. You may want to meet with a statistician if you have further questions about how the linear model coefficients in a two group time series work.

ADD REPLYlink written 3 months ago by Michael Love23k

Let's say I have this sample information

          condtion timepoint replicate  group
Sample_1        WT        0h         1  WT_0h
Sample_2        WT        0h         2  WT_0h
Sample_3        WT        0h         3  WT_0h
Sample_4        WT        1h         1  WT_1h
...
Sample_10       WT        8h         1  WT_8h
Sample_11       WT        8h         2  WT_8h
Sample_12       WT        8h         3  WT_8h
Sample_13      KO1        0h         1 KO1_0h
Sample_14      KO1        0h         2 KO1_0h
...
Sample_45      KO3        4h         3 KO3_4h
Sample_46      KO3        8h         1 KO3_8h
Sample_47      KO3        8h         2 KO3_8h
Sample_48      KO3        8h         3 KO3_8h

And I'm analyzing this with Wald and LRT to get the following interaction terms

 dds <- DESeqDataSetFromMatrix(countData = counts,
                          colData = metadata,
                          design = ~ group)
 dds <- DESeq(object = dds, test = "Wald" ) 

res1 <- results(object = dds, contrast=c("group", "KO1_0h" ,"WT_0h"), alpha = 0.001)
res2 <- results(object = dds, contrast=c("group", "KO2_0h" ,"WT_0h"), alpha = 0.001)
res3 <- results(object = dds, contrast=c("group", "KO3_0h" ,"WT_0h"), alpha = 0.001)
...

and than again

dds.lrt <- DESeqDataSetFromMatrix(countData = counts,
                          colData = metadata,
                           design = ~ condtion + timepoint + condtion:timepoint)

dds.lrt <- DESeq(object = dds.lrt, test = "LRT", reduced = ~ condtion + timepoint)

resultsNames(dds.lrt)
 [1] "Intercept"               "condtion_KO1_vs_WT"      "condtion_KO2_vs_WT"      "condtion_KO3_vs_WT"      "timepoint_1h_vs_0h"      "timepoint_4h_vs_0h"     
 [7] "timepoint_8h_vs_0h"      "condtionKO1.timepoint1h" "condtionKO2.timepoint1h" "condtionKO3.timepoint1h" "condtionKO1.timepoint4h" "condtionKO2.timepoint4h"
[13] "condtionKO3.timepoint4h" "condtionKO1.timepoint8h" "condtionKO2.timepoint8h" "condtionKO3.timepoint8h"

res.lrt.1 <- results(object = dds.lrt, name = "condtionKO1.timepoint1h", alpha = 0.001)

In the above test I compare the timepoints and conditions on a pair-wise basis. In the second one I use the LRT test model. "condtion_KO1_vs_WT" gives me genes changed over all timepoints between KO1 and WT, "condtionKO1.timepoint1h" would give me the differences between KO1 vs WT at TP1, controlling for baseline.

What would be the difference between res1 and res.lrt.1? and more importantly, which of the two options would make more sense to analyze?

ADD REPLYlink modified 3 months ago • written 3 months ago by Assa Yeroslaviz1.4k
1

Hi Assa,

I would recommend at this point that you consult with a statistician. I don't have unlimited time for the support site, and so I need to restrict myself to software related questions. Your questions are important, but they have to do with linear modeling, and how different designs answer different questions. If the documentation and workflows we have is not sufficient to bridge the gap, what you would benefit from is to sit down with a statistician or anyone with experience in linear modeling, and they can explain how different coefficients work in a linear model.

ADD REPLYlink written 3 months ago by Michael Love23k

thank you for the time.

ADD REPLYlink modified 3 months ago • written 3 months ago by Assa Yeroslaviz1.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 300 users visited in the last hour