Question

DESeq2, time-course interpretation

0

Entering edit mode

smac97 • 0

@8eea9978

Last seen 9 months ago

United Kingdom

Hi all,

I had a question about DESeq2, or a little sanity check.

In my case I'm only interested in comparing how expression differs between conditions at specific time-points. I believe I can do comparisons with combined factors e.g. Responder_Week1 v NonResponder_Week1 - but I can also do LRT for full time-course comparisons? Is there a particular method that would be better suited? I believe that I can get similar results from LRT by using the names - however I think I've confused myself further - so I could use a hand.

#Code I've been using
#Using response to treatment and time-point as the design for DESeq
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~response + time + response:time) 

#Using the time-point as the reducement, which means we are testing the effect of the response?
dds <- DESeq(dds, test = "LRT", reduced = ~time) 

#Get the results, 
results(dds) # However, this will be for all, but we can get a specific comparison by using name and Wald test
results(dds, name = "responseResponder.timeW1", test = "Wald") 
#But then, is this comparing a Non-Responder to Responder at Week 1 or Responder at Week 1 against the Base time-point?

#OR, by combined factors, where condition = e.g. Responder_Week1 
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~condition) 

#And now results is simply
results(dds, c("condition","Responder_Week1", "NonResponder_Week1"))

Please do correct me, direct me, or criticise me - I'm only hoping to learn more and do this correctly :)

I hope this makes sense, if there's any confusion please feel free to ask (I am confused myself, hence my reaching out!)

Transcriptomics TimeCourse DESeq2 • 1.3k views

ADD COMMENT • link updated 9 months ago by Michael Love 41k • written 9 months ago by smac97 • 0

score 1 · Answer 1 · 2023-07-08

1

Entering edit mode

Michael Love 41k

@mikelove

Last seen 1 hour ago

United States

Check out the time-series example in the RNA-seq workflow:

https://bioconductor.org/packages/release/workflows/html/rnaseqGene.html

LRT is useful for finding differences between two series at any time point.

ADD COMMENT • link 9 months ago Michael Love 41k

0

Entering edit mode

Thanks for the reply! Really helpful, just to check on the "reduce" section for the LRT, it says:

The following chunk of code performs a likelihood ratio test, where we remove the strain-specific differences over time. Genes with small p values from this test are those which at one or more time points after time 0 showed a strain-specific effect.

So really, my "reduced" in this case should also be reduced = ~ response + time if I want to see the genes that differ after point 0 with a response-specific effect?

And final question, is there anything wrong with simply combining factors for the design condition, e.g. "Responder_Week1 v NonResponder_Week1", if I am already interested in comparing each time-point? Aren't both methods quite similar, or does one have an advantage over the other?

ADD REPLY • link 9 months ago smac97 • 0

2

Entering edit mode

Right -- if you don't include a variable in the reduced model, you are also rejecting null hypothesis whenever that coefficient is non-zero. If you want to specifically test the interaction, you need to have that term as the only one left out in the reduced model.

The model presented in the workflow has the advantage that you can perform a LRT by leaving out the interaction terms. This wouldn't be easily done with the combined variable.

ADD REPLY • link 9 months ago Michael Love 41k

0

Entering edit mode

Awesome, I've ran the code as reduced = ~ time, because I'm more interested in the differences between the strains at each time-point.

For the results, I get that it goes e.g. results(dds, name = "responseResponder.timeW1", test = "Wald") if I'm interested in the differences between responders at week 1. Great.

But if I'm interested in the differences between time-points for either the responders or the non-responders, how do I grab that information? I reckon results(dds, contrast = list(c("responseResponder.timeW1","responseResponder.timeW20")) , test = "Wald for the responders at weeks 1 and 20, but for non-responders what are the terms?

e.g. resultsNames(dds): "Intercept" "res_status_Response_vs_No_Response" "Time_W_1_vs_W_20" "res_statusResponse.TimeW1"

Thanks for the help so far, apologies but I'm just trying to wrap my head around this - the values between these tests and the combined factors are quite different so I'm interested in what seems more significant.

ADD REPLY • link 9 months ago smac97 • 0

0

Entering edit mode

For in depth questions about statistical analysis, I usually ask that users work with someone at their institute who is familiar with linear models in R, it's just that I don't have spare time outside of software related questions.

Another thing you can do to get a handle on what the terms mean and what contrasts imply is to look at this Bioconductor package and paper:

https://f1000research.com/articles/9-512

ADD REPLY • link 9 months ago Michael Love 41k