Search
Question: DESEQ ANODEV : A time course study
0
gravatar for Michael Breen
4.0 years ago by
Michael Breen20 wrote:

Hi all,

We are running into a problem using an ANODEV with DESeq.

Let me state our data, our aim, and our dilemma.

Our data (time course with various times titled A, B and C):

A design matrix would look something like this (this is a dramatic simplification, our real data contrains over 200 participants) :

Design <- data.frame(row.names=colnames(CountTable)
, timecourse = c("A","A","A","A","A","A",
"B","B","B","B","B","B","C","C","C","C","C","C")
, pairs=c("1", "2", "3", "4", "5", "6","1", "2", "3", "4",
"5","6","1", "2",
"3", "4", "5", "6")
, condition=c("Control", "Control", "Control", "Disease", "Disease",
"Disease","Control", "Control", "Control", "Disease", "Disease",
"Disease","Control", "Control", "Control", "Disease", "Disease",
"Disease"))
countTable <- CountTable
cds <- newCountDataSet(countTable, Design)
cds <- estimateSizeFactors(cds)
sizeFactors( cds )
cds <- estimateDispersions(cds, method="pooled-CR", fitType="local")

Our aim:

What we aim to do is to test for DE of transcripts across all 3 time points for disease and controls seperatly (using DESeq ANODEV) but we want to be able to identify at which time points these transcripts are being DE. In other words, we want to compare DE transcripts with respect to specific time points between cases and controls. Our remaining code looks like this:

fit0 <- fitNbinomGLMs (cds, count ~ timecourse)
fit1 <- fitNbinomGLMs ( cds, count ~ timecourse + condition )
str(fit1)
pvals <- nbinomGLMTest( fit1, fit0 )
padj <- p.adjust( pvals, method="fdr" )
rownames(counts(cds))[ padj < .1 ]

Our delimma:

ANODEV is in fact yielding a list of significantly differentially expressed genes. However, it is not informative if we want to know from which time point this differential expression occurred. From our knowledge there is no implementation of post-hoc testing(or equivilent) after ANODEV in DESeq that states at which time point a gene is being called significant from within a timecourse study.

What insight can anyone offer us regarding DESeq towards better understanding at which time points are our DEGs are related to? In short, how can one tell at which time points (or both) is DESeq calling a transcript significantly DE?

Also, does this approach take into consideration the matching paired data?

Yours,
Michael

ADD COMMENTlink modified 2.4 years ago by Gordon Smyth30k • written 4.0 years ago by Michael Breen20
0
gravatar for Simon Anders
4.0 years ago by
Simon Anders3.3k
Zentrum für Molekularbiologie, Universität Heidelberg
Simon Anders3.3k wrote:
Hi Michael On 08/06/13 03:00, Michael Breen wrote: > What we aim to do is to test for DE of transcripts across all 3 time > points for disease and controls seperatly (using DESeq ANODEV) but we want > to be able to identify at which > time points these transcripts are being DE. In other words, we want to > compare DE transcripts with > respect to specific time points between cases and controls. Our remaining > code looks like this: > > fit0 <- fitNbinomGLMs (cds, count ~ timecourse) > fit1 <- fitNbinomGLMs ( cds, count ~ timecourse + condition ) > str(fit1) One possibility would be to subset your data to only samples from one time point and then test cases against control to see the genes that are DE at this time point, then go on to the next one. If you consider this a post-hoc test and only look at the genes which show overall sensitivity, you can probably be more lenient on the significance threshold. Maybe other people on the list have input on this point. Simon
ADD COMMENTlink written 4.0 years ago by Simon Anders3.3k
0
gravatar for Gordon Smyth
4.0 years ago by
Gordon Smyth30k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth30k wrote:

Dear Michael,

What you want to do is easy and fast using the edgeR package, without any need for ad hoc workarounds like subsetting your data. See McCarthy et all (NAR 2012):

  http://www.ncbi.nlm.nih.gov/pubmed/22287627

Your experimental design seems to fit exactly into the framework of Section 3.3 of the edgeR User's Guide.

Best wishes

Gordon

ADD COMMENTlink modified 2.4 years ago • written 4.0 years ago by Gordon Smyth30k

Cheers!

I think this is applicable to our needs.

Thank you Gentlemen!

ADD REPLYlink modified 2.4 years ago by Gordon Smyth30k • written 4.0 years ago by Michael Breen20
0
gravatar for Michael Breen
3.9 years ago by
Michael Breen350
Michael Breen350 wrote:

Gordon!

On second pass, edgeR doesn't seem to address my question.

We have disease and control data spread over 3 time points (baseline-T0, time after disease causing event-T1, elapsed time after event-T3) for each of these two conditions. I want to be able to find DEGs accordingly to condition across time-points to decipher which times (either at treatment or elapsed time after treatment) our genes are DE.

Any further insight would be fantastic.

Yours,
Michael

ADD COMMENTlink modified 2.4 years ago by Gordon Smyth30k • written 3.9 years ago by Michael Breen350

Hi Michael,

The simplest and most understandable approach I've found is to define a single factor that splits your data into all the groups. So there would be levels for disease-T0, disease-T1, disease-t3, control-T0, control-T1, and control-T3. Then you can use ~0+group as your model formula and define whatever contrasts you desire between these six groups.

-Ryan Thompson

ADD REPLYlink modified 2.4 years ago by Gordon Smyth30k • written 3.9 years ago by Ryan C. Thompson5.7k

Ryan,

If one were to split it up, what's the difference between these contrasts and a classical class comparison approach? Here you seem to be comparing gene expression differences between two classes while igorning matching data.

Any ideas?
Michael

ADD REPLYlink modified 2.4 years ago by Gordon Smyth30k • written 3.9 years ago by Michael Breen350

Hi Michael,

Further to Ryan's post, your experimental design seems to fit exactly into the framework of Section 3.3 of the edgeR User's Guide.  The approach that Ryan recommends is that of Section 3.3.1.  Section 3.3.2 describes another equivalent way to setup the comparisons you want.

Note that you have previously considered additive models like:

   ~ timecourse + condition

but this is not correct because it assumes that the time course effects are exactly the same for both conditions.  If you want to test for time effects for each condition separately, then you must use the all-combination models considered in Section 3.3 of the edgeR User's Guide.

Best wishes
Gordon

ADD REPLYlink modified 2.4 years ago • written 3.9 years ago by Gordon Smyth30k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 258 users visited in the last hour