Greetings,
I have used the DESeq package previously and have been recently using
DESeq2. I am particularly interested in repeated measures designs and was
wondering about applications with DESeq2. I have read through the manual
and tried searching the archives but couldn't find too much direction for
testing over all timepoints instead of just one at a time (ANOVA-like
approach). Reading the edgeR manual, it provides an example in section
3.3.4 that tests whether a treatment has an effect at any time by taking
multiple coefficients (i.e. lrt <- glmLRT(fit, coef=5:6)). I attempted
something similar with DESeq2:
res <- results(dds, name=resultsNames(dds)[5:6])
but I got the warning message saying only the first element used:
Warning message:
In if (paste0("WaldPvalue_", name) %in% names(mcols(object))) { :
  the condition has length > 1 and only the first element will be used
Is there functionality with DESeq for looking over all timepoints or should
I stick to using edgeR for these types of experimental designs? For
context, I prefer to use multiple techniques instead of just one for
further support given that no standard exists.
Many thanks,
--
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota
hi Charles,
On Tue, Jul 9, 2013 at 3:59 PM, Charles Determan Jr <deter088 at umn.edu> wrote:
> Greetings,
>
> I have used the DESeq package previously and have been recently using
> DESeq2. I am particularly interested in repeated measures designs and was
> wondering about applications with DESeq2. I have read through the manual
> and tried searching the archives but couldn't find too much direction for
> testing over all timepoints instead of just one at a time (ANOVA-like
> approach). Reading the edgeR manual, it provides an example in section
> 3.3.4 that tests whether a treatment has an effect at any time by taking
> multiple coefficients (i.e. lrt <- glmLRT(fit, coef=5:6)). I attempted
> something similar with DESeq2:
>
> res <- results(dds, name=resultsNames(dds)[5:6])
>
> but I got the warning message saying only the first element used:
>
> Warning message:
> In if (paste0("WaldPvalue_", name) %in% names(mcols(object))) { :
>   the condition has length > 1 and only the first element will be used
>
I should clean up the code to provide a warning here, as the results()
function should only accept a character vector of length 1 for the
argument 'name'.
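
If per-coefficient Wald results are wanted in the meantime, one workaround is to call results() once per name. The following is only a sketch: it assumes a DESeq2 release that provides makeExampleDESeqDataSet(), and the toy time/treatment columns and their levels are made up for illustration rather than taken from your data.

```r
library(DESeq2)

set.seed(1)
# Hypothetical toy data set: 500 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Made-up design: 3 time points x 2 treatments, 2 replicates per cell
dds$time      <- factor(rep(c("t0", "t1", "t2"), each = 4))
dds$treatment <- factor(rep(c("ctrl", "trt"), times = 6))
design(dds)   <- ~ time + treatment + treatment:time

# Default pipeline (Wald test)
dds <- DESeq(dds)

# Assumes the interaction coefficients sit at positions 5:6, as in the
# original post; check resultsNames(dds) on your own object first
coef_names <- resultsNames(dds)[5:6]

# results() takes a single name, so call it once per coefficient
res_list <- lapply(coef_names, function(nm) results(dds, name = nm))
names(res_list) <- coef_names
```

Note that this gives one results table per coefficient rather than a single joint test across them.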
The proper way to test for the significance of multiple coefficients
at once is to use the nbinomLRT() function in DESeq2 and specify a
reduced formula. To test whether the treatment effect at all times is
different than at the baseline time, the reduced formula would remove
the interaction term between treatment and time, so:
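
# full model: main effects plus the treatment-by-time interaction
design(dds) <- formula(~ time + treatment + treatment:time)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
# the reduced model drops the interaction, so the LRT tests all
# interaction coefficients at once
dds <- nbinomLRT(dds, reduced = formula(~ time + treatment))
res <- results(dds)
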
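
The same likelihood ratio test can also be sketched end to end on simulated data. This is only an illustration under assumptions: it assumes a DESeq2 release in which DESeq() accepts test = "LRT" together with reduced (with older releases, use the explicit steps given in this message), and the toy time/treatment columns are hypothetical.

```r
library(DESeq2)

set.seed(1)
# Hypothetical toy data set: 500 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Made-up design: 3 time points x 2 treatments, 2 replicates per cell
dds$time      <- factor(rep(c("t0", "t1", "t2"), each = 4))
dds$treatment <- factor(rep(c("ctrl", "trt"), times = 6))

# Full model includes the treatment-by-time interaction
design(dds) <- ~ time + treatment + treatment:time

# One call: size factors, dispersions, then the LRT against the
# reduced model without the interaction term
dds <- DESeq(dds, test = "LRT", reduced = ~ time + treatment)
res <- results(dds)

# Genes ordered by LRT p-value
head(res[order(res$pvalue), ])
```

The p-values here refer to the full-versus-reduced comparison as a whole, while the log2 fold change column still reports a single coefficient, so it should not be read as a summary of the joint test.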
If you presume that the treatment effect is the same at all times, you
can test whether the treatment effect is equal to 0 with:

# using the Wald test and coefficient shrinkage
design(dds) <- formula(~ time + treatment)
dds <- DESeq(dds)
# by default, results() reports the last variable in the design (treatment)
res <- results(dds)

# or using the likelihood ratio test as in the previous example
design(dds) <- formula(~ time + treatment)
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
# the reduced model drops treatment, so the LRT tests the treatment effect
dds <- nbinomLRT(dds, reduced = formula(~ time))
res <- results(dds)

The main difference here between the Wald and LRT tests is that the
default DESeq() function/Wald test shrinks the estimated log2 fold
changes toward 0, whereas the LRT steps above do not.
I will add more examples to the vignette to better explain these cases
of testing multiple coefficients.
Mike