Question

Equivalent of contrasts.fit & multi-contrast decideTests for edgeR?

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 9 hours ago

WEHI, Melbourne, Australia

Dear Ryan, The edgeR glmLRT() function now allows the contrast to be a matrix with multiple columns. The behavior of this argument is now analogous to the coef argument. When multiple contrasts (or multiple coefs) are specified, the LR tests are conducted for all the contrasts at once, i.e., the LRTs are on several degrees of freedom, one for each independent contrast. This extension allows very general access to anova-type tests using edgeR. I committed this change to the official release of edgeR yesterday. Best wishes Gordon --------------------------------------------- Professor Gordon K Smyth, Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia. http://www.statsci.org/smyth On Tue, 20 Nov 2012, Gordon K Smyth wrote: > Dear Ryan, > > It is true that glmLRT() does not allow you to specify multiple contrast > vectors to get an F-like test. There is no good reason for this other than > that we haven't got around to it yet. We will by the next Bioconductor > release. > > The fact that decideTestsDGE doesn't work with multiple coefficients is > unavoidable. Note that decideTests() in limma doesn't give results for > F-tests either, only for individual contrasts. The basic difference between > linear models and glms is that with linear models, the F-test and all the > individual t-tests that make it up can be conducted as one computation, but > for glms each individual 1df test and the overall F-test have to all be > separate computations. This is why we ask you to do multiple calls to > glmLRT() to test for lots of different contrasts, whereas limma can test for > any number of contrasts in one step. > > Best wishes > Gordon > > --------------- original message -------------- > > [BioC] Equivalent of contrasts.fit & multi-contrast decideTests for edgeR? > Ryan C. Thompson rct at thompsonclan.org > Tue Nov 20 02:26:37 CET 2012 > > Hi all, > > I am trying to compare limma (with voom) and edgeR for RNA-seq > differential expression analysis, and I have noticed that while edgeR's > glm functionality closely matches the functionality of limma, one > feature seems to be missing: testing of multiple contrasts. Specifically: > > 1. In glmLRT, the contrast argument only takes a single contrast, not a > matrix of contrasts (as limma's contrasts.fit would); > 2. If glmLRT is used with a coef argument containing 2 or more coefs, > then decideTestsDGE cannot handle the resulting object. > > To illustrate what I mean with an example, consider the following > experimental design with 3 replicates each of 3 timepoints, where I fit > the same data with two equivalent design matrices, one with an intercept > term and one without: > > library(edgeR) > dge <- DGEList(...) # Imagine data here > sampledata <- data.frame(timepoint=, > ) > timepoint <- rep(factor(c("T0", "T1", "T2", "T3")), each=3) > design <- model.matrix(~timepoint) > design.noint <- model.matrix(~0+timepoint) > fit <- glmFit(dge, design) > fit.noint <- glmFit(dge, design.noint) > ## Test for changes in any timepoint > lrt.any.changes <- glmLRT(fit, coef=c(2,3,4)) > ## How can this test be performed on fit.noint? > lrt.any.changes <- glmLRT(fit.noint, ???) > ## This throws an error because the DGELRT has multiple columns > ## "logFC.*" instead of just a single "logFC" that the function > ## expects. > decideTestsDGE(lrt.any.changes) > > > By contrast, with limma I can always do the test that I want regardless > of how I choose to parametrize my design matrix (intercept or not): > > library(limma) > ## Equivalent procedure in limma (I think) > lfit <- lmFit(voom(dge, design)) > lfit.noint <- lmFit(voom(dge, design.noint)) > ## Test for changes in any timepoint (result is in $F.p.value) > lfit <- eBayes(lfit) > ## Same test on the version with no intercept term > contrasts.anychange.noint <- makeContrasts(timepointT1-timepointT0, > timepointT2-timepointT0, > timepointT3-timepointT0, > levels=design.noint) > lfit.noint <- eBayes(contrasts.fit(lfit.noint)) > ## Should give identical results? > decideTests(lfit) > decideTests(lfit.noint) > > Basically, I far as I can tell, with edgeR you can test the null > hypothesis of multiple model coefficients being zero, but not multiple > contrasts, despite the fact that both procedures should be statistically > equivalent. Is edgeR missing this functionality or am I missing the > proper way to do it? Not having this functionality makes things a little > confusing, because depending on which one of several equivalent > parametrizations I choose, different tests are available or not > available, as illustrated by the code above, in which I can only test > the hypothesis of "any change between any time points" if I include an > intercept term. If I'm missing something, can someone pleas enlighten > me? If edgeR really is missing this functionality, is it planned for the > future or is there some fundamental difference between lms and glms that > makes it impossible? > > Thanks, > > -Ryan Thompson > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}

limma edgeR limma edgeR • 1.4k views

ADD COMMENT • link updated 11.4 years ago by Ryan C. Thompson ★ 7.9k • written 11.4 years ago by Gordon Smyth 50k

score 0 · Answer 1 · 2012-12-09

Dear Gordon, That is great to hear. I've just updated, and I see the new changes are in place. I'll be testing them out this week. Thanks for adding this. -Ryan On Sat 08 Dec 2012 04:15:32 PM PST, Gordon K Smyth wrote: > Dear Ryan, > > The edgeR glmLRT() function now allows the contrast to be a matrix > with multiple columns. The behavior of this argument is now analogous > to the coef argument. When multiple contrasts (or multiple coefs) are > specified, the LR tests are conducted for all the contrasts at once, > i.e., the LRTs are on several degrees of freedom, one for each > independent contrast. > > This extension allows very general access to anova-type tests using > edgeR. > > I committed this change to the official release of edgeR yesterday. > > Best wishes > Gordon > > --------------------------------------------- > Professor Gordon K Smyth, > Bioinformatics Division, > Walter and Eliza Hall Institute of Medical Research, > 1G Royal Parade, Parkville, Vic 3052, Australia. > http://www.statsci.org/smyth > > On Tue, 20 Nov 2012, Gordon K Smyth wrote: > >> Dear Ryan, >> >> It is true that glmLRT() does not allow you to specify multiple >> contrast vectors to get an F-like test. There is no good reason for >> this other than that we haven't got around to it yet. We will by the >> next Bioconductor release. >> >> The fact that decideTestsDGE doesn't work with multiple coefficients >> is unavoidable. Note that decideTests() in limma doesn't give >> results for F-tests either, only for individual contrasts. The basic >> difference between linear models and glms is that with linear models, >> the F-test and all the individual t-tests that make it up can be >> conducted as one computation, but for glms each individual 1df test >> and the overall F-test have to all be separate computations. This is >> why we ask you to do multiple calls to glmLRT() to test for lots of >> different contrasts, whereas limma can test for any number of >> contrasts in one step. >> >> Best wishes >> Gordon >> >> --------------- original message -------------- >> >> [BioC] Equivalent of contrasts.fit & multi-contrast decideTests for >> edgeR? >> Ryan C. Thompson rct at thompsonclan.org >> Tue Nov 20 02:26:37 CET 2012 >> >> Hi all, >> >> I am trying to compare limma (with voom) and edgeR for RNA-seq >> differential expression analysis, and I have noticed that while edgeR's >> glm functionality closely matches the functionality of limma, one >> feature seems to be missing: testing of multiple contrasts. >> Specifically: >> >> 1. In glmLRT, the contrast argument only takes a single contrast, not a >> matrix of contrasts (as limma's contrasts.fit would); >> 2. If glmLRT is used with a coef argument containing 2 or more coefs, >> then decideTestsDGE cannot handle the resulting object. >> >> To illustrate what I mean with an example, consider the following >> experimental design with 3 replicates each of 3 timepoints, where I fit >> the same data with two equivalent design matrices, one with an intercept >> term and one without: >> >> library(edgeR) >> dge <- DGEList(...) # Imagine data here >> sampledata <- data.frame(timepoint=, >> ) >> timepoint <- rep(factor(c("T0", "T1", "T2", "T3")), each=3) >> design <- model.matrix(~timepoint) >> design.noint <- model.matrix(~0+timepoint) >> fit <- glmFit(dge, design) >> fit.noint <- glmFit(dge, design.noint) >> ## Test for changes in any timepoint >> lrt.any.changes <- glmLRT(fit, coef=c(2,3,4)) >> ## How can this test be performed on fit.noint? >> lrt.any.changes <- glmLRT(fit.noint, ???) >> ## This throws an error because the DGELRT has multiple columns >> ## "logFC.*" instead of just a single "logFC" that the function >> ## expects. >> decideTestsDGE(lrt.any.changes) >> >> >> By contrast, with limma I can always do the test that I want regardless >> of how I choose to parametrize my design matrix (intercept or not): >> >> library(limma) >> ## Equivalent procedure in limma (I think) >> lfit <- lmFit(voom(dge, design)) >> lfit.noint <- lmFit(voom(dge, design.noint)) >> ## Test for changes in any timepoint (result is in $F.p.value) >> lfit <- eBayes(lfit) >> ## Same test on the version with no intercept term >> contrasts.anychange.noint <- makeContrasts(timepointT1-timepointT0, >> timepointT2-timepointT0, >> timepointT3-timepointT0, >> levels=design.noint) >> lfit.noint <- eBayes(contrasts.fit(lfit.noint)) >> ## Should give identical results? >> decideTests(lfit) >> decideTests(lfit.noint) >> >> Basically, I far as I can tell, with edgeR you can test the null >> hypothesis of multiple model coefficients being zero, but not multiple >> contrasts, despite the fact that both procedures should be statistically >> equivalent. Is edgeR missing this functionality or am I missing the >> proper way to do it? Not having this functionality makes things a little >> confusing, because depending on which one of several equivalent >> parametrizations I choose, different tests are available or not >> available, as illustrated by the code above, in which I can only test >> the hypothesis of "any change between any time points" if I include an >> intercept term. If I'm missing something, can someone pleas enlighten >> me? If edgeR really is missing this functionality, is it planned for the >> future or is there some fundamental difference between lms and glms that >> makes it impossible? >> >> Thanks, >> >> -Ryan Thompson >> > > ______________________________________________________________________ > The information in this email is confidential and inte...{{dropped:6}}