Testing main effects and interactions in edgeR without contrasts
3
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Dear friends, We are very interested in testing main effects and interactions for a 2x3 factorial RNA-seq project we have been running (we have reasonable replication: 7 biological replicates per treatment combination, i.e. 42 libraries in total). >From our reading of the edgeR Users guide and various posts on-line, it looks as though the package is set-up to test effects via contrasts. Although a contrasts-approach allows the testing of differences between specific treatment combinations, we would like to test interactions and main effects in a more traditional glm/anova format. (We note that testing whether the two contrasts associated with a three-level main effect - a, b, c - are significant is not the same question as testing whether the main effect itself is significant, or of course vice versa.) We would like outputs and tests along the lines of glm() or anova(), not "anova-like" tests that use the results of one or more contrasts to ask similar, but not exactly the same, questions of the data. We apologise in advance if we have either missed something obvious in the user guide or relevant discussion somewhere online. If edgeR does not have this functionality, we will probably use edgeR in conjunction with other packages. As with all the community, we remain in awe of the work the edgeR developers have done, and thank them in advance. Best wishes, Dave Shuker -- output of sessionInfo(): NA -- Sent via the guest posting facility at bioconductor.org.
edgeR edgeR • 3.0k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 12 hours ago
United States
Hi David, You can always use voom() (in edgeR) and then analyze the data using limma. Best, Jim On Friday, December 13, 2013 5:25:36 AM, David Shuker [guest] wrote: > > Dear friends, > > We are very interested in testing main effects and interactions for a 2x3 factorial RNA-seq project we have been running (we have reasonable replication: 7 biological replicates per treatment combination, i.e. 42 libraries in total). > > >From our reading of the edgeR Users guide and various posts on- line, it looks as though the package is set-up to test effects via contrasts. Although a contrasts-approach allows the testing of differences between specific treatment combinations, we would like to test interactions and main effects in a more traditional glm/anova format. (We note that testing whether the two contrasts associated with a three-level main effect - a, b, c - are significant is not the same question as testing whether the main effect itself is significant, or of course vice versa.) > > We would like outputs and tests along the lines of glm() or anova(), not "anova-like" tests that use the results of one or more contrasts to ask similar, but not exactly the same, questions of the data. > > We apologise in advance if we have either missed something obvious in the user guide or relevant discussion somewhere online. If edgeR does not have this functionality, we will probably use edgeR in conjunction with other packages. > > As with all the community, we remain in awe of the work the edgeR developers have done, and thank them in advance. > > Best wishes, > > Dave Shuker > > -- output of sessionInfo(): > > NA > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 8 months ago
Scripps Research, La Jolla, CA
On 12/13/2013 02:25 AM, David Shuker [guest] wrote: > We note that testing whether the two contrasts associated with a three-level main effect - a, b, c - are significant is not the same question as testing whether the main effect itself is significant Are you sure? My understanding is that these two are precisely the same thing. What is the difference? > We would like outputs and tests along the lines of glm() or anova(), not "anova-like" tests that use the results of one or more contrasts to ask similar, but not exactly the same, questions of the data. From everything I've read, the only difference between say, limma's "anova-like" test and an actual ANOVA is the use of empirical Bayes squeezed variances that share information between genes. As far as I know, limma's tests are otherwise equivalent to t-tests/ANOVA tests. Similarly, edgeR's glmLRT is the same as a classical glm likelihood ratio test in every way except for the use of squeezed dispersions. And I think you can even use un-squeezed dispersions by calling estimateDisp with prior.df=0. > > We apologise in advance if we have either missed something obvious in the user guide or relevant discussion somewhere online. If edgeR does not have this functionality, we will probably use edgeR in conjunction with other packages. > > As with all the community, we remain in awe of the work the edgeR developers have done, and thank them in advance. > > Best wishes, > > Dave Shuker > > -- output of sessionInfo(): > > NA > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia
Dear David, Here's a toy example showing how to use edgeR to test for interactions and main effects. I'll generate Poisson counts so we can compare with glm() in the stats package. Generate counts for two genes and two replicates per treatment combination: y <- matrix(rpois(24,lambda=20),2,12) A <- gl(2,3,12) B <- gl(3,1,12) Assume library sizes are all equal to 10,000. lib.size <- rep(10000,12) offset <- log(lib.size) The standard factorial analysis for the first gene would be as follows: out <- glm(y[1,] ~ A*B, family="poisson", offset=offset) anova(out, test="Chi") When I ran the above code, it gave me the following Analysis of Deviance Table: Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 11 25.538 A 1 0.7016 10 24.837 0.4023 B 2 1.4055 8 23.431 0.4952 A:B 2 8.6404 6 14.790 0.0133 * For the second gene, I got: Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 11 9.2948 A 1 2.1341 10 7.1607 0.1441 B 2 1.4481 8 5.7127 0.4848 A:B 2 1.4377 6 4.2750 0.4873 Now test for interactions using edgeR: design <- model.matrix(~A*B) fit <- glmFit(y,design,offset=offset,dispersion=0) lrt <- glmLRT(fit,coef=5:6) topTags(lrt) This gives: Coefficient: A2:B2 A2:B3 logFC.A2.B2 logFC.A2.B3 logCPM LR PValue FDR 1 -0.78611 0.59857 11.108 8.6404 0.013297 0.026594 2 0.36661 -0.22905 10.910 1.4377 0.487313 0.487313 Notice that edgeR gives the identical chisquare statistics (8.6404 and 1.4377) and the identical p-values (0.0133 and 0.4873) as does anova() for the interaction A:B. To test for main effects, you would fit the additive model: design <- model.matrix(~A+B) fit <- glmFit(y,design,offset=offset,dispersion=0) Then topTags(glmLRT(fit,coef=3:4)) Coefficient: B2 B3 logFC.B2 logFC.B3 logCPM LR PValue FDR 1 -0.20769 -0.245411 11.108 1.4055 0.49521 0.49521 2 -0.23739 0.039259 10.910 1.4481 0.48479 0.49521 gives the exactly the same statistics for B as in the ANODEV tables. Best wishes Gordon > Date: Fri, 13 Dec 2013 02:25:36 -0800 (PST) > From: "David Shuker [guest]" <guest at="" bioconductor.org=""> > To: bioconductor at r-project.org, david.shuker at st-andrews.ac.uk > Subject: [BioC] Testing main effects and interactions in edgeR without > contrasts > > Dear friends, > > We are very interested in testing main effects and interactions for a > 2x3 factorial RNA-seq project we have been running (we have reasonable > replication: 7 biological replicates per treatment combination, i.e. 42 > libraries in total). > >> From our reading of the edgeR Users guide and various posts on- line, it >> looks as though the package is set-up to test effects via contrasts. >> Although a contrasts-approach allows the testing of differences between >> specific treatment combinations, we would like to test interactions and >> main effects in a more traditional glm/anova format. (We note that >> testing whether the two contrasts associated with a three-level main >> effect - a, b, c - are significant is not the same question as testing >> whether the main effect itself is significant, or of course vice >> versa.) > > We would like outputs and tests along the lines of glm() or anova(), not > "anova-like" tests that use the results of one or more contrasts to ask > similar, but not exactly the same, questions of the data. > > We apologise in advance if we have either missed something obvious in > the user guide or relevant discussion somewhere online. If edgeR does > not have this functionality, we will probably use edgeR in conjunction > with other packages. > > As with all the community, we remain in awe of the work the edgeR > developers have done, and thank them in advance. > > Best wishes, > > Dave Shuker > > -- output of sessionInfo(): > > NA > > -- > Sent via the guest posting facility at bioconductor.org. ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT
0
Entering edit mode
Dear Gordon, Many thanks for your email, and also to Jim for helping to clarify the syntax for removing terms. Very much appreciated. Best, Dave Dear David, Here's a toy example showing how to use edgeR to test for interactions and main effects. I'll generate Poisson counts so we can compare with glm() in the stats package. Generate counts for two genes and two replicates per treatment combination: y <- matrix(rpois(24,lambda=20),2,12) A <- gl(2,3,12) B <- gl(3,1,12) Assume library sizes are all equal to 10,000. lib.size <- rep(10000,12) offset <- log(lib.size) The standard factorial analysis for the first gene would be as follows: out <- glm(y[1,] ~ A*B, family="poisson", offset=offset) anova(out, test="Chi") When I ran the above code, it gave me the following Analysis of Deviance Table: Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 11 25.538 A 1 0.7016 10 24.837 0.4023 B 2 1.4055 8 23.431 0.4952 A:B 2 8.6404 6 14.790 0.0133 * For the second gene, I got: Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 11 9.2948 A 1 2.1341 10 7.1607 0.1441 B 2 1.4481 8 5.7127 0.4848 A:B 2 1.4377 6 4.2750 0.4873 Now test for interactions using edgeR: design <- model.matrix(~A*B) fit <- glmFit(y,design,offset=offset,dispersion=0) lrt <- glmLRT(fit,coef=5:6) topTags(lrt) This gives: Coefficient: A2:B2 A2:B3 logFC.A2.B2 logFC.A2.B3 logCPM LR PValue FDR 1 -0.78611 0.59857 11.108 8.6404 0.013297 0.026594 2 0.36661 -0.22905 10.910 1.4377 0.487313 0.487313 Notice that edgeR gives the identical chisquare statistics (8.6404 and 1.4377) and the identical p-values (0.0133 and 0.4873) as does anova() for the interaction A:B. To test for main effects, you would fit the additive model: design <- model.matrix(~A+B) fit <- glmFit(y,design,offset=offset,dispersion=0) Then topTags(glmLRT(fit,coef=3:4)) Coefficient: B2 B3 logFC.B2 logFC.B3 logCPM LR PValue FDR 1 -0.20769 -0.245411 11.108 1.4055 0.49521 0.49521 2 -0.23739 0.039259 10.910 1.4481 0.48479 0.49521 gives the exactly the same statistics for B as in the ANODEV tables. Best wishes Gordon > Date: Fri, 13 Dec 2013 02:25:36 -0800 (PST) > From: "David Shuker [guest]" <guest at="" bioconductor.org=""> > To: bioconductor at r-project.org, david.shuker at st-andrews.ac.uk > Subject: [BioC] Testing main effects and interactions in edgeR without > contrasts > > Dear friends, > > We are very interested in testing main effects and interactions for a > 2x3 factorial RNA-seq project we have been running (we have reasonable > replication: 7 biological replicates per treatment combination, i.e. 42 > libraries in total). > >> From our reading of the edgeR Users guide and various posts on- line, it >> looks as though the package is set-up to test effects via contrasts. >> Although a contrasts-approach allows the testing of differences between >> specific treatment combinations, we would like to test interactions and >> main effects in a more traditional glm/anova format. (We note that >> testing whether the two contrasts associated with a three-level main >> effect - a, b, c - are significant is not the same question as testing >> whether the main effect itself is significant, or of course vice >> versa.) > > We would like outputs and tests along the lines of glm() or anova(), not > "anova-like" tests that use the results of one or more contrasts to ask > similar, but not exactly the same, questions of the data. > > We apologise in advance if we have either missed something obvious in > the user guide or relevant discussion somewhere online. If edgeR does > not have this functionality, we will probably use edgeR in conjunction > with other packages. > > As with all the community, we remain in awe of the work the edgeR > developers have done, and thank them in advance. > > Best wishes, > > Dave Shuker > > -- output of sessionInfo(): > > NA > > -- > Sent via the guest posting facility at bioconductor.org. ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:6}}
ADD REPLY

Login before adding your answer.

Traffic: 937 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6