EdgeR multi-factor testing questions
1
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 10.1 years ago
I???m currently using EdgeR to analyze RNASeq data. I have one question of whether I have chosen the correct method for analyzing my multi-factorial experiment. I have 3 factors: factor A with 16 levels (from level0 to level15), factor B with three levels (from level0 to leve2), and factor Sex with two levels (male and female). I???m interested in testing all main effects: A, B and Sex, all two-way interaction terms: A:B, A:Sex and B:Sex; and the three-way interaction term: A:B:Sex. My code: group<-paste(A,B,Sex,sep=".") design<-model.matrix(~A+B+Sex+A:B+A:Sex+B:Sex+A:B:Sex) y<-DGEList(counts=readcounts,group=group) y<-calcNormFactors(y,method="TMM") y<-estimateGLMCommonDisp(y,design) y<-estimateGLMTagwiseDisp(y,design) fit<-glmFit(y,design) colnames(fit) colnames(fit) [1] "(Intercept)" "Alevel1" "Alevel2" "Alevel3" "Alevel4" [6] "Alevel5" "Alevel6" "Alevel7" "Alevel8" "Alevel9" [11]"Alevel10" "Alevel11" "Alevel12" "Alevel13" "Alevel14" [16] "Alevel15" "Blevel1" "Blevel2" "Sexmale" "Alevel1:Blevel1" [21] "Alevel2:Blevel1" "Alevel3:Blevel1" "Alevel4:Blevel1" "Alevel5:Blevel1" "Alevel6:Blevel1" [26] "Alevel7:Blevel1" "Alevel8:Blevel1" "Alevel9:Blevel1" "Alevel10:Blevel1" "Alevel11:Blevel1" [31] "Alevel12:Blevel1" "Alevel13:Blevel1" "Alevel14:Blevel1" "Alevel15:Blevel1" "Alevel1:Blevel2" [36] "Alevel2:Blevel2" "Alevel3:Blevel2" "Alevel4:Blevel2" "Alevel5:Blevel2" "Alevel6:Blevel2" [41] "Alevel7:Blevel2" "Alevel8:Blevel2" "Alevel9:Blevel2" "Alevel10:Blevel2" "Alevel11:Blevel2" [46] "Alevel12:Blevel2" "Alevel13:Blevel2" "Alevel14:Blevel2" "Alevel15:Blevel2" "Alevel1:Sexmale" [51] "Alevel2:Sexmale" "Alevel3:Sexmale" "Alevel4:Sexmale" "Alevel5:Sexmale" "Alevel6:Sexmale" [56] "Alevel7:Sexmale" "Alevel8:Sexmale" "Alevel9:Sexmale" "Alevel10:Sexmale" "Alevel11:Sexmale" [61] "Alevel12:Sexmale" "Alevel13:Sexmale" "Alevel14:Sexmale" "Alevel15:Sexmale" "Blevel1:Sexmale" [66] "Blevel2:Sexmale" "Alevel1:Blevel1:Sexmale" "Alevel2:Blevel1:Sexmale" "Alevel3:Blevel1:Sexmale" "Alevel4:Blevel1:Sexmale" [71] "Alevel5:Blevel1:Sexmale" "Alevel6:Blevel1:Sexmale" "Alevel7:Blevel1:Sexmale" "Alevel8:Blevel1:Sexmale" "Alevel9:Blevel1:Sexmale" [76] "Alevel10:Blevel1:Sexmale" "Alevel11:Blevel1:Sexmale" "Alevel12:Blevel1:Sexmale" "Alevel13:Blevel1:Sexmale" "Alevel14:Blevel1:Sexmale" [81] "Alevel15:Blevel1:Sexmale" "Alevel1:Blevel2:Sexmale" "Alevel2:Blevel2:Sexmale" "Alevel3:Blevel2:Sexmale" "Alevel4:Blevel2:Sexmale" [86] "Alevel5:Blevel2:Sexmale" "Alevel6:Blevel2:Sexmale" "Alevel7:Blevel2:Sexmale" "Alevel8:Blevel2:Sexmale" "Alevel9:Blevel2:Sexmale" [91] "Alevel10:Blevel2:Sexmale" "Alevel11:Blevel2:Sexmale" "Alevel12:Blevel2:Sexmale" "Alevel13:Blevel2:Sexmale" "Alevel14:Blevel2:Sexmale" [96] "Alevel15:Blevel2:Sexmale" (1) To test the main effect, I use the following code: lrt_A<-glmLRT(fit,coef=c(2:16)) lrt_B<-glmLRT(fit,coef=c(17:18)) lrt_Sex<-glmLRT(fit,coef=19) (2) To test the two-way interaction terms, I use the following code: lrt_AB<-glmLRT(fit,coef=c(20:49)) lrt_ASex<-glmLRT(fit,coef=c(50:64)) lrt_BSex<-glmLRT(fit,coef=c(65:66)) (3) To test the three-way interaction terms, I use the following code: lrt_ABSex<-glmLRT(fit,coef=c(67:96)) My question: is my code correct for each of the testing above? Thank you! -- output of sessionInfo(): > sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.2.4 limma_3.16.8 loaded via a namespace (and not attached): [1] tools_3.0.1 -- Sent via the guest posting facility at bioconductor.org.
RNASeq edgeR RNASeq edgeR • 886 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia
Dear Yanzhu, Your analysis is fine from a code point of view. From a statistical point of view however your analysis is too simple because you are neglecting the principle of marginality: http://en.wikipedia.org/wiki/Principle_of_marginality For the model you have fitted, it makes sense to test for the three- way interaction as you do. However it does not make statistical sense to test for the main effects or two-interactions until you have established that the three-way interaction is non-significant. For count data, the tests for the lower-level interactions need to be computed by successively removing each level of interactions from the model. See for example: https://stat.ethz.ch/pipermail/bioconductor/2013-December/056584.html This is the same as the anova() function does in R for unbalanced linear factorial models. Furthermore, testing the two-way interations is only sensible for genes with non-signicant 3-way interactions. Similarly, testing the main effect is only sensible for genes with non-significant 2-way and 3-way interactions. Otherwise these tests have no useful scientific meaning. This is a basic drawback of the factorial anova approach. You might consider the alternative approach described in Section 3.3.1 of the edgeR User's Guide. Best wishes Gordon > Date: Mon, 6 Jan 2014 11:21:37 -0800 (PST) > From: "Yanzhu [guest]" <guest at="" bioconductor.org=""> > To: bioconductor at r-project.org, mlinyzh at gmail.com > Subject: [BioC] EdgeR multi-factor testing questions > > > I'm currently using EdgeR to analyze RNASeq data. I have one question of > whether I have chosen the correct method for analyzing my > multi-factorial experiment. > I have 3 factors: factor A with 16 levels (from level0 to level15), > factor B with three levels (from level0 to leve2), and factor Sex with > two levels (male and female). I'm interested in testing all main > effects: A, B and Sex, all two-way interaction terms: A:B, A:Sex and > B:Sex; and the three-way interaction term: A:B:Sex. > > My code: > > group<-paste(A,B,Sex,sep=".") > design<-model.matrix(~A+B+Sex+A:B+A:Sex+B:Sex+A:B:Sex) > y<-DGEList(counts=readcounts,group=group) > y<-calcNormFactors(y,method="TMM") > > > y<-estimateGLMCommonDisp(y,design) > y<-estimateGLMTagwiseDisp(y,design) > > > fit<-glmFit(y,design) > colnames(fit) > > colnames(fit) > [1] "(Intercept)" "Alevel1" "Alevel2" "Alevel3" "Alevel4" [6] "Alevel5" "Alevel6" "Alevel7" "Alevel8" "Alevel9" > [11]"Alevel10" "Alevel11" "Alevel12" "Alevel13" "Alevel14" > [16] "Alevel15" "Blevel1" "Blevel2" "Sexmale" "Alevel1:Blevel1" > > [21] "Alevel2:Blevel1" "Alevel3:Blevel1" "Alevel4:Blevel1" "Alevel5:Blevel1" "Alevel6:Blevel1" > > [26] "Alevel7:Blevel1" "Alevel8:Blevel1" "Alevel9:Blevel1" "Alevel10:Blevel1" "Alevel11:Blevel1" > > [31] "Alevel12:Blevel1" "Alevel13:Blevel1" "Alevel14:Blevel1" "Alevel15:Blevel1" "Alevel1:Blevel2" > > [36] "Alevel2:Blevel2" "Alevel3:Blevel2" "Alevel4:Blevel2" "Alevel5:Blevel2" "Alevel6:Blevel2" > > [41] "Alevel7:Blevel2" "Alevel8:Blevel2" "Alevel9:Blevel2" "Alevel10:Blevel2" "Alevel11:Blevel2" > > [46] "Alevel12:Blevel2" "Alevel13:Blevel2" "Alevel14:Blevel2" "Alevel15:Blevel2" "Alevel1:Sexmale" > > [51] "Alevel2:Sexmale" "Alevel3:Sexmale" "Alevel4:Sexmale" "Alevel5:Sexmale" "Alevel6:Sexmale" > > [56] "Alevel7:Sexmale" "Alevel8:Sexmale" "Alevel9:Sexmale" "Alevel10:Sexmale" "Alevel11:Sexmale" > > [61] "Alevel12:Sexmale" "Alevel13:Sexmale" "Alevel14:Sexmale" "Alevel15:Sexmale" "Blevel1:Sexmale" > > [66] "Blevel2:Sexmale" "Alevel1:Blevel1:Sexmale" "Alevel2:Blevel1:Sexmale" "Alevel3:Blevel1:Sexmale" "Alevel4:Blevel1:Sexmale" > > [71] "Alevel5:Blevel1:Sexmale" "Alevel6:Blevel1:Sexmale" "Alevel7:Blevel1:Sexmale" "Alevel8:Blevel1:Sexmale" "Alevel9:Blevel1:Sexmale" > > [76] "Alevel10:Blevel1:Sexmale" "Alevel11:Blevel1:Sexmale" "Alevel12:Blevel1:Sexmale" "Alevel13:Blevel1:Sexmale" "Alevel14:Blevel1:Sexmale" > > [81] "Alevel15:Blevel1:Sexmale" "Alevel1:Blevel2:Sexmale" "Alevel2:Blevel2:Sexmale" "Alevel3:Blevel2:Sexmale" "Alevel4:Blevel2:Sexmale" > > [86] "Alevel5:Blevel2:Sexmale" "Alevel6:Blevel2:Sexmale" "Alevel7:Blevel2:Sexmale" "Alevel8:Blevel2:Sexmale" "Alevel9:Blevel2:Sexmale" > > [91] "Alevel10:Blevel2:Sexmale" "Alevel11:Blevel2:Sexmale" "Alevel12:Blevel2:Sexmale" "Alevel13:Blevel2:Sexmale" "Alevel14:Blevel2:Sexmale" > [96] "Alevel15:Blevel2:Sexmale" > > (1) To test the main effect, I use the following code: > lrt_A<-glmLRT(fit,coef=c(2:16)) > lrt_B<-glmLRT(fit,coef=c(17:18)) > lrt_Sex<-glmLRT(fit,coef=19) > > (2) To test the two-way interaction terms, I use the following code: > lrt_AB<-glmLRT(fit,coef=c(20:49)) > > lrt_ASex<-glmLRT(fit,coef=c(50:64)) > > lrt_BSex<-glmLRT(fit,coef=c(65:66)) > > (3) To test the three-way interaction terms, I use the following code: > lrt_ABSex<-glmLRT(fit,coef=c(67:96)) > > My question: is my code correct for each of the testing above? > > > > Thank you! > > > > > > > > > > > -- output of sessionInfo(): > >> sessionInfo() > R version 3.0.1 (2013-05-16) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] edgeR_3.2.4 limma_3.16.8 > > loaded via a namespace (and not attached): > [1] tools_3.0.1 > > > -- > Sent via the guest posting facility at bioconductor.org. ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT

Login before adding your answer.

Traffic: 388 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6