EdgeR multi-factors testing questions
3
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Hi Gordon, I have one more question about edgeR. I have used different normalization methods to normalize the data and then test the main effects, the two-way interaction terms and the three-way interaction term as we discussed before. To my surprise, the P-values results of the testings are the same for data normalized by TMM and Upper quartile. Is it possible? Best, Yanzhu --------------------------------------------------------- Dear Yanzhu, Yes, that's how I would do it. Keep the same dispersions for all fits. Best wishes Gordon -- output of sessionInfo(): > sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq_1.12.1 lattice_0.20-15 locfit_1.5-9.1 Biobase_2.20.1 BiocGenerics_0.6.0 [6] edgeR_3.2.4 limma_3.16.8 loaded via a namespace (and not attached): [1] annotate_1.38.0 AnnotationDbi_1.22.6 DBI_0.2-7 genefilter_1.42.0 [5] geneplotter_1.38.0 grid_3.0.1 IRanges_1.18.4 RColorBrewer_1.0-5 [9] RSQLite_0.11.4 splines_3.0.1 stats4_3.0.1 survival_2.37-4 [13] tools_3.0.1 XML_3.98-1.1 xtable_1.7-1 -- Sent via the guest posting facility at bioconductor.org.
edgeR edgeR • 1.1k views
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 8 months ago
Scripps Research, La Jolla, CA
I don't think it's surprising. TMM and upper quartile tend to give very similar normalization factors. Are you getting *exactly* identical P-values, or only very similar ones? Also, you can check the normalization factors themselves to see how much of a difference there is. If "d" is your DGEList object, you can get the normalization factors by typeing "d$samples$norm.factors. -Ryan On Mon 27 Jan 2014 05:16:13 AM PST, Yanzhu [guest] wrote: > > Hi Gordon, > > I have one more question about edgeR. I have used different normalization methods to normalize the data and then test the main effects, the two-way interaction terms and the three-way interaction term as we discussed before. To my surprise, the P-values results of the testings are the same for data normalized by TMM and Upper quartile. Is it possible? > > > Best, > > > Yanzhu > > > --------------------------------------------------------- > > > Dear Yanzhu, > > Yes, that's how I would do it. Keep the same dispersions for all fits. > > Best wishes > Gordon > > > > -- output of sessionInfo(): > >> sessionInfo() > R version 3.0.1 (2013-05-16) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] DESeq_1.12.1 lattice_0.20-15 locfit_1.5-9.1 Biobase_2.20.1 BiocGenerics_0.6.0 > [6] edgeR_3.2.4 limma_3.16.8 > > loaded via a namespace (and not attached): > [1] annotate_1.38.0 AnnotationDbi_1.22.6 DBI_0.2-7 genefilter_1.42.0 > [5] geneplotter_1.38.0 grid_3.0.1 IRanges_1.18.4 RColorBrewer_1.0-5 > [9] RSQLite_0.11.4 splines_3.0.1 stats4_3.0.1 survival_2.37-4 > [13] tools_3.0.1 XML_3.98-1.1 xtable_1.7-1 > > > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Hi Ryan, The P-values are exactly identical, please see the example of the testing the three-way interaction (as below (II)): (I) I have checked the normalization factors for both methods, (i) TMM: head(y$sample$norm.factors) [1] 0.9404963 0.9056276 0.9928999 0.9961922 1.0109106 0.9115558 (ii) UQ: head(y$sample$norm.factors) [1] 0.9883357 0.9625269 1.0414372 1.0143681 1.1134091 0.8992270 (II) Example: the following is how I test the three-way interaction term: (i) TMM: group<-paste(L,S,R,sep=".") design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S) y<-DGEList(counts=counts,group=group) y<-calcNormFactors(y,method="TMM") offset=log((y$sample)[,2]) y<-estimateGLMCommonDisp(y,design) y<-estimateGLMTagwiseDisp(y,design) fiteTMM_LRS<-glmFit(y,design,offset=offset ) ### testing the three-way interaction term L:R:S lrteTMM_LRS<-glmLRT(fiteTMM_LRS,coef=c(67:96)) ### P-values: head((lrteTMM_LRS$table)[,33]) [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41 (ii) UQ: ##################### UQ group<-paste(L,S,R,sep=".") design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S) y<-DGEList(counts=counts,group=group) y<-calcNormFactors(y,method="upperquartile",p=0.75) offset=log((y$sample)[,2]) y<-estimateGLMCommonDisp(y,design) y<-estimateGLMTagwiseDisp(y,design) fiteUQ_LRS<-glmFit(y,design,offset=offset ) ### testing the three-way interaction term L:R:S lrteUQ_LRS<-glmLRT(fiteUQ_LRS,coef=c(67:96)) ### P-values: head((lrteUQ_LRS$table)[,33]) [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41 The P-values from both normalization methods are exactly identical, here I only show part of it. Thanks. Yanzhu ------------------------------------------------------- I don't think it's surprising. TMM and upper quartile tend to give very similar normalization factors. Are you getting *exactly* identical P-values, or only very similar ones? Also, you can check the normalization factors themselves to see how much of a difference there is. If "d" is your DGEList object, you can get the normalization factors by typeing "d$samples$norm.factors. -Ryan -- output of sessionInfo(): sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq_1.12.1 lattice_0.20-15 locfit_1.5-9.1 Biobase_2.20.1 BiocGenerics_0.6.0 edgeR_3.2.4 limma_3.16.8 loaded via a namespace (and not attached): [1] annotate_1.38.0 AnnotationDbi_1.22.6 DBI_0.2-7 genefilter_1.42.0 geneplotter_1.38.0 grid_3.0.1 [7] IRanges_1.18.4 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.1 stats4_3.0.1 survival_2.37-4 [13] XML_3.98-1.1 xtable_1.7-1 -- Sent via the guest posting facility at bioconductor.org.
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 29 minutes ago
WEHI, Melbourne, Australia
Dear Yanzhu, The culprit is your code here: glmFit(y,design,offset=offset ) whereby you are feeding your own offset to glmFit() and hence overwriting any normalization that has been done. Why are you doing this? If you just remove the offset=offset argument and allow edgeR to work as normal, then the results will be correct. Best wishes Gordon > Date: Tue, 28 Jan 2014 06:38:26 -0800 (PST) > From: "Yanzhu [guest]" <guest at="" bioconductor.org=""> > To: bioconductor at r-project.org, mlinyzh at gmail.com > Subject: [BioC] EdgeR multi-factors testing questions > > > Hi Ryan, > > The P-values are exactly identical, please see the example of the testing the three-way interaction (as below (II)): > > (I) I have checked the normalization factors for both methods, > (i) TMM: > head(y$sample$norm.factors) > [1] 0.9404963 0.9056276 0.9928999 0.9961922 1.0109106 0.9115558 > > (ii) UQ: > head(y$sample$norm.factors) > [1] 0.9883357 0.9625269 1.0414372 1.0143681 1.1134091 0.8992270 > > (II) Example: the following is how I test the three-way interaction term: > (i) TMM: > group<-paste(L,S,R,sep=".") > design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S) > y<-DGEList(counts=counts,group=group) > y<-calcNormFactors(y,method="TMM") > offset=log((y$sample)[,2]) > y<-estimateGLMCommonDisp(y,design) > y<-estimateGLMTagwiseDisp(y,design) > > fiteTMM_LRS<-glmFit(y,design,offset=offset ) > > ### testing the three-way interaction term L:R:S > lrteTMM_LRS<-glmLRT(fiteTMM_LRS,coef=c(67:96)) > > ### P-values: > head((lrteTMM_LRS$table)[,33]) > [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41 > > (ii) UQ: > ##################### UQ > group<-paste(L,S,R,sep=".") > design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S) > y<-DGEList(counts=counts,group=group) > y<-calcNormFactors(y,method="upperquartile",p=0.75) > offset=log((y$sample)[,2]) > y<-estimateGLMCommonDisp(y,design) > y<-estimateGLMTagwiseDisp(y,design) > > fiteUQ_LRS<-glmFit(y,design,offset=offset ) > > ### testing the three-way interaction term L:R:S > lrteUQ_LRS<-glmLRT(fiteUQ_LRS,coef=c(67:96)) > > ### P-values: > head((lrteUQ_LRS$table)[,33]) > [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41 > > The P-values from both normalization methods are exactly identical, here I only show part of it. > > > Thanks. > > > Yanzhu > > ------------------------------------------------------- > I don't think it's surprising. TMM and upper quartile tend to give very > similar normalization factors. Are you getting *exactly* identical > P-values, or only very similar ones? Also, you can check the > normalization factors themselves to see how much of a difference there > is. If "d" is your DGEList object, you can get the normalization > factors by typeing "d$samples$norm.factors. > > -Ryan > > > > -- output of sessionInfo(): > > sessionInfo() > R version 3.0.1 (2013-05-16) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C LC_TIME=English_United States.1252 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] DESeq_1.12.1 lattice_0.20-15 locfit_1.5-9.1 Biobase_2.20.1 BiocGenerics_0.6.0 edgeR_3.2.4 limma_3.16.8 > > loaded via a namespace (and not attached): > [1] annotate_1.38.0 AnnotationDbi_1.22.6 DBI_0.2-7 genefilter_1.42.0 geneplotter_1.38.0 grid_3.0.1 > [7] IRanges_1.18.4 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.1 stats4_3.0.1 survival_2.37-4 > [13] XML_3.98-1.1 xtable_1.7-1 ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT

Login before adding your answer.

Traffic: 922 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6