edgeR: tagwise dispersion in 2-factorial vs. 1-factorial design
@henning-wildhagen-5190
Hi,

I am analysing a two-factorial RNA-seq experiment with edgeR. The design of my study has two factors, genotype and treatment. Genotype has three levels (A, B, C); treatment has two levels ("control", "stress"). The first and most important question that I want to answer is which transcripts are affected by treatment in each of the three genotypes.

I did this analysis by specifying a two-factorial model and subsequently selecting coefficients/contrasts to test for the treatment effect genotype-wise. Of course, this type of analysis can also be done in a 1-factorial way, i.e. by defining three separate DGEList objects, one for each genotype, and then performing an exactTest for the treatment effect on each of them.

For one of the genotypes, say "A", the latter analysis gives approximately 60% more DE genes than the DE analysis based on the 2-factorial model. For the other two genotypes, the number of DE genes is almost the same in the two analyses.

My first guess was that this finding is related to differences in the estimation of the tagwise dispersion. In the two-factorial analysis, one and the same dispersion estimate per transcript is used to test for DE. In the 1-factorial analysis, three dispersion estimates are calculated per transcript, one for each genotype. When comparing the distributions of the genotype-wise dispersion estimates of the 1-factorial analysis with the "common" tagwise dispersion of the 2-factorial model, I see that the median is higher and the 95% percentile range is wider for genotypes B and C and for the "common" dispersion of the 2-factorial model than for genotype "A".

Now my question is: which analysis is more reliable, the 2-factorial or the 1-factorial?

Thanks for any help or comments on this problem,

Henning

------------------------------------------------------
Dr. Henning Wildhagen
Forest Research Institute Baden-Württemberg
Freiburg, Germany
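For readers who want to reproduce the comparison described in the question, the two analyses might look roughly like the sketch below. It is not the poster's code: the counts are simulated, the object names (`y`, `yA`, `con`, `lrt_A`, `etA`) are made up for illustration, and it assumes a reasonably recent edgeR in which `estimateDisp()` is available (the original 2012 analysis probably used the older dispersion functions).

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

# Simulated counts purely for illustration (NOT the poster's data):
# 3 genotypes x 2 treatments x 3 replicates = 18 libraries, 500 transcripts.
set.seed(1)
genotype  <- factor(rep(c("A", "B", "C"), each = 6))
treatment <- factor(rep(rep(c("control", "stress"), each = 3), times = 3),
                    levels = c("control", "stress"))
counts <- matrix(rnbinom(500 * 18, mu = 100, size = 10), nrow = 500,
                 dimnames = list(paste0("tx", 1:500), paste0("lib", 1:18)))

# One group per genotype/treatment combination, e.g. "A.control".
group <- factor(paste(genotype, treatment, sep = "."))

# Joint (2-factorial) analysis: one dispersion per transcript,
# estimated from all 18 libraries.
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)

# Treatment effect within each genotype, as a contrast of group means.
con <- makeContrasts(
  A = A.stress - A.control,
  B = B.stress - B.control,
  C = C.stress - C.control,
  levels = design)
lrt_A <- glmLRT(fit, contrast = con[, "A"])

# Separate (1-factorial) analysis for genotype A only: the dispersion
# is estimated from the 6 genotype-A libraries alone.
keep <- genotype == "A"
yA <- DGEList(counts = counts[, keep], group = droplevels(treatment[keep]))
yA <- calcNormFactors(yA)
yA <- estimateDisp(yA)  # classic estimation, no design matrix
etA <- exactTest(yA, pair = c("control", "stress"))

# Compare the per-transcript dispersions used by the two analyses.
summary(y$tagwise.dispersion)
summary(yA$tagwise.dispersion)
```

Comparing the two `summary()` outputs on real data is one way to check whether a genotype with lower dispersion than the pooled estimate accounts for the extra DE calls in the separate analysis.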
@gordon-smyth
WEHI, Melbourne, Australia
Dear Henning,

Whether to analyse a data set as a whole or in pieces depends on the specifics of your problem and your data, and there is no universal answer. I can tell you, however, that I almost always analyse all the data from one study together, i.e., I would most often use the 2-factorial approach. Generally it pays to pool information about the dispersion from multiple groups.

Of course you should do some exploratory analysis using an MDS plot or similar to see if there are any problem libraries for any of the three genotypes.

Best wishes
Gordon

> Date: Mon, 23 Apr 2012 14:07:55 +0200
> From: "Henning Wildhagen" <hwildhagen at gmx.de>
> To: bioconductor at r-project.org
> Subject: [BioC] edgeR: tagwise dispersion in 2-factorial vs. 1-factorial design
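The exploratory check suggested in the reply could be sketched along the following lines. This is only an illustration on simulated counts, not data from the thread; the colour and symbol choices and the object names are arbitrary assumptions.

```r
library(edgeR)

# Simulated counts purely for illustration (NOT data from this thread):
# 3 genotypes x 2 treatments x 3 replicates = 18 libraries.
set.seed(2)
genotype  <- factor(rep(c("A", "B", "C"), each = 6))
treatment <- factor(rep(rep(c("control", "stress"), each = 3), times = 3),
                    levels = c("control", "stress"))
counts <- matrix(rnbinom(500 * 18, mu = 100, size = 10), nrow = 500,
                 dimnames = list(paste0("tx", 1:500), paste0("lib", 1:18)))

y <- DGEList(counts = counts,
             group = factor(paste(genotype, treatment, sep = ".")))
y <- calcNormFactors(y)

# MDS plot of all libraries: colour marks genotype, symbol marks treatment.
# A library lying far from its replicates would be a candidate problem library.
plotMDS(y, col = as.integer(genotype),
        pch = ifelse(treatment == "stress", 17, 16))
legend("topright", legend = levels(genotype),
       col = seq_along(levels(genotype)), pch = 16, title = "genotype")
```

If one genotype's replicates cluster much more tightly than the others, that would be consistent with the lower dispersion reported for genotype "A" in the question.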