[Bioc-sig-seq] interaction factor in edgeR
@biase-fernando-4475
Dear Prof Smyth,

in

design <- model.matrix(~ a + b + a:b , data=targets)

my interest is in factor a (coef=2).

"Do you expect the effect of experimental factor b to be same for each level of a? If yes, then maybe you don't need the interaction term. It depends on your experiment and on the questions you want to ask."

I am not sure, but I guess the answer is no. The experiment consists of embryos collected at two time points (factor a), normal or cloned embryos (factor b). And on top of it, it is an unbalanced sample. I have previously tested the hypothesis of whether cloning affects the gene expression, for which I do not need the first factor (a). I am using the factor b as a block to test the hypothesis of whether the expression is different between time points (factor a).

Please, let me know if you think otherwise.

thanks for the reply,

Fernando

________________________________________
From: Gordon K Smyth [smyth@wehi.EDU.AU]
Sent: Tuesday, May 10, 2011 6:53 PM
To: Biase, Fernando
Cc: bioc-sig-sequencing at r-project.org
Subject: [Bioc-sig-seq] interaction factor in edgeR

Dear Fernando,

> Date: Tue, 10 May 2011 13:40:23 -0500
> From: "Biase, Fernando" <biase at illinois.edu>
> To: "bioc-sig-sequencing at r-project.org"
>       <bioc-sig-sequencing at r-project.org>
> Subject: [Bioc-sig-seq] interaction factor in edgeR
>
> Dear list users,
>
> I am not a statistician, so pardon my ignorance.
>
> When using edgeR package to analyse RNA-seq data the number of
> differential expressed genes vary depending on whether I use an
> interaction factor in the design. Can anyone suggest why does it happen?

Well, you fit a different model, and test a different hypothesis, so the results change. No doubt the residual dispersion has changed as well. Wouldn't you be worried if the results didn't change?

> Example:
>
> if I use:
> design <- model.matrix(~ a + b , data=targets)
>
> I have:
> summary(decideTests_eset_b_tmm)
>    [,1]
> -1  2855
> 0  12346
> 1   4928
>
> if I use:
> design <- model.matrix(~ a + b + a:b , data=targets)
>
> then:
> summary(decideTests_eset_b_tmm)
>    [,1]
> -1 3343
> 0  9490
> 1  4191

You haven't actually told us which coefficient you're testing for.

> When having more than one factor, is it more appropriate to have the
> interaction factor in the design?

Do you expect the effect of experimental factor b to be same for each level of a? If yes, then maybe you don't need the interaction term. It depends on your experiment and on the questions you want to ask.

> Thanks a lot
> Best,
>
> Fernando

BTW, I would much prefer it if you would post questions about edgeR to the main Bioconductor mailing list rather than to bioc-sig-sequencing. The questions relate more to the general problem of analysing gene expression experiments rather than to details of particular sequencing technologies.

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Tel: (03) 9345 2326, Fax (03) 9347 0852,
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth
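For readers following the exchange above, here is a minimal base-R sketch of how the two designs differ. The names `targets`, `a` and `b` follow the post, but the level names (`T1`, `T2`, `normal`, `cloned`) and the sample layout are invented for illustration and are not the poster's data. The point it tries to show is that `coef=2` refers to the same column name in both designs but not to the same quantity.

```r
# Hypothetical unbalanced 2x2 layout; sample counts are made up for illustration.
targets <- data.frame(
  a = factor(c("T1", "T1", "T1", "T2", "T2", "T1", "T1", "T2", "T2", "T2")),
  b = factor(c("normal", "normal", "normal", "normal", "normal",
               "cloned", "cloned", "cloned", "cloned", "cloned"),
             levels = c("normal", "cloned"))
)

# Additive model: column 2 ("aT2") is a time effect assumed to be
# the same in normal and cloned embryos.
design_add <- model.matrix(~ a + b, data = targets)

# Interaction model: column 2 ("aT2") is the time effect in the
# reference level of b (here "normal") only; column 4 is how much
# the time effect differs in cloned embryos.
design_int <- model.matrix(~ a + b + a:b, data = targets)

colnames(design_add)
colnames(design_int)
```

Under these assumptions, testing `coef=2` in the second design asks about time differences in normal embryos alone, which is a different hypothesis from the blocked time effect in the first design.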
@gordon-smyth
WEHI, Melbourne, Australia
Dear Biase,

Your questions are really general questions about two-way models, rather than specifically to do with edgeR. I'll try to give some general advice, but ultimately it depends on your own scientific questions.

First point, if you fit an interaction model, it doesn't usually make sense to test for a main effect (like the term 'a' in your model below), at least not unless you really know what you're doing. The interaction model implies that the time effect depends on the type of embryo, so there isn't a single unambiguous time effect to test for.

If you really want to use embryo type as a blocking variable, you need to remove the interaction. In that case, factor 'a' would be interpreted as a time effect that is consistent across the two embryo types.

If instead you want to test for separate time effects in normal and cloned embryos, then the best way to do that would be to treat your experiment as having one factor with four levels: NormalTime1, NormalTime2, ClonedTime1, ClonedTime2. Then you could test for a time effect for cloned embryos from the pairwise comparison ClonedTime2-ClonedTime1, and so on for other comparisons. Most biologists find this to be a more self-explanatory way to proceed than using the model formulas. You can easily do this with the "classic" edgeR approach, i.e., you don't need the GLM functions.

Best wishes
Gordon
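The single-factor layout described in this answer could be set up roughly as follows. This is only a sketch in base R: the data frame `targets`, its columns `time` and `embryo`, and the sample counts are hypothetical, and the contrast vectors are written by hand rather than taken from the original analysis.

```r
# Hypothetical sample sheet; columns and counts are invented for illustration.
targets <- data.frame(
  time   = c("Time1", "Time1", "Time2", "Time2", "Time1", "Time1", "Time2", "Time2"),
  embryo = c("Normal", "Normal", "Normal", "Normal",
             "Cloned", "Cloned", "Cloned", "Cloned")
)

# Combine the two factors into one factor with four levels.
group <- factor(paste0(targets$embryo, targets$time),
                levels = c("NormalTime1", "NormalTime2",
                           "ClonedTime1", "ClonedTime2"))

# One column per group and no intercept, so each coefficient is a group mean.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Time effect within each embryo type, written as contrasts of group means.
time_in_cloned <- c(NormalTime1 = 0, NormalTime2 = 0, ClonedTime1 = -1, ClonedTime2 = 1)
time_in_normal <- c(NormalTime1 = -1, NormalTime2 = 1, ClonedTime1 = 0, ClonedTime2 = 0)

# Not run here: with the classic edgeR approach, and assuming 'y' is a DGEList
# built with this 'group' and with dispersions already estimated, the cloned
# comparison would be along the lines of
#   exactTest(y, pair = c("ClonedTime1", "ClonedTime2"))
```

Each pairwise comparison then has a direct reading (Time2 versus Time1 within one embryo type), which is the self-explanatory property the answer refers to.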