Question

working with two factors in DESeq to analyze RNA-seq

0

Entering edit mode

Biase, Fernando ▴ 150

@biase-fernando-4475

Last seen 9.7 years ago

Dear Dr Anders, Based on my readings I am willing to use DESeq to analyze my RNA-seq data. I found it very easy to follow your instructions written on the package vignette, and to work with it for one factor comparison. However, I am not sure if I am using it correctly to account for a second factor in the model. So, here are my two questions: How do I compare two groups, accounting for a second factor in the model? (it is not a factorial model that I am looking for) Can I simply write: The condition names as "Ax", "Ax", "Ax", "Ax", "Ay", "Ay", "Ay", "Bx", "Bx", "Bx", "By", "By", "By" nbinomTest ( cds, A, B) being A/B may main factor and x/y my secondary factor? That is what I understood from the page 3 of the vignette > conds <- c( "T", "T", "T", "Tb", "N", "N" ) where \T" stands for a sample derived from a certain tumour type and \N" for a sample derived from non-pathological tissue. The rst three samples had a very similar histopathological phe- notype, while the fourth sample was atypical, and hence, we assign it another condition (\Tb"). Can I test for interaction effect between two factors? Thanks so much, Fernando Research Associate - PhD ---------------------------------------------------------------------- - University of Illinois at Urbana-Champaign Institute for Genomic Biology Principal Investigator: Prof Dr Harris Lewin 210 Edward R. Madigan Laboratory 1201 W. Gregory Drive Urbana, IL 61801 Phone: (217) 244-6745 E-mail: biase@illinois.edu ---------------------------------------------------------------------- - [[alternative HTML version deleted]]

DESeq ASSIGN DESeq ASSIGN • 1.3k views

ADD COMMENT • link updated 13.2 years ago by Gordon Smyth 50k • written 13.2 years ago by Biase, Fernando ▴ 150

score 0 · Answer 1 · 2011-02-09

Dear Fernando On 02/08/2011 06:02 PM, Biase, Fernando wrote: > Dear Dr Anders, > > Based on my readings I am willing to use DESeq to analyze my RNA-seq > data. I found it very easy to follow your instructions written on the > package vignette, and to work with it for one factor comparison. > However, I am not sure if I am using it correctly to account for a > second factor in the model. So, here are my two questions: > > How do I compare two groups, accounting for a second factor in the > model? (it is not a factorial model that I am looking for) > > Can I simply write: The condition names as "Ax", "Ax", "Ax", "Ax", > "Ay", "Ay", "Ay", "Bx", "Bx", "Bx", "By", "By", "By" > > nbinomTest ( cds, ?A?, ?B?) > > being A/B may main factor and x/y my secondary factor? [...] Not quite. You need to use the new GLM functionality that I added to DESeq in the newest release (current version: 1.2.1). First, when calling 'newCountDataSet', pass for 'condition' not a vector but a data frame with the design, i.e., something like this: design <- data.frame( treatment = c( "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B" ), block = c( "x", "x", "x", "x", "y", "y", "y", "x", "x", "x", "y", "y", "y" ) ) cds <- newCountDataSet( countTable, design ) Then, estimate the size factors and the variance functions, the later using the new "pooled" mode: cds <- estimateSizeFactors( cds) cds <- estimateVarianceFunctions( cds, method="pooled" ) Now, you can fit two models, a null model, which only account for your blocking factor, and the full model, which also takes into account your factor of interest: fit0 <- nbinomFitGLM( cds, count ~ block ) fit1 <- nbinomFitGLM( cds, count ~ block + treatment ) The fitting might take a few minutes. Once it is finished, you get your p values with a chi-squared test: pvals <- nbinomGLMTest( fit1, fit0 ) These p values inform you whether the treatment has a significant effect on the counts, after the effect of the blocking factor has been taken into account. Remember to adjust the p values for multiple testing before you look for hits: padj <- p.adjust( pvals, method="BH" ) If your count table had gene symbols in the row names, you can now list your hits: rownames(counts(cds))[ padj < .1 ] This new functionality is not yet properly documented in the vignette because I have not yet found a public data set that would be suitable to demonstrate it. Hence, as it has not been used that widely, I'd be interested to hear how well it works for you. Best regards Simon

score 0 · Answer 2 · 2011-02-10

Dear Fernando, If you're willing read yet another vignette, Section 11 of the edgeR vignette http://bioconductor.org/packages/2.7/bioc/vignettes/edgeR/inst/doc/edg eR.pdf gives a case study explaining how to do exactly what you ask, comparing two groups while accounting for a second factor. You can include any number of factors, test for interactions, and so on. Best wishes Gordon > Date: Tue, 8 Feb 2011 11:02:35 -0600 > From: "Biase, Fernando" <biase at="" illinois.edu=""> > To: "bioconductor at r-project.org" <bioconductor at="" r-project.org=""> > Subject: [BioC] working with two factors in DESeq to analyze RNA-seq > > Dear Dr Anders, > > Based on my readings I am willing to use DESeq to analyze my RNA-seq > data. I found it very easy to follow your instructions written on the > package vignette, and to work with it for one factor comparison. > However, I am not sure if I am using it correctly to account for a > second factor in the model. So, here are my two questions: > > How do I compare two groups, accounting for a second factor in the > model? (it is not a factorial model that I am looking for) > > Can I simply write: > The condition names as > "Ax", "Ax", "Ax", "Ax", "Ay", "Ay", "Ay", "Bx", "Bx", "Bx", "By", "By", "By" > > nbinomTest ( cds, "A", "B") > > being A/B may main factor and x/y my secondary factor? > > That is what I understood from the page 3 of the vignette > > "> conds <- c( "T", "T", "T", "Tb", "N", "N" ) > where \T" stands for a sample derived from a certain tumour type and \N" for a sample derived > from non-pathological tissue. The > rst three samples had a very similar histopathological phe- > notype, while the fourth sample was atypical, and hence, we assign it another condition (\Tb"). > " > > Can I test for interaction effect between two factors? > > Thanks so much, > Fernando > > Research Associate - PhD > -------------------------------------------------------------------- --- > University of Illinois at Urbana-Champaign > Institute for Genomic Biology > Principal Investigator: Prof Dr Harris Lewin > > 210 Edward R. Madigan Laboratory > 1201 W. Gregory Drive > Urbana, IL 61801 > Phone: (217) 244-6745 E-mail: biase at illinois.edu > -------------------------------------------------------------------- --- ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}