Question: DESeq2 multifactorial formula
4.0 years ago by
Ugo Borello340
France
Ugo Borello340 wrote:
Good morning,

I am trying to run DESeq2 with this design formula:

    design <- (~batch+sex+tissue)

This is what I do from a count matrix:

    > library(DESeq2)
    > countTable <- read.table('matrix.txt', header=TRUE, row.names=1)
    > ConDesign <- data.frame(row.names = colnames(countTable),
                              batch = factor(c("1", "2", "3", "1", "2", "3")),
                              sex = factor(c("F", "F", "M", "F", "F", "M")),
                              tissue = factor(c("Cx", "Cx", "Cx",
                                                "BrS", "BrS", "BrS")))
    > ConDesign
          batch sex tissue
    Cx_1      1   F     Cx
    Cx_2      2   F     Cx
    Cx_3      3   M     Cx
    BrS_1     1   F    BrS
    BrS_2     2   F    BrS
    BrS_3     3   M    BrS

When I run

    > xx <- model.matrix(~batch+sex+tissue, ConDesign)

I get

    > xx
          (Intercept) batch2 batch3 sexM tissueBrS
    Cx_1            1      0      0    0         0
    Cx_2            1      1      0    0         0
    Cx_3            1      0      1    1         0
    BrS_1           1      0      0    0         1
    BrS_2           1      1      0    0         1
    BrS_3           1      0      1    1         1

But when I run:

    > dse <- DESeqDataSetFromMatrix(countData = countTable,
                                    colData = ConDesign,
                                    design = (~batch+sex+tissue))

I get this error message:

    Error in validObject(.Object) :
      invalid class "DESeqDataSet" object: the model matrix is not full rank,
      i.e. one or more variables in the design formula are linear combinations of
      the others

Where is my mistake?

Thank you for your help,
Ugo

    > sessionInfo()
    R version 3.0.1 (2013-05-16)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] parallel stats graphics grDevices utils datasets methods base

    other attached packages:
     [1] org.Hs.eg.db_2.9.0      RSQLite_0.11.4        DBI_0.2-7       AnnotationDbi_1.22.6
     [5] DESeq2_1.0.19           RcppArmadillo_0.3.910.0 Rcpp_0.10.4   lattice_0.20-23
     [9] Biobase_2.20.1          GenomicRanges_1.12.5  IRanges_1.18.3  BiocGenerics_0.6.0

    loaded via a namespace (and not attached):
     [1] annotate_1.38.0 genefilter_1.42.0 grid_3.0.1      locfit_1.5-9.1 RColorBrewer_1.0-5
     [6] splines_3.0.1   stats4_3.0.1      survival_2.37-4 tools_3.0.1    XML_3.95-0.2
    [11] xtable_1.7-1
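One way to see what the error message is reporting is to compare the rank of the model matrix with its number of columns. The following is only a minimal base R sketch, not part of the original post: the sample names are typed in by hand (an assumption, since matrix.txt is not available here), and `rank_xx` is a name introduced for illustration.

```r
# Sample table from the question. Row names are written out by hand
# (assumption) because the original count file is not available.
ConDesign <- data.frame(
  row.names = c("Cx_1", "Cx_2", "Cx_3", "BrS_1", "BrS_2", "BrS_3"),
  batch  = factor(c("1", "2", "3", "1", "2", "3")),
  sex    = factor(c("F", "F", "M", "F", "F", "M")),
  tissue = factor(c("Cx", "Cx", "Cx", "BrS", "BrS", "BrS"))
)

# Same model matrix as in the question.
xx <- model.matrix(~ batch + sex + tissue, ConDesign)

# A design is full rank only if its rank equals its number of columns.
rank_xx <- qr(xx)$rank
ncol(xx)            # number of coefficients the model tries to estimate
rank_xx             # number of linearly independent columns
rank_xx < ncol(xx)  # TRUE means the design is rank deficient
```

If the last line returns TRUE, at least one column can be written as a combination of the others, which is the condition the DESeq2 validity check complains about.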
4.0 years ago by
Michael Love14k
United States
Michael Love14k wrote:
hi Ugo,

The problem with the experimental design is that all the males are in batch 3 (and batch 3 contains only males), so you can't separate the effect of these two. There are many ways to name this problem: linearly dependent covariates, rank deficient design matrix, etc.

My suggestion would be to remove the sex variable. This would then control for differences due to batch (and in the case of batch 3, absorbing the male effect). You cannot test for significance of the male effect anyway, because you wouldn't be able to tell apart the male effect from the batch 3 effect.

hope this helps,

Mike
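The confounding described above can be checked directly on the design from the question. This is a hedged sketch in base R rather than anything posted in the thread: the sample names are typed in by hand (an assumption), and `xx_full`, `xx_reduced` and `same_column` are illustrative names.

```r
# Sample table from the question, with row names written out by hand (assumption).
ConDesign <- data.frame(
  row.names = c("Cx_1", "Cx_2", "Cx_3", "BrS_1", "BrS_2", "BrS_3"),
  batch  = factor(c("1", "2", "3", "1", "2", "3")),
  sex    = factor(c("F", "F", "M", "F", "F", "M")),
  tissue = factor(c("Cx", "Cx", "Cx", "BrS", "BrS", "BrS"))
)

# Full design: the batch3 and sexM indicator columns mark the same samples.
xx_full <- model.matrix(~ batch + sex + tissue, ConDesign)
same_column <- all(xx_full[, "batch3"] == xx_full[, "sexM"])
same_column

# Reduced design without sex: every column is now linearly independent.
xx_reduced <- model.matrix(~ batch + tissue, ConDesign)
qr(xx_reduced)$rank == ncol(xx_reduced)
```

With the reduced formula, the call from the question would presumably become `DESeqDataSetFromMatrix(countData = countTable, colData = ConDesign, design = ~ batch + tissue)`, and the batch 3 coefficient then carries both the batch and the male effect.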
Thank you,
Ugo
written 4.0 years ago by Ugo Borello