Question: Matrix not full rank - DESeq2
gravatar for rbronste
4 weeks ago by
rbronste50 wrote:


Trying to do the following and getting an error of Model matrix not full rank:

sampleNames<-c("MMV1",    "MMV2",    "MMV3",    "FMV1",    "FMV2",    "FMV3", "MME7", "MME8", "MME9", "FME1", "FME2", "FME3")
sampleSex<-c("male", "male", "male", "female", "female", "female", "male", "male", "male", "female", "female", "female")
sampleTreatment<-c("vehicle", "vehicle", "vehicle", "vehicle", "vehicle", "vehicle", "BB", "BB", "BB", "BB", "BB", "BB")
colData<-data.frame(sampleName=sampleNames, sex=sampleSex, treatment=sampleTreatment, batch=sampleBatch)

se<-SummarizedExperiment(assays=list(counts=counts),rowRanges=rowRanges, colData=colData)

table(sampleSex, sampleTreatment)

dds<-DESeqDataSet(se, design= ~ batch + sex + treatment)

converting counts to integer mode
Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

  Please read the vignette section 'Model matrix not full rank':

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DiffBind_2.4.8                           ggplot2_2.2.1                            DESeq2_1.16.1                           
 [4] chipenrich_2.0.1                         rtracklayer_1.36.6                       TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.0
 [7]                       TxDb.Mmusculus.UCSC.mm10.ensGene_3.4.0   GenomicFeatures_1.28.5                  
[10] AnnotationDbi_1.38.2                     csaw_1.10.0                              BiocParallel_1.10.1                     
[13] SummarizedExperiment_1.6.5               DelayedArray_0.2.7                       matrixStats_0.52.2                      
[16] Biobase_2.36.2                           GenomicRanges_1.28.6                     GenomeInfoDb_1.12.3                     
[19] IRanges_2.10.5                           S4Vectors_0.14.7                         BiocGenerics_0.22.1                     
[22] BiocInstaller_1.26.1                    

loaded via a namespace (and not attached):
  [1] amap_0.8-14              TH.data_1.0-8            colorspace_1.3-2         rjson_0.2.15             hwriter_1.3.2            htmlTable_1.11.1        
  [7] XVector_0.16.0           base64enc_0.1-3          rstudioapi_0.7           MatrixModels_0.4-1       ggrepel_0.7.0            bit64_0.9-7             
 [13] mvtnorm_1.0-6         codetools_0.2-15         splines_3.4.1            chipenrich.data_2.0.0    geneplotter_1.54.0      
 [19] knitr_1.17               Formula_1.2-2            Rsamtools_1.28.0         annotate_1.54.0          cluster_2.0.6            GO.db_3.4.1             
 [25] pheatmap_1.0.8           graph_1.54.0             readr_1.1.1              compiler_3.4.1           GOstats_2.42.0           backports_1.1.2         
 [31] assertthat_0.2.0         Matrix_1.2-12            lazyeval_0.2.1           limma_3.32.10         quantreg_5.34           
 [37] acepack_1.4.1            htmltools_0.3.6          tools_3.4.1              bindrcpp_0.2             gtable_0.2.0             glue_1.2.0              
 [43] GenomeInfoDbData_0.99.0  Category_2.42.1          systemPipeR_1.10.2       dplyr_0.7.4              ShortRead_1.34.2         Rcpp_0.12.14            
 [49] Biostrings_2.44.2        nlme_3.1-131             gdata_2.18.0             stringr_1.2.0            gtools_3.5.0             XML_3.98-1.9            
 [55] polspline_1.1.12       edgeR_3.18.1             MASS_7.3-47              zoo_1.8-0                zlibbioc_1.22.0         
 [61] scales_0.5.0             hms_0.4.0                sandwich_2.4-0           RBGL_1.52.0              SparseM_1.77             RColorBrewer_1.1-2      
 [67] BBmisc_1.11              memoise_1.1.0            gridExtra_2.3            rms_5.1-1                biomaRt_2.32.1           rpart_4.1-11            
 [73] latticeExtra_0.6-28      stringi_1.1.6            RSQLite_2.0              Rhtslib_1.8.0            genefilter_1.58.1        checkmate_1.8.5         
 [79] caTools_1.17.1           rlang_0.1.6              pkgconfig_2.0.1          BatchJobs_1.7            bitops_1.0-6             lattice_0.20-35         
 [85] bindr_0.1                GenomicAlignments_1.12.2 htmlwidgets_0.9          bit_1.1-12               GSEABase_1.38.2          AnnotationForge_1.18.2  
 [91] plyr_1.8.4               magrittr_1.5             sendmailR_1.2-1          R6_2.2.2                 gplots_3.0.1             Hmisc_4.0-3             
 [97] multcomp_1.4-8           DBI_0.7                  mgcv_1.8-22              pillar_1.1.0             foreign_0.8-69           survival_2.41-3         
[103] RCurl_1.95-4.8           nnet_7.3-12              tibble_1.4.1          KernSmooth_2.23-15       locfit_1.5-9.1          
[109] grid_3.4.1               data.table_1.10.4-3      blob_1.1.0               digest_0.6.14            xtable_1.8-2             brew_1.0-6              
[115] munsell_0.4.3           

ADD COMMENTlink modified 4 weeks ago by Simon Anders3.4k • written 4 weeks ago by rbronste50

In addition this should be insightful:

> colData
   sampleName    sex treatment batch
1        MMV1   male   vehicle     1
2        MMV2   male   vehicle     2
3        MMV3   male   vehicle     3
4        FMV1 female   vehicle     1
5        FMV2 female   vehicle     2
6        FMV3 female   vehicle     3
7        MME7   male        BB     4
8        MME8   male        BB     5
9        MME9   male        BB     6
10       FME1 female        BB     4
11       FME2 female        BB     5
12       FME3 female        BB     6
ADD REPLYlink written 4 weeks ago by rbronste50
gravatar for Simon Anders
4 weeks ago by
Simon Anders3.4k
Zentrum für Molekularbiologie, Universität Heidelberg
Simon Anders3.4k wrote:

You cannot use your batch information as it is nested in treatment. Try without.

ADD COMMENTlink written 4 weeks ago by Simon Anders3.4k

So you're saying the batch information is inherently included in the design anyway right? Thats what I thought. 

ADD REPLYlink written 4 weeks ago by rbronste50

Not quite. Your batch information seems to imply that, say, sample MMV1 might be much more similar to sample FMV1 than to sample FMV2. If this is in fact the case, you'd need to account for it to maintain type-I error control. However, accounting for this requires mixed-effect models, which DESeq2 does not support. So, let's hope it works without.

ADD REPLYlink written 4 weeks ago by Simon Anders3.4k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 478 users visited in the last hour