Question

DESeq2 "Not full rank" and "less than full rank" error

Najla Abassi • 0

Hi all, I would like to use DESeq2 to analyze bulk RNA-seq data and I have the following coldata:

group                           donor                id 
wt_ctrl                         m1                   wt_ctrl_m1         
wt_treated                      m1                   wt_treated_m1  
wt_ctrl                         m2                   wt_ctrl_m2       
wt_treated                      m2                   wt_treated_m2 
ko_treated                      m3                   ko_treated_m3 
ko_ctrl                         m3                   ko_ctrl_m3      
ko_ctrl                         m4                   ko_ctrl_m4     
ko_treated                      m4                   ko_treated_m4 
ko_ctrl                         m5                   ko_ctrl_m5     
wt_treated                      m6                   wt_treated_m6 
wt_ctrl                         m6                   wt_ctrl_m6       
ko_treated                      m5                   ko_treated_m5 
ko_ctrl                         m7                   ko_ctrl_m7      
ko_treated                      m7                   ko_treated_m7 
wt_treated                      m8                   wt_treated_m8  
wt_ctrl                         m8                   wt_ctrl_m8
wt_treated                      m9                   wt_treated_m9  
wt_ctrl                         m9                   wt_ctrl_m9  
ko_treated                      m10                  ko_treated_m10 
ko_ctrl                         m10                  ko_ctrl_m10

I wanted to perform a DE analysis, so I tried to create the dds object:

dds <- DESeqDataSet(gse1, design = ~donor + group)

but I got the following error:

Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

  Please read the vignette section 'Model matrix not full rank':

  vignette('DESeq2')

So I looked in the vignette for a workaround that would let me include both donor and group in my design:

dds <- DESeqDataSet(gse, design = ~ donor + donor:id + donor:group)

Then, when I ran DESeq(), I got the following error:

Error in designAndArgChecker(object, betaPrior) : 
  full model matrix is less than full rank

I am a bit confused about how to handle this design:

table(gse1$donor_animal,gse1$group)

             wt_ctrl        wt_treated        ko_ctrl       ko_treated
  m1               1                 1             0                0
  m10              0                 0             1                1
  m2               1                 1             0                0
  m3               0                 0             1                1
  m4               0                 0             1                1
  m5               0                 0             1                1
  m6               1                 1             0                0
  m7               0                 0             1                1
  m8               1                 1             0                0
  m9               1                 1             0                0

As the table shows, each donor belongs to either wt or ko only, so the two wt columns are identical to each other, and so are the two ko columns. Any ideas on how I should proceed? Thank you in advance!

sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tximeta_1.23.2              topGO_2.57.0               
 [3] SparseM_1.84-2              GO.db_3.19.1               
 [5] graph_1.83.0                clusterProfiler_4.13.3     
 [7] biomaRt_2.61.3              org.Mm.eg.db_3.19.1        
 [9] AnnotationDbi_1.67.0        DT_0.33                    
[11] GeneTonic_2.9.0             DESeq2_1.45.3              
[13] pcaExplorer_2.31.0          limma_3.61.9               
[15] ggplot2_3.5.1               knitr_1.48                 
[17] SingleCellExperiment_1.27.2 SummarizedExperiment_1.35.1
[19] Biobase_2.65.1              GenomicRanges_1.57.1       
[21] GenomeInfoDb_1.41.1         IRanges_2.39.2             
[23] S4Vectors_0.43.2            BiocGenerics_0.51.1        
[25] MatrixGenerics_1.17.0       matrixStats_1.4.1          

loaded via a namespace (and not attached):
  [1] ProtGenerics_1.37.1      fs_1.6.4                 bitops_1.0-8            
  [4] enrichplot_1.25.2        httr_1.4.7               webshot_0.5.5           
  [7] RColorBrewer_1.1-3       doParallel_1.0.17        Rgraphviz_2.49.0        
 [10] dynamicTreeCut_1.63-1    tippy_0.1.0              tools_4.4.1             
 [13] utf8_1.2.4               R6_2.5.1                 lazyeval_0.2.2          
 [16] GetoptLong_1.0.5         withr_3.0.1              prettyunits_1.2.0       
 [19] gridExtra_2.3            cli_3.6.3                TSP_1.2-4               
 [22] scatterpie_0.2.4         labeling_0.4.3           sass_0.4.9              
 [25] bs4Dash_2.3.4            genefilter_1.87.0        ggridges_0.5.6          
 [28] Rsamtools_2.21.1         yulab.utils_0.1.7        txdbmaker_1.1.1         
 [31] gson_0.1.0               DOSE_3.99.1              R.utils_2.12.3          
 [34] AnnotationForge_1.47.1   readxl_1.4.3             rstudioapi_0.16.0       
 [37] RSQLite_2.3.7            BiocIO_1.15.2            gridGraphics_0.5-1      
 [40] visNetwork_2.1.2         generics_0.1.3           GOstats_2.71.0          
 [43] shape_1.4.6.1            crosstalk_1.2.1          dplyr_1.1.4             
 [46] dendextend_1.17.1        Matrix_1.7-0             fansi_1.0.6             
 [49] abind_1.4-8              R.methodsS3_1.8.2        lifecycle_1.0.4         
 [52] yaml_2.3.10              qvalue_2.37.0            SparseArray_1.5.36      
 [55] BiocFileCache_2.13.0     grid_4.4.1               blob_1.2.4              
 [58] promises_1.3.0           crayon_1.5.3             shinydashboard_0.7.2    
 [61] miniUI_0.1.1.1           lattice_0.22-6           cowplot_1.1.3           
 [64] ComplexUpset_1.3.3       GenomicFeatures_1.57.0   annotate_1.83.0         
 [67] KEGGREST_1.45.1          pillar_1.9.0             ComplexHeatmap_2.21.0   
 [70] fgsea_1.31.0             rjson_0.2.23             codetools_0.2-20        
 [73] fastmatch_1.1-4          glue_1.7.0               ggfun_0.1.6             
 [76] data.table_1.16.0        treeio_1.29.1            vctrs_0.6.5             
 [79] png_0.1-8                cellranger_1.1.0         gtable_0.3.5            
 [82] assertthat_0.2.1         cachem_1.1.0             xfun_0.47               
 [85] S4Arrays_1.5.7           mime_0.12                tidygraph_1.3.1         
 [88] survival_3.7-0           pheatmap_1.0.12          seriation_1.5.6         
 [91] iterators_1.0.14         statmod_1.5.0            nlme_3.1-166            
 [94] Category_2.71.0          ggtree_3.13.1            bit64_4.5.2             
 [97] threejs_0.3.3            progress_1.2.3           filelock_1.0.3          
[100] bslib_0.8.0              colorspace_2.1-1         DBI_1.2.3               
[103] tidyselect_1.2.1         bit_4.5.0                compiler_4.4.1          
[106] curl_5.2.3               httr2_1.0.4              expm_1.0-0              
[109] xml2_1.3.6               DelayedArray_0.31.11     plotly_4.10.4           
[112] rtracklayer_1.65.0       shadowtext_0.1.4         colourpicker_1.3.0      
[115] scales_1.3.0             RBGL_1.81.0              NMF_0.28                
[118] rappdirs_0.3.3           stringr_1.5.1            digest_0.6.37           
[121] shinyBS_0.61.1           rmarkdown_2.28           ca_0.71.1               
[124] XVector_0.45.0           htmltools_0.5.8.1        pkgconfig_2.0.3         
[127] base64enc_0.1-3          ensembldb_2.29.1         dbplyr_2.5.0            
[130] fastmap_1.2.0            rlang_1.1.4              GlobalOptions_0.1.2     
[133] htmlwidgets_1.6.4        UCSC.utils_1.1.0         shiny_1.9.1             
[136] farver_2.1.2             jquerylib_0.1.4          jsonlite_1.8.9          
[139] BiocParallel_1.39.0      GOSemSim_2.31.2          R.oo_1.26.0             
[142] RCurl_1.98-1.16          magrittr_2.0.3           ggplotify_0.1.2         
[145] GenomeInfoDbData_1.2.12  patchwork_1.3.0          munsell_0.5.1           
[148] Rcpp_1.0.13              ape_5.8                  shinycssloaders_1.1.0   
[151] viridis_0.6.5            stringi_1.8.4            rintrojs_0.3.4          
[154] ggraph_2.2.1             zlibbioc_1.51.1          MASS_7.3-61             
[157] AnnotationHub_3.13.3     plyr_1.8.9               parallel_4.4.1          
[160] ggrepel_0.9.6            graphlayouts_1.2.0       Biostrings_2.73.1       
[163] splines_4.4.1            hms_1.1.3                circlize_0.4.16         
[166] locfit_1.5-9.10          igraph_2.0.3             rngtools_1.5.2          
[169] pkgload_1.4.0            reshape2_1.4.4           BiocVersion_3.20.0      
[172] XML_3.99-0.17            evaluate_1.0.0           BiocManager_1.30.25     
[175] foreach_1.5.2            tweenr_2.0.3             httpuv_1.6.15           
[178] backbone_2.1.4           tidyr_1.3.1              purrr_1.0.2             
[181] polyclip_1.10-7          heatmaply_1.5.0          clue_0.3-65             
[184] gridBase_0.4-7           ggforce_0.4.2            xtable_1.8-4            
[187] AnnotationFilter_1.29.0  restfulr_0.0.15          tidytree_0.4.6          
[190] later_1.3.2              viridisLite_0.4.2        tibble_3.2.1            
[193] aplot_0.2.3              GenomicAlignments_1.41.0 memoise_2.0.1           
[196] registry_0.5-1           tximport_1.33.0          cluster_2.1.6           
[199] shinyWidgets_0.8.7       shinyAce_0.4.2           GSEABase_1.67.0

updated 15 months ago by Guido Hooiveld ★ 4.1k • written 15 months ago by Najla Abassi • 0

score 0 · Answer 1 · 2024-11-12

The vignette section 'Model matrix not full rank' (https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#model-matrix-not-full-rank) suggests nesting individuals within each group. To do that, you need to restart the donor numbering at 1 within each group: you numbered the donors consecutively, but both groups (wt and ko) should each contain a donor labelled m1, m2, etc. Again, see the example in the vignette, and run that code to understand how this works (the column ind.n is added to the design, and is then used when specifying the model).
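To make the rank problem and the renumbering concrete, here is a minimal base R sketch. It rebuilds the coldata from the question by hand; `genotype` is a helper column derived from the prefix of `group` (my assumption, it is not a column in the original coldata), and `ind.n` follows the naming used in the vignette.

```r
# Coldata as posted in the question (20 samples, 10 donors, 2 samples each).
coldata <- data.frame(
  group = c("wt_ctrl", "wt_treated", "wt_ctrl", "wt_treated",
            "ko_treated", "ko_ctrl", "ko_ctrl", "ko_treated",
            "ko_ctrl", "wt_treated", "wt_ctrl", "ko_treated",
            "ko_ctrl", "ko_treated", "wt_treated", "wt_ctrl",
            "wt_treated", "wt_ctrl", "ko_treated", "ko_ctrl"),
  donor = c("m1", "m1", "m2", "m2", "m3", "m3", "m4", "m4", "m5", "m6",
            "m6", "m5", "m7", "m7", "m8", "m8", "m9", "m9", "m10", "m10")
)
coldata$group <- factor(coldata$group)
coldata$donor <- factor(coldata$donor)

# Original design: every donor is either wt or ko, so the wt/ko part of
# `group` is a linear combination of the donor columns.
mm <- model.matrix(~ donor + group, data = coldata)
ncol(mm)      # 13 coefficients requested
qr(mm)$rank   # 12 can be estimated, hence "not full rank"

# Helper column (assumption): genotype taken from the prefix of `group`.
coldata$genotype <- factor(sub("_.*", "", coldata$group),
                           levels = c("wt", "ko"))

# Renumber donors 1..5 within each genotype, as `ind.n` in the vignette.
coldata$ind.n <- factor(ave(as.integer(coldata$donor), coldata$genotype,
                            FUN = function(x) as.integer(factor(x))))

# Each original donor maps to exactly one within-genotype index.
table(coldata$donor, coldata$ind.n)
```

The `qr()` rank check is only a diagnostic; it mirrors what the error message reports, but it is not how DESeq2 itself is called.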

Also note that, following this example, you will need to split the genotype and treatment information. In other words, the design should contain (at least) the columns genotype (wt/ko), treatment (ctrl/treated) and ind.n (m1, m2, ...).
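As a rough sketch of what that could look like for this dataset: the code below splits `group` into `genotype` and `treatment`, adds `ind.n`, and checks with base R that the nested design from the vignette is full rank. The column names `genotype`, `treatment` and `ind.n` are my choice, and the commented DESeq2 lines at the end assume these columns have been added to the colData of `gse1` in the matching sample order.

```r
# Coldata as posted in the question.
coldata <- data.frame(
  group = c("wt_ctrl", "wt_treated", "wt_ctrl", "wt_treated",
            "ko_treated", "ko_ctrl", "ko_ctrl", "ko_treated",
            "ko_ctrl", "wt_treated", "wt_ctrl", "ko_treated",
            "ko_ctrl", "ko_treated", "wt_treated", "wt_ctrl",
            "wt_treated", "wt_ctrl", "ko_treated", "ko_ctrl"),
  donor = c("m1", "m1", "m2", "m2", "m3", "m3", "m4", "m4", "m5", "m6",
            "m6", "m5", "m7", "m7", "m8", "m8", "m9", "m9", "m10", "m10")
)

# Split "wt_ctrl" etc. into two factors; reference levels are wt and ctrl.
coldata$genotype  <- factor(sub("_.*", "", coldata$group),
                            levels = c("wt", "ko"))
coldata$treatment <- factor(sub(".*_", "", coldata$group),
                            levels = c("ctrl", "treated"))

# Donor index restarted at 1 within each genotype.
coldata$ind.n <- factor(ave(as.integer(factor(coldata$donor)),
                            coldata$genotype,
                            FUN = function(x) as.integer(factor(x))))

# Nested design, same structure as ~ grp + grp:ind.n + grp:cnd in the vignette.
design <- ~ genotype + genotype:ind.n + genotype:treatment
mm <- model.matrix(design, data = coldata)

# Full rank: as many estimable coefficients as columns.
qr(mm)$rank == ncol(mm)
colnames(mm)

# Not run here (needs DESeq2 and the `gse1` object from the question):
# dds <- DESeqDataSet(gse1, design = design)
# dds <- DESeq(dds)
# resultsNames(dds)  # check the exact coefficient names before using them
# results(dds, name = "genotypewt.treatmenttreated")      # treatment effect in wt
# results(dds, contrast = list("genotypeko.treatmenttreated",
#                              "genotypewt.treatmenttreated"))  # ko vs wt difference
```

The two `genotype:treatment` columns are the treated vs ctrl effects within wt and within ko, each estimated while controlling for donor, so the contrast between them tests whether the treatment effect differs by genotype.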