Question

DEseq2 nested multi factor no replicates

0

Entering edit mode

melissa.mageroy • 0

@1d5adbcf

Last seen 2.9 years ago

Norway

Hi, I have been reading the previous post about nested and multi factor analysis with DESeq2 , as well as, the vignettes dealing with nested and multi factor analysis. However, I am still getting the "model matrix is not full rank" error.

Here are my experimental design and code that I have been trying to use.

The experiment was done in spruce with 6 clones and 4 treatments (CI, CW, MI, and MW). As it was spruce, we only had two trees for all but one clone. Therefore, two treatments were performed in the same individual.

I think that the individual trees should be nested. I am interested in looking at the effect of treatment and the effect of clone on gene expression.

I have tried using "ind.n" then creating a model matrix, removing zeros, and feeding the matrix to full in DEseq, but this doesn't work. Any other suggestions?

Thanks! Melissa

Here is my coldata.

> coldata
                        tree clone treatment ind.n
Lib19_157_C1_24_Intact     1  c157        CI     1
Lib23_157_MJ1_24_Intact    2  c157        MI     2
Lib27_150_C1_24_Intact     3  c150        CI     3
Lib31_150_MJ1_24_Intact    4  c150        MI     4
Lib35_449_C1_24_Intact     5  c449        CI     5
Lib39_449_MJ1_24_Intact    6  c449        MI     6
Lib43_471_C1_24_Intact     7  c471        CI     7
Lib47_471_MJ1_24_Intact    8  c471        MI     8
Lib51_095_C1_24_Intact     9   c95        CI     9
Lib55_095_MJ1_24_Intact   10   c95        MI    10
Lib59_137_C1_24_Intact    11  c137        CI    11
Lib63_137_MJ1_24_Intact   12  c137        MI    12
Lib67_137_C2_24_Intact    13  c137        CI    13
Lib71_137_MJ2_24_Intact   14  c137        MI    14
Lib75_137_C3_24_Intact    15  c137        CI    15
Lib79_137_MJ3_24_Intact   16  c137        MI    16
Lib20_157_C1_24_Wound      1  c157        CW     1
Lib24_157_MJ1_24_Wound     2  c157        MW     2
Lib28_150_C1_24_Wound      3  c150        CW     3
Lib32_150_MJ1_24_Wound     4  c150        MW     4
Lib36_449_C1_24_Wound      5  c449        CW     5
Lib40_449_MJ1_24_Wound     6  c449        MW     6
Lib44_471_C1_24_Wound      7  c471        CW     7
Lib48_471_MJ1_24_Wound     8  c471        MW     8
Lib52_095_C1_24_Wound      9   c95        CW     9
Lib56_095_MJ1_24_Wound    10   c95        MW    10
Lib60_137_C1_24_Wound     11  c137        CW    11
Lib64_137_MJ1_24_Wound    12  c137        MW    12
Lib68_137_C2_24_Wound     13  c137        CW    13
Lib72_137_MJ2_24_Wound    14  c137        MW    14
Lib76_137_C3_24_Wound     15  c137        CW    15
Lib80_137_MJ3_24_Wound    16  c137        MW    16

Here is the code

dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ treatment)

m1<- model.matrix(~ clone + clone:ind.n + clone:treatment, coldata)
colnames(m1)
unname(m1)
all.zero <- apply(m1, 2, function(x) all(x==0))
all.zero
idx <- which(all.zero)
m1 <- m1[,-idx]
unname(m1)

keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]

DEseq <- DESeq(dds, full = m1)


sessionInfo( )
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=Norwegian Bokmål_Norway.1252  LC_CTYPE=Norwegian Bokmål_Norway.1252    LC_MONETARY=Norwegian Bokmål_Norway.1252
[4] LC_NUMERIC=C                             LC_TIME=Norwegian Bokmål_Norway.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] writexl_1.2                 cowplot_1.0.0               bc3net_1.0.4                lattice_0.20-35            
 [5] Matrix_1.2-14               infotheo_1.2.0              c3net_1.1.1                 igraph_1.2.5               
 [9] extrafontdb_1.0             ashr_2.2-47                 clValid_0.6-6               parallelDist_0.2.4         
[13] RColorBrewer_1.1-2          pheatmap_1.0.12             forcats_0.5.0               stringr_1.4.0              
[17] purrr_0.3.3                 readr_1.3.1                 tidyr_1.0.2                 tibble_2.1.3               
[21] ggplot2_3.3.0               tidyverse_1.3.0             NMF_0.22.0                  cluster_2.0.7-1            
[25] rngtools_1.5                pkgmaker_0.31.1             registry_0.5-1              dplyr_0.8.5                
[29] DESeq2_1.22.2               SummarizedExperiment_1.12.0 DelayedArray_0.8.0          BiocParallel_1.16.6        
[33] matrixStats_0.55.0          Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[37] IRanges_2.16.0              S4Vectors_0.20.1            BiocGenerics_0.28.0         gplots_3.0.3               
[41] readxl_1.3.1                edgeR_3.24.3                limma_3.38.3               

loaded via a namespace (and not attached):
  [1] backports_1.1.5        Hmisc_4.4-0            plyr_1.8.6             splines_3.5.1          gridBase_0.4-7        
  [6] digest_0.6.25          invgamma_1.1           foreach_1.5.0          htmltools_0.4.0        SQUAREM_2020.2        
 [11] gdata_2.18.0           fansi_0.4.1            magrittr_1.5           checkmate_2.0.0        memoise_1.1.0         
 [16] doParallel_1.0.15      annotate_1.60.1        modelr_0.1.6           RcppParallel_5.0.0     colorspace_1.4-1      
 [21] blob_1.2.1             rvest_0.3.5            haven_2.2.0            xfun_0.12              crayon_1.3.4          
 [26] RCurl_1.98-1.1         jsonlite_1.6.1         genefilter_1.64.0      survival_2.42-3        iterators_1.0.12      
 [31] glue_1.3.2             gtable_0.3.0           zlibbioc_1.28.0        XVector_0.22.0         scales_1.1.0          
 [36] DBI_1.1.0              bibtex_0.4.2.2         Rcpp_1.0.4             xtable_1.8-4           htmlTable_1.13.3      
 [41] foreign_0.8-76         bit_1.1-15.2           Formula_1.2-3          truncnorm_1.0-8        htmlwidgets_1.5.1     
 [46] httr_1.4.1             acepack_1.4.1          pkgconfig_2.0.3        XML_3.99-0.3           farver_2.0.3          
 [51] nnet_7.3-12            dbplyr_1.4.2           locfit_1.5-9.1         utf8_1.1.4             tidyselect_1.0.0      
 [56] labeling_0.3           rlang_0.4.5            reshape2_1.4.3         AnnotationDbi_1.44.0   munsell_0.5.0         
 [61] cellranger_1.1.0       tools_3.5.1            cli_2.0.2              generics_0.0.2         RSQLite_2.2.0         
 [66] broom_0.5.5            yaml_2.2.1             knitr_1.28             bit64_0.9-7            fs_1.3.2              
 [71] caTools_1.17.1.3       nlme_3.1-137           xml2_1.2.5             compiler_3.5.1         rstudioapi_0.11       
 [76] reprex_0.3.0           geneplotter_1.60.0     stringi_1.4.6          vctrs_0.2.4            pillar_1.4.3          
 [81] lifecycle_0.2.0        BiocManager_1.30.10    data.table_1.12.8      bitops_1.0-6           irlba_2.3.3           
 [86] R6_2.4.1               latticeExtra_0.6-28    KernSmooth_2.23-16     gridExtra_2.3          codetools_0.2-16      
 [91] gtools_3.8.1           assertthat_0.2.1       withr_2.1.2            GenomeInfoDbData_1.2.0 hms_0.5.3             
 [96] grid_3.5.1             rpart_4.1-13           class_7.3-15           mixsqp_0.3-17          lubridate_1.7.4       
[101] base64enc_0.1-3        tinytex_0.21

using supplied model matrix Error in checkFullRank(full) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.

Please read the vignette section 'Model matrix not full rank':

vignette('DESeq2')

DESeq2 • 541 views

ADD COMMENT • link 3.0 years ago • updated 2.9 years ago melissa.mageroy • 0

score 0 · Answer 1 · 2021-05-08

0

Entering edit mode

Michael Love 41k

@mikelove

Last seen 18 hours ago

United States

If you label "intact" and "wounded" as a variable group, you could use the strategy in the vignette, to estimate the "M" vs "C" treatment effect, and accounting for individual tree baselines. You can also compare the M vs C effect across intact and wounded (see vignette). But you can't compare across clones, because you are treating the tree as a nuisance term. I think you could possibly compare across clone using a different framework: limma's duplicateCorrelation.