Hi, I have been reading the previous post about nested and multi factor analysis with DESeq2 , as well as, the vignettes dealing with nested and multi factor analysis. However, I am still getting the "model matrix is not full rank" error.
Here are my experimental design and code that I have been trying to use.
The experiment was done in spruce with 6 clones and 4 treatments (CI, CW, MI, and MW). As it was spruce, we only had two trees for all but one clone. Therefore, two treatments were performed in the same individual.
I think that the individual trees should be nested. I am interested in looking at the effect of treatment and the effect of clone on gene expression.
I have tried using "ind.n" then creating a model matrix, removing zeros, and feeding the matrix to full in DEseq, but this doesn't work. Any other suggestions?
Thanks! Melissa
Here is my coldata.
> coldata
tree clone treatment ind.n
Lib19_157_C1_24_Intact 1 c157 CI 1
Lib23_157_MJ1_24_Intact 2 c157 MI 2
Lib27_150_C1_24_Intact 3 c150 CI 3
Lib31_150_MJ1_24_Intact 4 c150 MI 4
Lib35_449_C1_24_Intact 5 c449 CI 5
Lib39_449_MJ1_24_Intact 6 c449 MI 6
Lib43_471_C1_24_Intact 7 c471 CI 7
Lib47_471_MJ1_24_Intact 8 c471 MI 8
Lib51_095_C1_24_Intact 9 c95 CI 9
Lib55_095_MJ1_24_Intact 10 c95 MI 10
Lib59_137_C1_24_Intact 11 c137 CI 11
Lib63_137_MJ1_24_Intact 12 c137 MI 12
Lib67_137_C2_24_Intact 13 c137 CI 13
Lib71_137_MJ2_24_Intact 14 c137 MI 14
Lib75_137_C3_24_Intact 15 c137 CI 15
Lib79_137_MJ3_24_Intact 16 c137 MI 16
Lib20_157_C1_24_Wound 1 c157 CW 1
Lib24_157_MJ1_24_Wound 2 c157 MW 2
Lib28_150_C1_24_Wound 3 c150 CW 3
Lib32_150_MJ1_24_Wound 4 c150 MW 4
Lib36_449_C1_24_Wound 5 c449 CW 5
Lib40_449_MJ1_24_Wound 6 c449 MW 6
Lib44_471_C1_24_Wound 7 c471 CW 7
Lib48_471_MJ1_24_Wound 8 c471 MW 8
Lib52_095_C1_24_Wound 9 c95 CW 9
Lib56_095_MJ1_24_Wound 10 c95 MW 10
Lib60_137_C1_24_Wound 11 c137 CW 11
Lib64_137_MJ1_24_Wound 12 c137 MW 12
Lib68_137_C2_24_Wound 13 c137 CW 13
Lib72_137_MJ2_24_Wound 14 c137 MW 14
Lib76_137_C3_24_Wound 15 c137 CW 15
Lib80_137_MJ3_24_Wound 16 c137 MW 16
Here is the code
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = coldata,
design = ~ treatment)
m1<- model.matrix(~ clone + clone:ind.n + clone:treatment, coldata)
colnames(m1)
unname(m1)
all.zero <- apply(m1, 2, function(x) all(x==0))
all.zero
idx <- which(all.zero)
m1 <- m1[,-idx]
unname(m1)
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
DEseq <- DESeq(dds, full = m1)
sessionInfo( )
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)
Matrix products: default
locale:
[1] LC_COLLATE=Norwegian Bokmål_Norway.1252 LC_CTYPE=Norwegian Bokmål_Norway.1252 LC_MONETARY=Norwegian Bokmål_Norway.1252
[4] LC_NUMERIC=C LC_TIME=Norwegian Bokmål_Norway.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] writexl_1.2 cowplot_1.0.0 bc3net_1.0.4 lattice_0.20-35
[5] Matrix_1.2-14 infotheo_1.2.0 c3net_1.1.1 igraph_1.2.5
[9] extrafontdb_1.0 ashr_2.2-47 clValid_0.6-6 parallelDist_0.2.4
[13] RColorBrewer_1.1-2 pheatmap_1.0.12 forcats_0.5.0 stringr_1.4.0
[17] purrr_0.3.3 readr_1.3.1 tidyr_1.0.2 tibble_2.1.3
[21] ggplot2_3.3.0 tidyverse_1.3.0 NMF_0.22.0 cluster_2.0.7-1
[25] rngtools_1.5 pkgmaker_0.31.1 registry_0.5-1 dplyr_0.8.5
[29] DESeq2_1.22.2 SummarizedExperiment_1.12.0 DelayedArray_0.8.0 BiocParallel_1.16.6
[33] matrixStats_0.55.0 Biobase_2.42.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2
[37] IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0 gplots_3.0.3
[41] readxl_1.3.1 edgeR_3.24.3 limma_3.38.3
loaded via a namespace (and not attached):
[1] backports_1.1.5 Hmisc_4.4-0 plyr_1.8.6 splines_3.5.1 gridBase_0.4-7
[6] digest_0.6.25 invgamma_1.1 foreach_1.5.0 htmltools_0.4.0 SQUAREM_2020.2
[11] gdata_2.18.0 fansi_0.4.1 magrittr_1.5 checkmate_2.0.0 memoise_1.1.0
[16] doParallel_1.0.15 annotate_1.60.1 modelr_0.1.6 RcppParallel_5.0.0 colorspace_1.4-1
[21] blob_1.2.1 rvest_0.3.5 haven_2.2.0 xfun_0.12 crayon_1.3.4
[26] RCurl_1.98-1.1 jsonlite_1.6.1 genefilter_1.64.0 survival_2.42-3 iterators_1.0.12
[31] glue_1.3.2 gtable_0.3.0 zlibbioc_1.28.0 XVector_0.22.0 scales_1.1.0
[36] DBI_1.1.0 bibtex_0.4.2.2 Rcpp_1.0.4 xtable_1.8-4 htmlTable_1.13.3
[41] foreign_0.8-76 bit_1.1-15.2 Formula_1.2-3 truncnorm_1.0-8 htmlwidgets_1.5.1
[46] httr_1.4.1 acepack_1.4.1 pkgconfig_2.0.3 XML_3.99-0.3 farver_2.0.3
[51] nnet_7.3-12 dbplyr_1.4.2 locfit_1.5-9.1 utf8_1.1.4 tidyselect_1.0.0
[56] labeling_0.3 rlang_0.4.5 reshape2_1.4.3 AnnotationDbi_1.44.0 munsell_0.5.0
[61] cellranger_1.1.0 tools_3.5.1 cli_2.0.2 generics_0.0.2 RSQLite_2.2.0
[66] broom_0.5.5 yaml_2.2.1 knitr_1.28 bit64_0.9-7 fs_1.3.2
[71] caTools_1.17.1.3 nlme_3.1-137 xml2_1.2.5 compiler_3.5.1 rstudioapi_0.11
[76] reprex_0.3.0 geneplotter_1.60.0 stringi_1.4.6 vctrs_0.2.4 pillar_1.4.3
[81] lifecycle_0.2.0 BiocManager_1.30.10 data.table_1.12.8 bitops_1.0-6 irlba_2.3.3
[86] R6_2.4.1 latticeExtra_0.6-28 KernSmooth_2.23-16 gridExtra_2.3 codetools_0.2-16
[91] gtools_3.8.1 assertthat_0.2.1 withr_2.1.2 GenomeInfoDbData_1.2.0 hms_0.5.3
[96] grid_3.5.1 rpart_4.1-13 class_7.3-15 mixsqp_0.3-17 lubridate_1.7.4
[101] base64enc_0.1-3 tinytex_0.21
using supplied model matrix Error in checkFullRank(full) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
Please read the vignette section 'Model matrix not full rank':
vignette('DESeq2')
Thanks for the suggestion!