Hello,
Ran into this with setting up a design and thought it was interesting (and I can't figure out why). We have an RNA-Seq setup that we'd like to do some basic dose-response analysis (i.e., does expression change with increasing dosage of a drug?). We have a numeric column, Dose
, that we use for this now. The thought was to change that to an ordered factor which better represents the data (or not). I created a new column as an ordered factor, DoseFactor
, for this purpose, which contains the same information, just stored as a factor instead of a numeric.
Dose DoseFactor <dbl> <ord> 1 0.1 0.1 2 0 0 3 0.03 0.03 4 0.3 0.3 5 0 0 6 0.1 0.1 7 0.1 0.1 8 0.1 0.1 9 0.03 0.03 10 0.3 0.3
If I create the dataset using the numeric column, all is fine. However, with the factor column, I get this message:
DESeqDataSetFromMatrix(countData = countMatrix.1061, colData = samplesubset.1061, design = ~ DoseFactor)
Error in DESeqDataSet(se, design = design, ignoreRank) :
design contains one or more variables with all samples having the same value,
remove these variables from the design
I would really appreciate anyone being able to shed light on this, or point me to the error in my thinking about this variable as a factor.
Thanks!
---
> sessionInfo()
R version 3.4.4 (2018-03-15) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS High Sierra 10.13.4 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] bindrcpp_0.2.2 reshape_0.8.7 geneplotter_1.56.0 lattice_0.20-35 [5] gridExtra_2.3 ggthemes_3.5.0 forcats_0.3.0 stringr_1.3.1 [9] purrr_0.2.4 readr_1.1.1 tidyr_0.8.1 tibble_1.4.2 [13] tidyverse_1.2.1 RColorBrewer_1.1-2 DT_0.4 plotly_4.7.1 [17] ggplot2_2.2.1.9000 Glimma_1.6.0 readxl_1.1.0 gplots_3.0.1 [21] fgsea_1.4.1 Rcpp_0.12.17 GSEABase_1.40.1 graph_1.56.0 [25] annotate_1.56.2 XML_3.98-1.11 AnnotationDbi_1.40.0 DESeq2_1.18.1 [29] SummarizedExperiment_1.8.1 DelayedArray_0.4.1 matrixStats_0.53.1 Biobase_2.38.0 [33] GenomicRanges_1.30.3 GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0 [37] BiocGenerics_0.24.0 pheatmap_1.0.10 magrittr_1.5 dplyr_0.7.5 [41] plyr_1.8.4 biomaRt_2.34.2 loaded via a namespace (and not attached): [1] colorspace_1.3-2 rsconnect_0.8.8 rprojroot_1.3-2 htmlTable_1.11.2 XVector_0.18.0 [6] base64enc_0.1-3 rstudioapi_0.7 bit64_0.9-7 lubridate_1.7.4 xml2_1.2.0 [11] splines_3.4.4 mnormt_1.5-5 knitr_1.20 Formula_1.2-3 jsonlite_1.5 [16] broom_0.4.4 cluster_2.0.7-1 shiny_1.1.0 compiler_3.4.4 httr_1.3.1 [21] backports_1.1.2 assertthat_0.2.0 Matrix_1.2-14 lazyeval_0.2.1 cli_1.0.0 [26] limma_3.34.9 later_0.7.2 acepack_1.4.1 htmltools_0.3.6 prettyunits_1.0.2 [31] tools_3.4.4 gtable_0.2.0 glue_1.2.0 GenomeInfoDbData_1.0.0 reshape2_1.4.3 [36] fastmatch_1.1-0 cellranger_1.1.0 gdata_2.18.0 nlme_3.1-137 crosstalk_1.0.0 [41] psych_1.8.4 rvest_0.3.2 mime_0.5 devtools_1.13.5 gtools_3.5.0 [46] edgeR_3.20.9 zlibbioc_1.24.0 scales_0.5.0 promises_1.0.1 hms_0.4.2 [51] yaml_2.1.19 memoise_1.1.0 rpart_4.1-13 latticeExtra_0.6-28 stringi_1.2.2 [56] RSQLite_2.1.1 genefilter_1.60.0 checkmate_1.8.5 caTools_1.17.1 BiocParallel_1.12.0 [61] rlang_0.2.0 pkgconfig_2.0.1 bitops_1.0-6 evaluate_0.10.1 bindr_0.1.1 [66] labeling_0.3 htmlwidgets_1.2 bit_1.1-13 tidyselect_0.2.4 R6_2.2.2 [71] Hmisc_4.1-1 DBI_1.0.0 pillar_1.2.2 haven_1.1.1 foreign_0.8-70 [76] withr_2.1.2 survival_2.42-3 RCurl_1.95-4.10 nnet_7.3-12 crayon_1.3.4 [81] modelr_0.1.2 utf8_1.1.3 KernSmooth_2.23-15 rmarkdown_1.9 progress_1.1.2 [86] locfit_1.5-9.1 grid_3.4.4 data.table_1.10.4-3 blob_1.1.1 digest_0.6.15 [91] xtable_1.8-2 httpuv_1.4.3 munsell_0.4.3 viridisLite_0.3.0
Thinking out loud, I'm wondering if it might be simpler to just use a combined factor representing multiple conditions and then do contrasts with known information (group D is higher dosage than group A, for example).