Hi Michael,
I have a question regarding covariates in DESeq2 modelling. I have an RNAseq dataset comparing a treatment response between healthy (n = 9) and sick (n = 35) human donors and am interested in the difference in treatment response across these groups. I have included "donor" in the design to control for differences in the baseline expression across these donors and can extract the treatment effect using the design:
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = coldata,
design = ~ Donor + Treatment)
results(dds, name="Treatment_I_vs_C")
However, I was wondering if there was a way I could also correct for differences in the comorbidities/previous treatments between donors. For example, some of the sick group have been treated with steroids which is known to affect the treatment response. None of the healthy group have had this treatment. Is there a way I can specifically test for differences between healthy and sick groups while controlling for both donor-specific effects and the potential confounding effect of steroid treatment?
I tried the following design but get an error ...
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = coldata,
design = ~ Donor + Treatment + dexa)
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified
Since the steroid treatment is only present in the sick group, is it even possible to separate the effects of our treatment of interest across groups (healthy vs sick) from the effects of steroid treatment?
According to eigencor plot tool from PCAtools, there seems to be a significant correlation between steroid treatment and PC1 so probably something that should be corrected for?
Thanks in advance for any help with this!
sessionInfo( )
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Ventura 13.5.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Dublin
tzcode source: internal
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggpubr_0.6.0 gplots_3.1.3.1 VennDiagram_1.7.3 futile.logger_1.4.3 RColorBrewer_1.1-3
[6] cowplot_1.1.3 colorspace_2.1-0 cluster_2.1.6 factoextra_1.0.7 png_0.1-8
[11] magick_2.8.3 circlize_0.4.16 ComplexHeatmap_2.20.0 PCAtools_2.16.0 ggrepel_0.9.5
[16] AnnotationDbi_1.66.0 enrichplot_1.24.0 clusterProfiler_4.12.0 biomaRt_2.60.1 UpSetR_1.4.0
[21] DESeq2_1.44.0 SummarizedExperiment_1.34.0 Biobase_2.64.0 MatrixGenerics_1.16.0 matrixStats_1.3.0
[26] GenomicRanges_1.56.1 GenomeInfoDb_1.40.1 IRanges_2.38.1 S4Vectors_0.42.1 BiocGenerics_0.50.0
[31] pheatmap_1.0.12 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[36] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 tidyverse_2.0.0
[41] DEGreport_1.40.1 variancePartition_1.34.0 BiocParallel_1.38.0 limma_3.60.3 ggplot2_3.5.1
loaded via a namespace (and not attached):
[1] igraph_2.0.3 devtools_2.4.5 zlibbioc_1.50.0 tidyselect_1.2.1 bit_4.0.5
[6] doParallel_1.0.17 clue_0.3-65 lattice_0.22-6 rjson_0.2.21 blob_1.2.4
[11] urlchecker_1.0.1 S4Arrays_1.4.1 pbkrtest_0.5.3 parallel_4.4.1 cli_3.6.3
[16] ggplotify_0.1.2 shadowtext_0.1.3 curl_5.2.1 mime_0.12 evaluate_0.24.0
[21] tidytree_0.4.6 stringi_1.8.4 backports_1.5.0 lmerTest_3.1-3 httpuv_1.6.15
[26] magrittr_2.0.3 rappdirs_0.3.3 splines_4.4.1 ggraph_2.2.1 sessioninfo_1.2.2
[31] logging_0.10-108 DBI_1.2.3 withr_3.0.0 corpcor_1.6.10 xgboost_1.7.7.1
[36] tidygraph_1.3.1 formatR_1.14 BiocManager_1.30.23 htmlwidgets_1.6.4 fs_1.6.4
[41] labeling_0.4.3 fANCOVA_0.6-1 SparseArray_1.4.8 XVector_0.44.0 knitr_1.48
[46] UCSC.utils_1.0.0 RhpcBLASctl_0.23-42 timechange_0.3.0 foreach_1.5.2 fansi_1.0.6
[51] patchwork_1.2.0 caTools_1.18.2 data.table_1.15.4 ggtree_3.12.0 psych_2.4.6.26
[56] irlba_2.3.5.1 gridGraphics_0.5-1 ellipsis_0.3.2 lazyeval_0.2.2 yaml_2.3.9
[61] crayon_1.5.3 tweenr_2.0.3 later_1.3.2 codetools_0.2-20 GlobalOptions_0.1.2
[66] profvis_0.3.8 aod_1.3.3 KEGGREST_1.44.1 shape_1.4.6.1 filelock_1.0.3
[71] pkgconfig_2.0.3 xml2_1.3.6 EnvStats_2.8.1 aplot_0.2.3 ape_5.8
[76] viridisLite_0.4.2 xtable_1.8-4 car_3.1-2 plyr_1.8.9 httr_1.4.7
[81] rbibutils_2.2.16 tools_4.4.1 pkgbuild_1.4.4 broom_1.0.6 nlme_3.1-165
[86] lambda.r_1.2.4 HDO.db_0.99.1 dbplyr_2.5.0 lme4_1.1-35.5 digest_0.6.36
[91] numDeriv_2016.8-1.1 Matrix_1.7-0 farver_2.1.2 tzdb_0.4.0 remaCor_0.0.18
[96] reshape2_1.4.4 yulab.utils_0.1.4 viridis_0.6.5 glue_1.7.0 cachem_1.1.0
[101] BiocFileCache_2.12.0 polyclip_1.10-6 generics_0.1.3 Biostrings_2.72.1 mvtnorm_1.2-5
[106] ConsensusClusterPlus_1.68.0 pkgload_1.4.0 mnormt_2.1.1 statmod_1.5.0 ScaledMatrix_1.12.0
[111] carData_3.0-5 minqa_1.2.7 httr2_1.0.1 gson_0.1.0 dqrng_0.4.1
[116] utf8_1.2.4 graphlayouts_1.1.1 gtools_3.9.5 ggsignif_0.6.4 gridExtra_2.3
[121] shiny_1.8.1.1 GenomeInfoDbData_1.2.12 memoise_2.0.1 rmarkdown_2.27 scales_1.3.0
[126] reshape_0.8.9 rstudioapi_0.16.0 hms_1.1.3 munsell_0.5.1 rlang_1.1.4
[131] ggdendro_0.2.0 DelayedMatrixStats_1.26.0 sparseMatrixStats_1.16.0 ggforce_0.4.2 mgcv_1.9-1
[136] xfun_0.45 remotes_2.5.0 iterators_1.0.14 abind_1.4-5 GOSemSim_2.30.0
[141] treeio_1.28.0 futile.options_1.0.1 bitops_1.0-7 Rdpack_2.6 promises_1.3.0
[146] scatterpie_0.2.3 RSQLite_2.3.7 qvalue_2.36.0 fgsea_1.30.0 DelayedArray_0.30.1
[151] pbdZMQ_0.3-11 GO.db_3.19.1 compiler_4.4.1 prettyunits_1.2.0 boot_1.3-30
[156] beachmat_2.20.0 Rcpp_1.0.12 edgeR_4.2.0 BiocSingular_1.20.0 usethis_2.2.3
[161] MASS_7.3-61 progress_1.2.3 R6_2.5.1 fastmap_1.2.0 fastmatch_1.1-4
[166] rstatix_0.7.2 rsvd_1.0.5 gtable_0.3.5 KernSmooth_2.23-24 miniUI_0.1.1.1
[171] htmltools_0.5.8.1 bit64_4.0.5 lifecycle_1.0.4 nloptr_2.1.1 vctrs_0.6.5
[176] DOSE_3.30.1 ggfun_0.1.5 pillar_1.9.0 locfit_1.5-9.10 BiocStyle_2.32.1
[181] jsonlite_1.8.8 GetoptLong_1.0.5
>
Thanks, this nested design is explained very clearly here.
Much appreciated!