Hello,
despite reading numerous tutorials and workshops on DESeq2 package (which are of a tremendous help!), I cannot seem to be able to have the appropriate design and be able to get out the contrasts of interest.
Data:
Two groups of subjects at three time points (no missing data):
- group G (5 subjects): T0 and T2 normal diet, T4 without gluten
- group noG (7 subjects): T0 normal diet, T2 and T4 without gluten
If we put it in parallel, to make it clearer
To put it in parallel:
Time4: Ta__Tb__Tc__Td
G:____Ta__Tb__Tc__x : Ta,Tb
noG:__x__Tb__Tc__Td : Tb
where in bold are time points with normal diet for two groups (problem with visualisation so I had to note it aside).
Comparisons of interest:
- (G+noG Tb) vs (G+noG Tc) i.e. normal diet vs without gluten
- Ta vs Tb only for group G to have paired samples: see the effect of normal diet
- Tc vs Td only for group noG to have paired samples: see the long-term effect of diet w/o gluten
As recommended, I wanted to do the model on overall data, and account for between-subject variabilty.
Design proposition 1:
~ Gluten+ Gluten:Subject.n + Gluten:Time (Time: T0/T2/T4) As suggested by the tutorial, model matrix was filtered from 0 rows and provided to DESeq2 function. https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions
The design should take into account that at time T2, two groups will be different and also the intersubject variability.
resultNames output:
[1] "Intercept" "GlutennoG" "GlutenG.Subject.nb" "GlutennoG.Subject.nb" "GlutenG.Subject.nc"
[6] "GlutennoG.Subject.nc" "GlutenG.Subject.nd" "GlutennoG.Subject.nd" "GlutenG.Subject.ne" "GlutennoG.Subject.ne"
[11] "GlutennoG.Subject.nf" "GlutennoG.Subject.ng" "GlutenG.TimeT2" "GlutennoG.TimeT2" "GlutenG.TimeT4"
[16] "GlutennoG.TimeT4"
Here I can answer my questions b) and c), considering comparisons within each group and between specific time points by specifing contrast as follows:
# noG: T2vsT4
res_G.T0_T2.OTU = results (dds_G.T0_T2, cooksCutoff = FALSE, alpha = alpha, contrast = list("GlutennoG.TimeT2", "GlutennoG.TimeT4"))
However, I cannot take out the effect for the non-referent level (GlutennoG is reference level, and TimeT0), trying to do as suggested for the non-referent level. ( results function)
# G: T0vsT2
res_G.T0_T2.OTU = results (dds_G.T0_T2, cooksCutoff = FALSE, alpha = alpha, contrast = list("Time_T2_vs_T0", "GlutenG.TimeT2"))
Error in checkContrast(contrast, resNames) : all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'
I have seen the post where it was suggested to use betaPrior = TRUE, but in interactions it is by default off, so this does not seem to be a solution (?). Also, with this design it is impossible to do grouped analysis (G_T2+noG_T0) vs (G_T4+noG_T2) (eq. last time point normal diet vs first time point noGluten).
Design proposition 2:
I tried to use the same logic but with four time points. ~ Gluten+ Gluten:Subject.n + Gluten:Time4 (Time4: Ta/Tb/Tc/Td)
even after renaming subjects and removing 0 rows, an error 'Model matrix not full rank'
Design proposition 3: Finally, the simplest design without take into consideration paired samples.
~ Subject + Time4
simple to obtain grouped analysis (G+noG Tb) vs (G+noG Tc) (eq. last time point normal diet vs first time point noGluten), however no possibility to have within group analysis for two group-specific time points (i.e. Ta vs Tb for G, and Tc vs Td for noG).
My questions:
- Is it possible to have an unique model to answer all three of my questions through contrasts?
- In any case, what should be the most appropriate design?
- How should I specify contrasts since the suggested method does not seem to work (assuming I have used it properly)?
Thank you in advance for your time and help!
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C
[5] LC_TIME=French_France.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] vsn_3.58.0 stromboli_0.1.0 randomcoloR_1.1.0.1 xlsx_0.6.5.9000
[5] EnhancedVolcano_1.9.13 ggrepel_0.9.1 microbiomeutilities_1.00.16 microbiomeSeq_0.1
[9] microbiome_1.12.0 pheatmap_1.0.12 DESeq2_1.30.1 SummarizedExperiment_1.20.0
[13] Biobase_2.50.0 MatrixGenerics_1.2.0 matrixStats_0.58.0 GenomicRanges_1.42.0
[17] GenomeInfoDb_1.26.1 IRanges_2.24.0 S4Vectors_0.28.0 BiocGenerics_0.36.0
[21] vegan_2.5-7 lattice_0.20-44 permute_0.9-5 ggpubr_0.4.0
[25] ggsci_2.9 ggplot2_3.3.3 dplyr_1.0.6 phyloseq_1.34.0
loaded via a namespace (and not attached):
[1] utf8_1.2.1 tidyselect_1.1.1 RSQLite_2.2.7 AnnotationDbi_1.52.0 grid_4.0.5
[6] BiocParallel_1.24.1 Rtsne_0.15 RNeXML_2.4.5 munsell_0.5.0 preprocessCore_1.52.1
[11] codetools_0.2-18 units_0.7-2 withr_2.4.2 colorspace_2.0-1 ggalt_0.4.0
[16] knitr_1.33 uuid_0.1-4 rstudioapi_0.13 ggsignif_0.6.1 rJava_1.0-4
[21] Rttf2pt1_1.3.8 labeling_0.4.2 GenomeInfoDbData_1.2.4 farver_2.1.0 bit64_4.0.5
[26] rhdf5_2.34.0 coda_0.19-4 LearnBayes_2.15.1 vctrs_0.3.8 generics_0.1.0
[31] xfun_0.23 adegenet_2.1.3 adephylo_1.1-11 R6_2.5.0 ggbeeswarm_0.6.0
[36] locfit_1.5-9.4 btools_0.0.1 bitops_1.0-7 rhdf5filters_1.2.0 cachem_1.0.5
[41] DelayedArray_0.16.0 assertthat_0.2.1 promises_1.2.0.1 scales_1.1.1 beeswarm_0.4.0
[46] gtable_0.3.0 phylobase_0.8.10 ash_1.0-15 affy_1.68.0 rlang_0.4.11
[51] genefilter_1.72.0 splines_4.0.5 extrafontdb_1.0 rstatix_0.7.0 lazyeval_0.2.2
[56] hexbin_1.28.2 broom_0.7.6 GUniFrac_1.2 BiocManager_1.30.15 yaml_2.2.1
[61] reshape2_1.4.4 abind_1.4-5 backports_1.2.1 httpuv_1.6.1 gghalves_0.1.1
[66] extrafont_0.17 tools_4.0.5 spData_0.3.8 affyio_1.60.0 ellipsis_0.3.2
[71] jquerylib_0.1.4 raster_3.4-10 biomformat_1.18.0 RColorBrewer_1.1-2 proxy_0.4-26
[76] Rcpp_1.0.6 plyr_1.8.6 progress_1.2.2 zlibbioc_1.36.0 classInt_0.4-3
[81] purrr_0.3.4 RCurl_1.98-1.3 prettyunits_1.1.1 deldir_0.2-10 cowplot_1.1.1
[86] haven_2.4.1 cluster_2.1.0 tinytex_0.32 magrittr_2.0.1 data.table_1.14.0
[91] openxlsx_4.2.3 gmodels_2.18.1 xlsxjars_0.6.1 hms_1.1.0 mime_0.10
[96] evaluate_0.14 xtable_1.8-4 XML_3.99-0.6 rio_0.5.26 jpeg_0.1-8.1
[101] readxl_1.3.1 gridExtra_2.3 compiler_4.0.5 maps_3.3.0 tibble_3.1.2
[106] V8_3.4.2 KernSmooth_2.23-20 crayon_1.4.1 htmltools_0.5.1.1 mgcv_1.8-35
[111] later_1.2.0 spdep_1.1-8 tidyr_1.1.3 geneplotter_1.68.0 expm_0.999-6
[116] DBI_1.1.1 proj4_1.0-10.1 MASS_7.3-53 sf_0.9-8 boot_1.3-28
[121] Matrix_1.2-18 ade4_1.7-16 car_3.0-10 cli_2.5.0 adegraphics_1.0-15
[126] gdata_2.18.0 igraph_1.2.6 forcats_0.5.1 pkgconfig_2.0.3 adespatial_0.3-14
[131] rncl_0.8.4 foreign_0.8-80 sp_1.4-5 xml2_1.3.2 foreach_1.5.1
[136] annotate_1.68.0 bslib_0.2.5.1 vipor_0.4.5 multtest_2.46.0 XVector_0.30.0
[141] stringr_1.4.0 digest_0.6.27 Biostrings_2.58.0 rmarkdown_2.8 cellranger_1.1.0
[146] curl_4.3.1 shiny_1.6.0 gtools_3.8.2 lifecycle_1.0.0 nlme_3.1-152
[151] jsonlite_1.7.2 Rhdf5lib_1.12.0 carData_3.0-4 seqinr_4.2-8 limma_3.46.0
[156] fansi_0.4.2 pillar_1.6.1 ggrastr_0.2.3 fastmap_1.1.0 httr_1.4.2
[161] survival_3.2-11 glue_1.4.2 zip_2.2.0 png_0.1-7 iterators_1.0.13
[166] bit_4.0.4 sass_0.4.0 class_7.3-19 stringi_1.5.3 blob_1.2.1
[171] latticeExtra_0.6-29 memoise_2.0.0 e1071_1.7-6 ape_5.5-1
Thank you for your answer Michael Love, as I understand that this inquiry might be out of scope of your package support assistance.
Unfortunately, as I am not in position to team up with somebody having more profound knowledge of linear models, I will try to find solution either through other posted examples or, with a bit of luck, if one of community members finds answer to this question.
All the best