I am having trouble understanding why my resultsNames() output is missing comparisons that I expected to be there, and whether I should relevel both factors in my analysis.
I have two factors: (1) Year of sample and (2) Timepoint of collection. I am conducting an LRT.
What I want to know is: shouldn't there be a "Year_H2019_vs_H1999", "Timepoint_T24_vs_T48", "YearH1999.TimepointT0", "YearH2019.TimepointT0", "YearH1995.TimepointT0", "YearH1995.TimepointT24", and a "YearH1995.TimepointT48" present in resultsNames(dds)?
My second question is: I want to have comparisons based on timepoint 0, but do I also need to relevel on the "Year" factor?
Thanks for any guidance on these problems.
#Here is my metadata
metadata
Year Timepoint Group
gm10 H1999 T24 1999_T24
gm11 H1995 T24 1995_T24
gm12 H1995 T24 1995_T24
gm13 H1999 T24 1999_T24
gm14 H1999 T48 1999_T48
gm15 H2019 T48 2019_T48
gm16 H1999 T48 1999_T48
gm17 H2019 T24 2019_T24
gm18 H1995 T24 1995_T24
gm19 H1999 T0 1999_T0
gm1 H2019 T24 2019_T24
gm20 H1995 T0 1995_T0
gm21 H1999 T0 1999_T0
gm22 H1999 T0 1999_T0
gm23 H1995 T0 1995_T0
gm24 H1999 T48 1999_T48
gm25 H2019 T48 2019_T48
gm26 H1999 T48 1999_T48
gm27 H1995 T48 1995_T48
gm28 H2019 T24 2019_T24
gm29 H1999 T0 1999_T0
gm2 H2019 T0 2019_T0
gm30 H1999 T24 1999_T24
gm31 H1995 T0 1995_T0
gm32 H2019 T0 2019_T0
gm33 H1999 T24 1999_T24
gm34 H2019 T48 2019_T48
gm35 H1995 T24 1995_T24
gm36 H2019 T24 2019_T24
gm3 H1995 T0 1995_T0
gm4 H1995 T48 1995_T48
gm5 H2019 T48 2019_T48
gm6 H1995 T48 1995_T48
gm7 H2019 T0 2019_T0
gm8 H1995 T48 1995_T48
gm9 H2019 T0 2019_T0
##My model
dds <- DESeqDataSetFromMatrix(countData = mycounts, colData = metadata, design = ~ Year + Timepoint + Year:Timepoint)
## relevel so that Timepoint T0 is the reference
dds$Timepoint <- relevel(dds$Timepoint, ref = "T0")
dds <- DESeq(dds)
res <- results(dds)
> resultsNames(dds)
[1] "Intercept" "Year_H1999_vs_H1995" "Year_H2019_vs_H1995" "Timepoint_T24_vs_T0"
[5] "Timepoint_T48_vs_T0" "YearH1999.TimepointT24" "YearH2019.TimepointT24" "YearH1999.TimepointT48"
[9] "YearH2019.TimepointT48"
> sessionInfo( )
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 purrr_0.3.4
[4] readr_2.1.2 tidyr_1.2.0 tibble_3.1.6
[7] ggplot2_3.3.5 tidyverse_1.3.1 DESeq2_1.28.1
[10] SummarizedExperiment_1.18.2 DelayedArray_0.14.1 matrixStats_0.61.0
[13] Biobase_2.48.0 GenomicRanges_1.40.0 GenomeInfoDb_1.24.2
[16] IRanges_2.22.2 S4Vectors_0.26.1 BiocGenerics_0.34.0
[19] dplyr_1.0.8
loaded via a namespace (and not attached):
[1] httr_1.4.2 bit64_4.0.5 jsonlite_1.8.0 splines_4.0.1
[5] modelr_0.1.8 assertthat_0.2.1 blob_1.2.2 cellranger_1.1.0
[9] GenomeInfoDbData_1.2.3 yaml_2.2.2 pillar_1.7.0 RSQLite_2.2.10
[13] backports_1.4.1 lattice_0.20-45 glue_1.6.1 RColorBrewer_1.1-3
[17] XVector_0.28.0 rvest_1.0.2 colorspace_2.0-2 Matrix_1.4-0
[21] XML_3.99-0.8 pkgconfig_2.0.3 broom_0.7.12 haven_2.4.3
[25] genefilter_1.70.0 zlibbioc_1.34.0 xtable_1.8-4 scales_1.1.1
[29] tzdb_0.2.0 BiocParallel_1.22.0 annotate_1.66.0 generics_0.1.2
[33] ellipsis_0.3.2 withr_2.5.0 cachem_1.0.6 cli_3.1.1
[37] survival_3.2-13 magrittr_2.0.2 crayon_1.5.1 readxl_1.3.1
[41] memoise_2.0.1 fs_1.5.2 fansi_1.0.2 xml2_1.3.3
[45] tools_4.0.1 hms_1.1.1 lifecycle_1.0.1 reprex_2.0.1
[49] munsell_0.5.0 locfit_1.5-9.4 AnnotationDbi_1.50.3 compiler_4.0.1
[53] rlang_1.0.1 grid_4.0.1 RCurl_1.98-1.6 rstudioapi_0.13
[57] bitops_1.0-7 gtable_0.3.0 DBI_1.1.2 R6_2.5.1
[61] lubridate_1.8.0 fastmap_1.1.0 bit_4.0.4 utf8_1.2.2
[65] stringi_1.7.6 Rcpp_1.0.8 vctrs_0.3.8 geneplotter_1.66.0
[69] dbplyr_2.1.1 tidyselect_1.1.2
I apologize, I thought this was a software question about releveling and missing comparisons. Could someone share a link to a good example or documentation of a two-factor analysis that includes releveling? I have been looking for thorough documentation but haven't been able to find a good example.
In R, one can specify a particular linear model using a formula, and this produces a set of coefficients: this is generic to all methods in Bioconductor and beyond. So anyone who knows how to construct and interpret the design and coefficients in R would be able to help you. I therefore try to reserve my time on the support site for specific questions about the DESeq2 software. There are a lot of users online with general statistical consulting questions about how to analyze their datasets and interpret results, but I just don't have time to address all of those questions about experimental design here while also maintaining and supporting software.
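As a small illustration of the point about formulas and coefficients, here is a base R sketch that needs no DESeq2. The `meta` data frame is a hypothetical toy table with one row per Year/Timepoint combination, not the poster's actual metadata, and `mm` is just an illustrative name. It shows which coefficients R builds from the same design formula once the reference levels are set.

```r
# Hypothetical toy metadata: one row per Year x Timepoint combination,
# using the same factor levels as in the question (not the real samples).
meta <- expand.grid(
  Year = c("H1995", "H1999", "H2019"),
  Timepoint = c("T0", "T24", "T48"),
  stringsAsFactors = TRUE
)

# Set the reference level of each factor explicitly.
meta$Year <- relevel(meta$Year, ref = "H1995")
meta$Timepoint <- relevel(meta$Timepoint, ref = "T0")

# Build the design matrix from the same formula used in the question.
mm <- model.matrix(~ Year + Timepoint + Year:Timepoint, data = meta)

# One column per coefficient in the model.
colnames(mm)
ncol(mm)
```

With treatment contrasts (the R default), every coefficient is expressed relative to the reference levels, so the reference combination (H1995 at T0) is absorbed into the intercept and no column involving YearH1995 or TimepointT0 appears. A comparison such as H2019 versus H1999 is likewise not a coefficient of its own, because both levels are compared against H1995. DESeq2 reports the same set of columns under slightly different names (for example "Year_H1999_vs_H1995" and "YearH1999.TimepointT24"). Changing the `ref` argument of `relevel()` for either factor changes which comparisons appear as coefficients.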