Entering edit mode
3 43 minutes ago Rafael Soler ▴ 220
I am doing a DESeq2 comparison with different levels and one factor. To do this, I have performed the analysis in two different ways.
First, putting all the samples in the same DESeq object and then extracting each comparison:
> sampleinfo
FileName SampleName Status
A_1_count A_1 A
A_2_count A_2 A
B_3_count B_3 B
B_4_count B_4 B
C_5_count C_5 C
C_6_count C_6 C
D_7_count D_7 D
D_8_count D_8 D
E_9_count E_9 E
E_10_count E_10 E
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = sampleinfo,
design = ~ Status)
dds$Status <- relevel(dds$Status, ref = "E")
And the results:
dds <- DESeq(dds)
res_A <- results(dds,name="Status_A_vs_E")
res_B <- results(dds,name="Status_B_vs_E")
res_C <- results(dds,name="Status_C_vs_E")
res_D <- results(dds,name="Status_D_vs_E")
And doing these comparisons one by one separately on different DESeq objects.
> sampleinfo_A
FileName SampleName Status
A_1_count A_1 A
A_2_count A_2 A
E_9_count E_9 E
E_10_count E_10 E
> sampleinfo_B
FileName SampleName Status
B_3_count B_3 B
B_4_count B_4 B
E_9_count E_9 E
E_10_count E_10 E
dds_A <- DESeqDataSetFromMatrix(countData = cts_A,
colData = sampleinfo_A,
design = ~ Status)
dds_B <- DESeqDataSetFromMatrix(countData = cts_B,
colData = sampleinfo_B,
design = ~ Status)
And the results:
dds_A <- DESeq(dds_A)
res_A <- results(dds_A)
dds_B <- DESeq(dds_B)
res_B <- results(dds_B)
(Repeat for each condition)
However, the results give me different between the 2 methods. Does anyone know why is this happening? How it is the correct way to compare all to E?
Thank you!
> sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] TeachingDemos_2.12 pheatmap_1.0.12 vsn_3.58.0 IHW_1.18.0
[5] apeglm_1.12.0 readr_2.0.1 airway_1.10.0 mosaic_1.8.3
[9] ggridges_0.5.3 mosaicData_0.20.2 ggformula_0.10.1 ggstance_0.3.5
[13] dplyr_1.0.7 Matrix_1.3-4 lattice_0.20-44 RColorBrewer_1.1-2
[17] gplots_3.1.1 ggplot2_3.3.5 DESeq2_1.30.1 SummarizedExperiment_1.20.0
[21] Biobase_2.50.0 MatrixGenerics_1.2.1 matrixStats_0.60.0 GenomicRanges_1.42.0
[25] GenomeInfoDb_1.26.7 IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.36.1
loaded via a namespace (and not attached):
[1] colorspace_2.0-2 ellipsis_0.3.2 leaflet_2.0.4.1 XVector_0.30.0 ggdendro_0.1.22
[6] rstudioapi_0.13 hexbin_1.28.2 farver_2.1.0 affyio_1.60.0 ggrepel_0.9.1
[11] bit64_4.0.5 AnnotationDbi_1.52.0 fansi_0.5.0 mvtnorm_1.1-2 splines_4.0.3
[16] cachem_1.0.5 geneplotter_1.68.0 knitr_1.33 polyclip_1.10-0 broom_0.7.9
[21] annotate_1.68.0 ggforce_0.3.3 BiocManager_1.30.16 compiler_4.0.3 httr_1.4.2
[26] backports_1.2.1 assertthat_0.2.1 fastmap_1.1.0 limma_3.46.0 cli_3.0.1
[31] tweenr_1.0.2 htmltools_0.5.1.1 tools_4.0.3 affy_1.68.0 coda_0.19-4
[36] gtable_0.3.0 glue_1.4.2 GenomeInfoDbData_1.2.4 tinytex_0.33 Rcpp_1.0.7
[41] slam_0.1-48 bbmle_1.0.24 vctrs_0.3.8 preprocessCore_1.52.1 crosstalk_1.1.1
[46] xfun_0.25 stringr_1.4.0 lifecycle_1.0.0 mosaicCore_0.9.0 gtools_3.9.2
[51] XML_3.99-0.6 zlibbioc_1.36.0 MASS_7.3-54 scales_1.1.1 hms_1.1.0
[56] yaml_2.2.1 memoise_2.0.0 gridExtra_2.3 emdbook_1.3.12 bdsmatrix_1.3-4
[61] labelled_2.8.0 stringi_1.7.3 RSQLite_2.2.7 genefilter_1.72.1 caTools_1.18.2
[66] BiocParallel_1.24.1 rlang_0.4.11 pkgconfig_2.0.3 bitops_1.0-7 lpsymphony_1.18.0
[71] evaluate_0.14 purrr_0.3.4 labeling_0.4.2 htmlwidgets_1.5.3 bit_4.0.4
[76] tidyselect_1.1.1 plyr_1.8.6 magrittr_2.0.1 R6_2.5.0 generics_0.1.0
[81] DelayedArray_0.16.3 DBI_1.1.1 pillar_1.6.2 haven_2.4.3 withr_2.4.2
[86] survival_3.2-11 RCurl_1.98-1.3 tibble_3.1.3 crayon_1.4.1 fdrtool_1.2.16
[91] KernSmooth_2.23-20 utf8_1.2.2 tzdb_0.1.2 rmarkdown_2.10 locfit_1.5-9.4
[96] grid_4.0.3 blob_1.2.2 forcats_0.5.1 digest_0.6.27 xtable_1.8-4
[101] tidyr_1.1.3 numDeriv_2016.8-1.1 munsell_0.5.0
Thank you Michael! Regarding my comparison, I have a high variability within the groups, would you recommend comparing group by group separately and not in the same DESeq object? ![enter image description here][1]
This choice is up to you. I see that MN is a lot more closely spread than HGG for example, so I would lean toward pairwise.
Thank you! :)