Question

DESeq2 with multiple variable give me different results

2

Entering edit mode

rafaelsolersanblas ▴ 50

@rafaelsolersanblas-22935

Last seen 4.2 years ago

Alicante

3 43 minutes ago Rafael Soler ▴ 220

I am doing a DESeq2 comparison with different levels and one factor. To do this, I have performed the analysis in two different ways.

First, putting all the samples in the same DESeq object and then extracting each comparison:

> sampleinfo
    FileName    SampleName  Status
    A_1_count   A_1 A       
    A_2_count   A_2 A       
    B_3_count   B_3 B       
    B_4_count   B_4 B       
    C_5_count   C_5 C   
    C_6_count   C_6 C   
    D_7_count   D_7 D   
    D_8_count   D_8 D   
    E_9_count   E_9 E   
    E_10_count  E_10    E

dds <- DESeqDataSetFromMatrix(countData = cts,
                                colData = sampleinfo,
                              design = ~ Status)

dds$Status <- relevel(dds$Status, ref = "E")

And the results:

dds <- DESeq(dds)
res_A <- results(dds,name="Status_A_vs_E")
res_B <- results(dds,name="Status_B_vs_E")
res_C <- results(dds,name="Status_C_vs_E")
res_D <- results(dds,name="Status_D_vs_E")

And doing these comparisons one by one separately on different DESeq objects.

> sampleinfo_A
    FileName    SampleName  Status
    A_1_count   A_1 A       
    A_2_count   A_2 A       
    E_9_count   E_9 E   
    E_10_count  E_10    E

> sampleinfo_B
    FileName    SampleName  Status
    B_3_count   B_3 B       
    B_4_count   B_4 B       
    E_9_count   E_9 E   
    E_10_count  E_10    E

dds_A <- DESeqDataSetFromMatrix(countData = cts_A,
                                colData = sampleinfo_A,
                              design = ~ Status)

dds_B <- DESeqDataSetFromMatrix(countData = cts_B,
                                colData = sampleinfo_B,
                              design = ~ Status)

And the results:

dds_A <- DESeq(dds_A)
res_A <- results(dds_A)

dds_B <- DESeq(dds_B)
res_B <- results(dds_B)

(Repeat for each condition)

However, the results give me different between the 2 methods. Does anyone know why is this happening? How it is the correct way to compare all to E?

Thank you!

> sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8    
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TeachingDemos_2.12          pheatmap_1.0.12             vsn_3.58.0                  IHW_1.18.0                 
 [5] apeglm_1.12.0               readr_2.0.1                 airway_1.10.0               mosaic_1.8.3               
 [9] ggridges_0.5.3              mosaicData_0.20.2           ggformula_0.10.1            ggstance_0.3.5             
[13] dplyr_1.0.7                 Matrix_1.3-4                lattice_0.20-44             RColorBrewer_1.1-2         
[17] gplots_3.1.1                ggplot2_3.3.5               DESeq2_1.30.1               SummarizedExperiment_1.20.0
[21] Biobase_2.50.0              MatrixGenerics_1.2.1        matrixStats_0.60.0          GenomicRanges_1.42.0       
[25] GenomeInfoDb_1.26.7         IRanges_2.24.1              S4Vectors_0.28.1            BiocGenerics_0.36.1        

loaded via a namespace (and not attached):
  [1] colorspace_2.0-2       ellipsis_0.3.2         leaflet_2.0.4.1        XVector_0.30.0         ggdendro_0.1.22       
  [6] rstudioapi_0.13        hexbin_1.28.2          farver_2.1.0           affyio_1.60.0          ggrepel_0.9.1         
 [11] bit64_4.0.5            AnnotationDbi_1.52.0   fansi_0.5.0            mvtnorm_1.1-2          splines_4.0.3         
 [16] cachem_1.0.5           geneplotter_1.68.0     knitr_1.33             polyclip_1.10-0        broom_0.7.9           
 [21] annotate_1.68.0        ggforce_0.3.3          BiocManager_1.30.16    compiler_4.0.3         httr_1.4.2            
 [26] backports_1.2.1        assertthat_0.2.1       fastmap_1.1.0          limma_3.46.0           cli_3.0.1             
 [31] tweenr_1.0.2           htmltools_0.5.1.1      tools_4.0.3            affy_1.68.0            coda_0.19-4           
 [36] gtable_0.3.0           glue_1.4.2             GenomeInfoDbData_1.2.4 tinytex_0.33           Rcpp_1.0.7            
 [41] slam_0.1-48            bbmle_1.0.24           vctrs_0.3.8            preprocessCore_1.52.1  crosstalk_1.1.1       
 [46] xfun_0.25              stringr_1.4.0          lifecycle_1.0.0        mosaicCore_0.9.0       gtools_3.9.2          
 [51] XML_3.99-0.6           zlibbioc_1.36.0        MASS_7.3-54            scales_1.1.1           hms_1.1.0             
 [56] yaml_2.2.1             memoise_2.0.0          gridExtra_2.3          emdbook_1.3.12         bdsmatrix_1.3-4       
 [61] labelled_2.8.0         stringi_1.7.3          RSQLite_2.2.7          genefilter_1.72.1      caTools_1.18.2        
 [66] BiocParallel_1.24.1    rlang_0.4.11           pkgconfig_2.0.3        bitops_1.0-7           lpsymphony_1.18.0     
 [71] evaluate_0.14          purrr_0.3.4            labeling_0.4.2         htmlwidgets_1.5.3      bit_4.0.4             
 [76] tidyselect_1.1.1       plyr_1.8.6             magrittr_2.0.1         R6_2.5.0               generics_0.1.0        
 [81] DelayedArray_0.16.3    DBI_1.1.1              pillar_1.6.2           haven_2.4.3            withr_2.4.2           
 [86] survival_3.2-11        RCurl_1.98-1.3         tibble_3.1.3           crayon_1.4.1           fdrtool_1.2.16        
 [91] KernSmooth_2.23-20     utf8_1.2.2             tzdb_0.1.2             rmarkdown_2.10         locfit_1.5-9.4        
 [96] grid_4.0.3             blob_1.2.2             forcats_0.5.1          digest_0.6.27          xtable_1.8-4          
[101] tidyr_1.1.3            numDeriv_2016.8-1.1    munsell_0.5.0

Factor Levels Dispersionestimates DESeq2 • 1.3k views

ADD COMMENT • link 4.2 years ago rafaelsolersanblas ▴ 50

score 2 · Accepted Answer · 2021-11-02

2

Entering edit mode

Michael Love 43k

@mikelove

Last seen 46 minutes ago

United States

It's a FAQ in the vignette.

ADD COMMENT • link 4.2 years ago Michael Love 43k

0

Entering edit mode

Thank you Michael! Regarding my comparison, I have a high variability within the groups, would you recommend comparing group by group separately and not in the same DESeq object? ![enter image description here][1]

ADD REPLY • link 4.2 years ago rafaelsolersanblas ▴ 50

1

Entering edit mode

This choice is up to you. I see that MN is a lot more closely spread than HGG for example, so I would lean toward pairwise.