How to use DESeq results with LRT vs. with Wald's test
1
0
Entering edit mode
@502c354e
Last seen 5 months ago
Austria

Dear Community,

I am new to the field of RNA-Seq and wanted to ask for advice concerning the proper use of the DESeq() function.

Briefly, my experimental set up is as follows: I have 30 mice in 5 different treatment groups (n=6/group) and a known batch effect (sequencing was performed at two different days, batch effect confirmed visually via PCA & distance matrix after using limma::removeBatchEffect on the vst-transformed matrix, as referenced in your vignette (Why after VST are there still batches in the PCA plot?)). According to your vignette (Quick start), i have defined my design matrix like this:

ddsMat <- DESeqDataSetFromMatrix(countData = countdata,
                             colData  = coldata,
                             design = ~ seq_day + group) #known batch = sequencing days

After reading your vignette (Likelihood ratio test) and several blog entries in this forum, I have run the DESeq() function in two different ways to explore the differences:

dds.LRT <- DESeq(ddsFilteredMat, test="LRT", reduced=~seq_day) #LRT
dds<- DESeq(ddsFilteredMat) #Wald

As I understood it, the LRT version can be comparable to an ANOVA as it tests for differences across all of the groups, after removing the effect exerted by the batch effect whereas Wald's test is used for direct pairwise comparisons between 2 groups. Consequently, I get the same DEGs, LFC and pvals for any contrast extracted from dds.LRT, since it always compares this: LRT p-value: '~ seq_mice_cage + group' vs '~ seq_mice_cage'.

#extract results
res.groupA_LRT <- results(dds.LRT, contrast = c("group", "A", "control"))
res.groupB_LRT <- results(dds.LRT, contrast = c("group", "B", "control"))
res.groupC_LRT <- results(dds.LRT, contrast = c("group", "C", "control"))
res.groupD_LRT <- results(dds.LRT, contrast = c("group", "D", "control"))

#show top 6 rows per result
data.frame(rbind(cbind(head(res.groupA_LRT), group = rep("A", 6), rank = seq(1,6)), 
                            cbind(head(res.groupB_LRT), group = rep("B", 6), rank = seq(1,6)), 
                            cbind(head(res.groupC_LRT), group = rep("C", 6), rank = seq(1,6)), 
                            cbind(head(res.groupD_LRT), group = rep("D", 6), rank = seq(1,6))))

output:

                       baseMean log2FoldChange      lfcSE      stat      pvalue       padj group rank
ENSMUSG00000025902     44.22274    -0.40249365 0.35558183  8.588492 0.072249833 0.22728207     A    1
ENSMUSG00000033845   1310.06464     0.36575419 0.15746631  6.559806 0.161060931 0.36511564     A    2
ENSMUSG00000102275     16.07641     1.30656064 0.79514588  5.777069 0.216426997 0.43092779     A    3
ENSMUSG00000025903   2187.23707     0.14752514 0.14539018  7.719259 0.102420605 0.27909782     A    4
ENSMUSG00000033813    228.33252    -0.24121186 0.19922834 17.043843 0.001895401 0.01960863     A    5
ENSMUSG00000033793   1517.61889    -0.19788049 0.15639653  2.544884 0.636617125 0.77272669     A    6
ENSMUSG00000025902.1   44.22274     0.09838015 0.36613901  8.588492 0.072249833 0.22728207     B    1
ENSMUSG00000033845.1 1310.06464     0.32342520 0.16436483  6.559806 0.161060931 0.36511564     B    2
ENSMUSG00000102275.1   16.07641     2.02343897 0.82375227  5.777069 0.216426997 0.43092779     B    3
ENSMUSG00000025903.1 2187.23707     0.21157088 0.15173852  7.719259 0.102420605 0.27909782     B    4
ENSMUSG00000033813.1  228.33252     0.27254669 0.20653754 17.043843 0.001895401 0.01960863     B    5
ENSMUSG00000033793.1 1517.61889    -0.09809705 0.16316122  2.544884 0.636617125 0.77272669     B    6
ENSMUSG00000025902.2   44.22274     0.23444799 0.36404826  8.588492 0.072249833 0.22728207     C    1
ENSMUSG00000033845.2 1310.06464     0.18044040 0.16541518  6.559806 0.161060931 0.36511564     C    2
ENSMUSG00000102275.2   16.07641     1.42749218 0.81373353  5.777069 0.216426997 0.43092779     C    3
ENSMUSG00000025903.2 2187.23707     0.22914511 0.15270455  7.719259 0.102420605 0.27909782     C    4
ENSMUSG00000033813.2  228.33252    -0.08556367 0.20771886 17.043843 0.001895401 0.01960863     C    5
ENSMUSG00000033793.2 1517.61889    -0.06247866 0.16435798  2.544884 0.636617125 0.77272669     C    6
ENSMUSG00000025902.3   44.22274    -0.16301925 0.23952672  8.588492 0.072249833 0.22728207     D    1
ENSMUSG00000033845.3 1310.06464     0.02632379 0.10753635  6.559806 0.161060931 0.36511564     D    2
ENSMUSG00000102275.3   16.07641     0.62498726 0.49334192  5.777069 0.216426997 0.43092779     D    3
ENSMUSG00000025903.3 2187.23707     0.25842491 0.09947068  7.719259 0.102420605 0.27909782     D    4
ENSMUSG00000033813.3  228.33252    -0.22232786 0.13551622 17.043843 0.001895401 0.01960863     D    5
ENSMUSG00000033793.3 1517.61889    -0.05636333 0.10730159  2.544884 0.636617125 0.77272669     D    6

I have defined my contrasts from the Wald-dds object as follows:

res.groupA_Wald <- results(dds, name = "group_A_vs_control")
res.groupB_Wald <- results(dds, name = "group_B_vs_control")
res.groupC_Wald <- results(dds, name = "group_C_vs_control")
res.groupD_Wald <- results(dds, name = "group_D_vs_control")

(i) How can I put these two dds results to use in my downstream analyses / what are the reasons for me to choose the LRT output for my set up (5 groups; 4 treatments vs. 1 control)? I want to do heatmaps of expression levels of the top 30 DEG, a Venn diagram and enrichment analyses (ORA/GSEA).

(ii) Tying in with the "ANOVA-comparison", is it valid to see the Wald-dds object as a post-hoc test? If so, is there a way to approximate a post-hoc correction similar to Tukey's correction (i.e. less conservative as basic differences have already been confirmed via ANOVA)?

Thank you very much in advance. Best, Luise

Session info:

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.0

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Vienna
tzcode source: internal

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mixOmics_6.24.0             lattice_0.21-8              MASS_7.3-60                 EnhancedVolcano_1.18.0      ggrepel_0.9.3               ComplexHeatmap_2.16.0      
 [7] cowplot_1.1.1               circlize_0.4.15             RColorBrewer_1.1-3          ggfortify_0.4.16            org.Mm.eg.db_3.17.0         AnnotationDbi_1.62.1       
[13] clusterProfiler_4.8.1       biomaRt_2.56.1              apeglm_1.22.1               sva_3.35.2                  BiocParallel_1.34.2         genefilter_1.82.1          
[19] mgcv_1.8-42                 nlme_3.1-162                Rsamtools_2.16.0            Biostrings_2.68.1           XVector_0.40.0              tximeta_1.18.3             
[25] DESeq2_1.40.1               SummarizedExperiment_1.30.2 Biobase_2.60.0              MatrixGenerics_1.12.2       matrixStats_1.0.0           GenomicRanges_1.52.0       
[31] GenomeInfoDb_1.36.1         IRanges_2.34.0              S4Vectors_0.38.1            BiocGenerics_0.46.0         edgeR_3.42.4                limma_3.56.2               
[37] here_1.0.1                  lubridate_1.9.2             forcats_1.0.0               stringr_1.5.0               dplyr_1.1.2                 purrr_1.0.1                
[43] readr_2.1.4                 tidyr_1.3.0                 tibble_3.2.1                ggplot2_3.4.2               tidyverse_2.0.0             rio_0.5.29                 

loaded via a namespace (and not attached):
  [1] igraph_1.5.0                  graph_1.78.0                  Formula_1.2-5                 zlibbioc_1.46.0               tidyselect_1.2.0             
  [6] bit_4.0.5                     doParallel_1.0.17             clue_0.3-64                   rjson_0.2.21                  blob_1.2.4                   
 [11] S4Arrays_1.0.4                parallel_4.3.1                png_0.1-8                     cli_3.6.1                     ggplotify_0.1.0              
 [16] ProtGenerics_1.32.0           BiocIO_1.10.0                 tximport_1.28.0               shadowtext_0.1.2              curl_5.0.1                   
 [21] mime_0.12                     evaluate_0.21                 tidytree_0.4.2                stringi_1.7.12                backports_1.4.1              
 [26] XML_3.99-0.14                 httpuv_1.6.11                 magrittr_2.0.3                rappdirs_0.3.3                splines_4.3.1                
 [31] ggraph_2.1.0                  DBI_1.1.3                     reactome.db_1.84.0            withr_2.5.0                   corpcor_1.6.10               
 [36] emdbook_1.3.13                rprojroot_2.0.3               enrichplot_1.20.0             bdsmatrix_1.3-6               tidygraph_1.2.3              
 [41] rtracklayer_1.60.1            BiocManager_1.30.21           htmlwidgets_1.6.2             ggvenn_0.1.10                 labeling_0.4.2               
 [46] cellranger_1.1.0              annotate_1.78.0               reticulate_1.34.0             knitr_1.43                    timechange_0.2.0             
 [51] foreach_1.5.2                 fansi_1.0.4                   patchwork_1.1.2               data.table_1.14.8             ggtree_3.8.0                 
 [56] RSpectra_0.16-1               gridGraphics_0.5-1            ellipsis_0.3.2                lazyeval_0.2.2                yaml_2.3.7                   
 [61] survival_3.5-5                BiocVersion_3.17.1            crayon_1.5.2                  tweenr_2.0.2                  later_1.3.1                  
 [66] codetools_0.2-19              base64enc_0.1-3               GlobalOptions_0.1.2           KEGGREST_1.40.0               bbmle_1.0.25                 
 [71] shape_1.4.6                   ReactomePA_1.44.0             filelock_1.0.2                foreign_0.8-84                pkgconfig_2.0.3              
 [76] xml2_1.3.4                    GenomicAlignments_1.36.0      aplot_0.1.10                  ape_5.7-1                     viridisLite_0.4.2            
 [81] xtable_1.8-4                  fastcluster_1.2.3             plyr_1.8.8                    httr_1.4.6                    tools_4.3.1                  
 [86] htmlTable_2.4.1               checkmate_2.2.0               HDO.db_0.99.1                 dbplyr_2.3.2                  digest_0.6.31                
 [91] numDeriv_2016.8-1.1           Matrix_1.5-4.1                farver_2.1.1                  tzdb_0.4.0                    AnnotationFilter_1.24.0      
 [96] reshape2_1.4.4                yulab.utils_0.0.6             viridis_0.6.3                 rpart_4.1.19                  glue_1.6.2                   
[101] cachem_1.0.8                  BiocFileCache_2.8.0           polyclip_1.10-4               Hmisc_5.1-0                   generics_0.1.3               
[106] dynamicTreeCut_1.63-1         mvtnorm_1.2-2                 gson_0.1.0                    utf8_1.2.3                    graphlayouts_1.0.0           
[111] readxl_1.4.2                  gridExtra_2.3                 shiny_1.7.4                   GenomeInfoDbData_1.2.10       RCurl_1.98-1.12              
[116] memoise_2.0.1                 rmarkdown_2.22                pheatmap_1.0.12               downloader_0.4                scales_1.2.1                 
[121] Cairo_1.6-0                   rstudioapi_0.14               cluster_2.1.4                 hms_1.1.3                     munsell_0.5.0                
[126] colorspace_2.1-0              ellipse_0.4.5                 rlang_1.1.1                   ggforce_0.4.1                 xfun_0.39                    
[131] pacman_0.5.1                  coda_0.19-4                   iterators_1.0.14              rARPACK_0.11-0                GOSemSim_2.26.0              
[136] interactiveDisplayBase_1.38.0 treeio_1.24.1                 bitops_1.0-7                  promises_1.2.0.1              scatterpie_0.2.1             
[141] RSQLite_2.3.1                 qvalue_2.32.0                 openxlsx_4.2.5.2              fgsea_1.26.0                  DelayedArray_0.26.3          
[146] GO.db_3.17.0                  compiler_4.3.1                prettyunits_1.1.1             graphite_1.46.0               Rcpp_1.0.10                  
[151] AnnotationHub_3.8.0           progress_1.2.2                R6_2.5.1                      fastmap_1.1.1                 fastmatch_1.1-3              
[156] ensembldb_2.24.1              nnet_7.3-19                   gtable_0.3.3                  htmltools_0.5.5               bit64_4.0.5                  
[161] lifecycle_1.0.3               zip_2.3.0                     restfulr_0.0.15               vctrs_0.6.3                   DOSE_3.26.1                  
[166] haven_2.5.2                   ggfun_0.1.1                   pillar_1.9.0                  GenomicFeatures_1.52.2        magick_2.8.0                 
[171] locfit_1.5-9.8                jsonlite_1.8.5                GetoptLong_1.0.5
DESeq2 • 284 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 1 day ago
United States

hi,

Due to time constraints, I have to limit myself to software related questions. For questions about statistical analysis plan, I recommend to consult with a local statistician.

ADD COMMENT

Login before adding your answer.

Traffic: 381 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6