I have a DESeqDataSet object from my experiment, and I am trying to get a matrix of variance-stabilized values for some downstream analysis. The way I understand it, I can do it in two ways:
- Run varianceStabilizingTransformation() on the DESeqDataSet object, producing a DESeqTransform object, then use assay() to extract the matrix of normalized values from the DESeqTransform object
- Run getVarianceStabilizedData() on the DESeqDataSet object, which returns the matrix of normalized values without intermediate outputs
However, I noticed that the two methods yield slightly different matrices. Why is there a difference? Which one should I use? I suspect varianceStabilizingTransformation() and getVarianceStabilizedData() may use slightly different default parameters, but I am not sure how I could check that.
Here is a minimal working example:
library(DESeq2)
dds <- makeExampleDESeqDataSet(m=6)
dds_deseq <- DESeq(dds)
method_one <- assay(varianceStabilizingTransformation(dds_deseq))
method_two <- getVarianceStabilizedData(dds_deseq)
identical(method_one,method_two)
#false
method_one[1,]
# sample1 sample2 sample3 sample4 sample5 sample6
# 4.844203 4.579323 5.181564 4.600407 4.833217 5.279748
method_two[1,]
# sample1 sample2 sample3 sample4 sample5 sample6
# 5.024666 4.780600 5.336939 4.799998 5.014525 5.428178
And here is my sessioninfo()
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices
[6] utils datasets methods base
other attached packages:
[1] DESeq2_1.30.0 SummarizedExperiment_1.20.0
[3] Biobase_2.50.0 MatrixGenerics_1.2.0
[5] matrixStats_0.57.0 GenomicRanges_1.42.0
[7] GenomeInfoDb_1.26.1 IRanges_2.24.0
[9] S4Vectors_0.28.0 BiocGenerics_0.36.0
loaded via a namespace (and not attached):
[1] httr_1.4.2 splines_4.0.2
[3] bit64_4.0.5 assertthat_0.2.1
[5] askpass_1.1 BiocManager_1.30.10
[7] BiocFileCache_1.14.0 blob_1.2.1
[9] BSgenome_1.58.0 GenomeInfoDbData_1.2.4
[11] Rsamtools_2.6.0 progress_1.2.2
[13] pillar_1.4.7 RSQLite_2.2.1
[15] lattice_0.20-41 glue_1.4.2
[17] digest_0.6.27 RColorBrewer_1.1-2
[19] XVector_0.30.0 colorspace_2.0-0
[21] Matrix_1.2-18 XML_3.99-0.5
[23] pkgconfig_2.0.3 biomaRt_2.46.0
[25] genefilter_1.72.0 zlibbioc_1.36.0
[27] purrr_0.3.4 xtable_1.8-4
[29] GO.db_3.12.1 scales_1.1.1
[31] BiocParallel_1.24.1 tibble_3.0.4
[33] openssl_1.4.3 annotate_1.68.0
[35] KEGGREST_1.30.1 ggplot2_3.3.2
[37] generics_0.1.0 ellipsis_0.3.1
[39] GenomicFeatures_1.42.1 survival_3.2-7
[41] magrittr_2.0.1 crayon_1.3.4
[43] memoise_1.1.0 xml2_1.3.2
[45] graph_1.68.0 tools_4.0.2
[47] prettyunits_1.1.1 hms_0.5.3
[49] lifecycle_0.2.0 gage_2.40.0
[51] stringr_1.4.0 locfit_1.5-9.4
[53] munsell_0.5.0 DelayedArray_0.16.0
[55] genbankr_1.18.0 AnnotationDbi_1.52.0
[57] Biostrings_2.58.0 compiler_4.0.2
[59] tinytex_0.27 rlang_0.4.9
[61] grid_4.0.2 RCurl_1.98-1.2
[63] rstudioapi_0.13 VariantAnnotation_1.36.0
[65] rappdirs_0.3.1 bitops_1.0-6
[67] gtable_0.3.0 DBI_1.1.0
[69] curl_4.3 R6_2.5.0
[71] GenomicAlignments_1.26.0 dplyr_1.0.2
[73] rtracklayer_1.50.0 bit_4.0.4
[75] stringi_1.5.3 Rcpp_1.0.5
[77] geneplotter_1.68.0 vctrs_0.3.5
[79] png_0.1-7 dbplyr_2.0.0
[81] tidyselect_1.1.0 xfun_0.19
Ok that makes sense. Thank you!