DESeq2 variance stabilizing transformation: why is assay() different from getVarianceStabilizedData()?
Entering edit mode
Last seen 5 weeks ago

I have a DESeqDataSet object from my experiment, and I am trying to get a matrix of variance-stabilized values for some downstream analysis. The way I understand it, I can do it in two ways:

  1. Run varianceStabilizingTransformation() on the DESeqDataSet object, producing a DESeqTransform object, then use assay() to extract the matrix of normalized values from the DESeqTransform object
  2. Run getVarianceStabilizedData() on the DESeqDataSet object, which returns the matrix of normalized values without intermediate outputs

However, I noticed that the two methods yield slightly different matrices. Why is there a difference? Which one should I use? I suspect varianceStabilizingTransformation() and getVarianceStabilizedData() may use slightly different default parameters, but I am not sure how I could check that.

Here is a minimal working example:

dds <- makeExampleDESeqDataSet(m=6)
dds_deseq <- DESeq(dds)
method_one <- assay(varianceStabilizingTransformation(dds_deseq))
method_two <- getVarianceStabilizedData(dds_deseq)

# sample1  sample2  sample3  sample4  sample5  sample6 
# 4.844203 4.579323 5.181564 4.600407 4.833217 5.279748 
# sample1  sample2  sample3  sample4  sample5  sample6 
# 5.024666 4.780600 5.336939 4.799998 5.014525 5.428178

And here is my sessioninfo()

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices
[6] utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.30.0               SummarizedExperiment_1.20.0
 [3] Biobase_2.50.0              MatrixGenerics_1.2.0       
 [5] matrixStats_0.57.0          GenomicRanges_1.42.0       
 [7] GenomeInfoDb_1.26.1         IRanges_2.24.0             
 [9] S4Vectors_0.28.0            BiocGenerics_0.36.0        

loaded via a namespace (and not attached):
 [1] httr_1.4.2               splines_4.0.2           
 [3] bit64_4.0.5              assertthat_0.2.1        
 [5] askpass_1.1              BiocManager_1.30.10     
 [7] BiocFileCache_1.14.0     blob_1.2.1              
 [9] BSgenome_1.58.0          GenomeInfoDbData_1.2.4  
[11] Rsamtools_2.6.0          progress_1.2.2          
[13] pillar_1.4.7             RSQLite_2.2.1           
[15] lattice_0.20-41          glue_1.4.2              
[17] digest_0.6.27            RColorBrewer_1.1-2      
[19] XVector_0.30.0           colorspace_2.0-0        
[21] Matrix_1.2-18            XML_3.99-0.5            
[23] pkgconfig_2.0.3          biomaRt_2.46.0          
[25] genefilter_1.72.0        zlibbioc_1.36.0         
[27] purrr_0.3.4              xtable_1.8-4            
[29] GO.db_3.12.1             scales_1.1.1            
[31] BiocParallel_1.24.1      tibble_3.0.4            
[33] openssl_1.4.3            annotate_1.68.0         
[35] KEGGREST_1.30.1          ggplot2_3.3.2           
[37] generics_0.1.0           ellipsis_0.3.1          
[39] GenomicFeatures_1.42.1   survival_3.2-7          
[41] magrittr_2.0.1           crayon_1.3.4            
[43] memoise_1.1.0            xml2_1.3.2              
[45] graph_1.68.0             tools_4.0.2             
[47] prettyunits_1.1.1        hms_0.5.3               
[49] lifecycle_0.2.0          gage_2.40.0             
[51] stringr_1.4.0            locfit_1.5-9.4          
[53] munsell_0.5.0            DelayedArray_0.16.0     
[55] genbankr_1.18.0          AnnotationDbi_1.52.0    
[57] Biostrings_2.58.0        compiler_4.0.2          
[59] tinytex_0.27             rlang_0.4.9             
[61] grid_4.0.2               RCurl_1.98-1.2          
[63] rstudioapi_0.13          VariantAnnotation_1.36.0
[65] rappdirs_0.3.1           bitops_1.0-6            
[67] gtable_0.3.0             DBI_1.1.0               
[69] curl_4.3                 R6_2.5.0                
[71] GenomicAlignments_1.26.0 dplyr_1.0.2             
[73] rtracklayer_1.50.0       bit_4.0.4               
[75] stringi_1.5.3            Rcpp_1.0.5              
[77] geneplotter_1.68.0       vctrs_0.3.5             
[79] png_0.1-7                dbplyr_2.0.0            
[81] tidyselect_1.1.0         xfun_0.19
DESeq2 • 91 views
Entering edit mode
ATpoint • 390
Last seen 5 hours ago

If you call getVarianceStabilizedData directly it bypasses the part of varianceStabilizingTransformation that uses the blind=TRUE default, so you are comparing the blind and unblind transformation with regard to the experimental design. Type varianceStabilizingTransformation to see the source code.

dds <- makeExampleDESeqDataSet(m=6)

dds_deseq <- DESeq(dds)
method_one <- assay(varianceStabilizingTransformation(dds_deseq)) # blind=TRUE is default here
method_two <- getVarianceStabilizedData(dds_deseq)
method_three <- assay(varianceStabilizingTransformation(dds_deseq, blind = FALSE))


identical(method_one,method_two) # FALSE
identical(method_two,method_three) # TRUE
Entering edit mode

Ok that makes sense. Thank you!


Login before adding your answer.

Similar Posts
Loading Similar Posts
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.3