DESeq2 rlog error
bruce.moran ▴ 30
@brucemoran-8388
Last seen 2.5 years ago
Ireland

Hi,

I have been using DESeq2 for a while; it is a good tool and I have never had any issues. Now I am getting an error from rlog() when using the 'fast' argument, which is essential as far as I am concerned:

> rldss<-rlog(ddss, fast=T)
Error in rlog(ddss, fast = T) : unused argument (fast = T)

I notice that the option has been removed from the documentation. If so, can anyone explain why? And are there other quick ways to do this transformation? It is purely for plotting a PCA.

Appreciate any help,

Bruce.

 

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_IE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_IE.UTF-8        LC_COLLATE=en_IE.UTF-8
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_IE.UTF-8
 [7] LC_PAPER=en_IE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] ggplot2_1.0.1              genefilter_1.52.0
 [3] DESeq2_1.10.0              RcppArmadillo_0.6.100.0.0
 [5] Rcpp_0.12.1                SummarizedExperiment_1.0.0
 [7] Biobase_2.30.0             GenomicRanges_1.22.0
 [9] GenomeInfoDb_1.6.0         IRanges_2.4.1
[11] S4Vectors_0.8.0            BiocGenerics_0.16.0
[13] BiocInstaller_1.20.0

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3
 [4] XVector_0.10.0       futile.options_1.0.0 tools_3.2.2
 [7] zlibbioc_1.16.0      rpart_4.1-10         digest_0.6.8
[10] RSQLite_1.0.0        annotate_1.48.0      gtable_0.1.2
[13] lattice_0.20-33      DBI_0.3.1            proto_0.3-10
[16] gridExtra_2.0.0      cluster_2.0.3        stringr_1.0.0
[19] locfit_1.5-9.1       nnet_7.3-11          grid_3.2.2
[22] AnnotationDbi_1.32.0 XML_3.98-1.3         survival_2.38-3
[25] BiocParallel_1.4.0   foreign_0.8-66       latticeExtra_0.6-26
[28] Formula_1.2-1        geneplotter_1.48.0   reshape2_1.4.1
[31] lambda.r_1.1.7       magrittr_1.5         scales_0.3.0
[34] Hmisc_3.17-0         MASS_7.3-44          splines_3.2.2
[37] xtable_1.7-4         colorspace_1.2-6     stringi_1.0-1
[40] acepack_1.3-3.3      munsell_0.4.2
@mikelove
Last seen 8 hours ago
United States

hi Bruce,

After exploring ways to speed up the rlog, I came to prefer that users use the varianceStabilizingTransformation for datasets with many samples (e.g. hundreds). The rlog has nice properties that we show in the paper, but it does require fitting a parameter for each sample. The 'fast' rlog was an approximation I was working on, but I came to think that the VST is preferable. The VST just applies a function to the matrix of counts, so it's even faster. The only bottleneck with the VST is estimating the dispersion trend (which the rlog also required). If you've already estimated dispersions (after DESeq(), for example), you can use:

vsd <- varianceStabilizingTransformation(dds, blind=FALSE)
plotPCA(vsd)

which should take less than a second to return.

I have a fast routine for estimating the dispersion trend, which I have now added to the devel branch as a function called vst() (as of February 2016).
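For anyone who wants to try the vst() route end to end, here is a minimal sketch. It assumes a DESeq2 version that already includes vst(), and it uses simulated counts from makeExampleDESeqDataSet() in place of a real dataset; the gene and sample numbers are arbitrary choices for illustration.

```r
library(DESeq2)

# Simulated counts purely for illustration: 5000 genes x 12 samples,
# split across two conditions. Replace with your own DESeqDataSet.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)

# vst() estimates size factors if they are missing and fits the
# dispersion trend on a subset of genes, so DESeq() need not be run first.
vsd <- vst(dds, blind = FALSE)

# The transformed values live in the assay slot of the returned object.
head(assay(vsd), 3)

# PCA of the samples, coloured by the 'condition' column of colData.
plotPCA(vsd, intgroup = "condition")
```

If dispersions have already been estimated (e.g. after DESeq()), calling varianceStabilizingTransformation(dds, blind=FALSE) as shown above reuses them instead.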


Hi Michael,

Many thanks for the answer; I will change my scripts accordingly.

Bruce.


If the objective is to just plot a PCA, why would you specify blind=FALSE?


There is some discussion of this in the vignette, but basically, if there are many large differences across conditions, then blind=TRUE (the default) "sees" this as variability and will perhaps "over-transform" the data to temper this dispersion. I'm speaking very loosely here, but that's the idea. With blind=FALSE, the transformation only considers the within-condition variability, and so the result is closer to log2. For more comparison, check out the transformation section of the vignette. And for a very fast PCA plot you can always try normTransform(), which just corrects for library size, adds a pseudocount and log-transforms. Until I write up the fast routine for VST, this is definitely the fastest way to produce transformed data.
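To make the normTransform() description concrete, here is a rough base-R sketch of the three steps it names: library-size correction, pseudocount, log2. This is an illustration of the idea, not DESeq2's own code. The simulated counts, the depth vector and all variable names are made up for the example, and the median-of-ratios size factors are assumed as the library-size correction.

```r
# Simulated count matrix (genes x samples), for illustration only.
set.seed(1)
n_genes <- 500
depth   <- c(0.5, 1, 2, 0.5, 1, 2)          # hypothetical sequencing depths
base_mu <- 2^rnorm(n_genes, mean = 6, sd = 2)
counts  <- sapply(depth, function(d) rnbinom(n_genes, mu = base_mu * d, size = 10))

# Median-of-ratios size factors: compare each sample to the per-gene
# geometric mean, ignoring genes that have a zero in any sample.
log_geo_means <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(cnts) {
  keep <- is.finite(log_geo_means) & cnts > 0
  exp(median((log(cnts) - log_geo_means)[keep]))
})

# Correct for library size, add a pseudocount of 1, take log2.
norm_counts <- sweep(counts, 2, size_factors, "/")
log_norm    <- log2(norm_counts + 1)

# PCA on samples (samples must be rows for prcomp).
pca <- prcomp(t(log_norm))
plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2")
```

With a DESeqDataSet in hand, normTransform(dds) followed by plotPCA() does this in two lines; the sketch is only meant to show why it is so cheap to compute.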
