Question: Unusually long running times for getMeth function since latest release
Alejandro Reyes (Dana-Farber Cancer Institute, Boston, USA) wrote, 4 months ago:

Hello, 

I have been using the getMeth function from the bsseq package, but I have not been able to get it to complete in the current release. For example, the same line of code that took a bit more than a minute in the old release (see below) has now been running in the current release for about an hour without completing. I noticed that the code of the new release uses the DelayedArray package, so I suspect the problem might be there.

 

> system.time( meth <- getMeth( wgbs, ranges, type="raw", what="perRegion" ) )
   user  system elapsed
 69.533   0.828  70.448
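
For anyone comparing the two releases, it can help to record the elapsed time together with the R version so that results from two installations can be set side by side. The following is only a minimal base R sketch: `slow_step()` and `elapsed_seconds()` are hypothetical helpers, not part of bsseq, and `slow_step()` merely stands in for the `getMeth()` call above.

```r
# Hypothetical stand-in for the slow call. In a real session its body would be
# getMeth(wgbs, ranges, type = "raw", what = "perRegion").
slow_step <- function() {
  x <- matrix(runif(1e5), ncol = 10)
  colMeans(x)
}

# Return the elapsed wall-clock seconds for one evaluation of `f`.
elapsed_seconds <- function(f) {
  unname(system.time(f())[["elapsed"]])
}

timing <- elapsed_seconds(slow_step)

# Keep the timing next to the R version it was measured under.
report <- data.frame(
  r_version = as.character(getRversion()),
  elapsed_s = timing
)
print(report)
```

Running the same script under each installation gives one row per release that can be pasted into a bug report.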

 

Old Session Info:
 

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] bsseq_1.10.0               limma_3.30.13
 [3] SummarizedExperiment_1.4.0 Biobase_2.34.0
 [5] GenomicRanges_1.26.2       GenomeInfoDb_1.10.3
 [7] IRanges_2.8.1              S4Vectors_0.12.1
 [9] BiocGenerics_0.20.0        BiocInstaller_1.24.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10       XVector_0.14.0     zlibbioc_1.20.0    munsell_0.4.3
 [5] colorspace_1.3-2   lattice_0.20-34    plyr_1.8.4         tools_3.3.2
 [9] grid_3.3.2         data.table_1.10.4  R.oo_1.21.0        gtools_3.5.0
[13] matrixStats_0.51.0 permute_0.9-4      Matrix_1.2-8       R.utils_2.5.0
[17] bitops_1.0-6       RCurl_1.95-4.8     compiler_3.3.2     R.methodsS3_1.7.1
[21] scales_0.4.1       locfit_1.5-9.1

 

Release sessionInfo():

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)

Matrix products: default
BLAS: /data/aryee/areyes/software/lib64/R/lib/libRblas.so
LAPACK: /data/aryee/areyes/software/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] bsseq_1.12.1               SummarizedExperiment_1.6.3
 [3] DelayedArray_0.2.7         matrixStats_0.52.2
 [5] Biobase_2.36.2             GenomicRanges_1.28.3
 [7] GenomeInfoDb_1.12.2        IRanges_2.10.2
 [9] S4Vectors_0.14.3           BiocGenerics_0.22.0
[11] BiocInstaller_1.26.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11            XVector_0.16.0          zlibbioc_1.22.0
 [4] munsell_0.4.3           colorspace_1.3-2        lattice_0.20-35
 [7] plyr_1.8.4              tools_3.4.0             grid_3.4.0
[10] data.table_1.10.4       R.oo_1.21.0             gtools_3.5.0
[13] permute_0.9-4           Matrix_1.2-10           GenomeInfoDbData_0.99.0
[16] R.utils_2.5.0           bitops_1.0-6            RCurl_1.95-4.8
[19] limma_3.32.2            compiler_3.4.0          R.methodsS3_1.7.1
[22] scales_0.4.1            locfit_1.5-9.1

 

Peter Hickey (Johns Hopkins University, Baltimore, USA) wrote, 4 months ago:

Thanks for your patience, Alejandro. I've committed a fix; it's available from https://github.com/PeteHaitch/bsseq. If you're able to, would you please give it a try to confirm that it fixes the performance regression? It requires Bioc-devel and can then be installed with BiocInstaller::biocLite("PeteHaitch/bsseq"). I'll coordinate with Kasper to get this merged into Bioc-devel and backported to Bioc-release. However, this might take a few days, as Kasper and I are both travelling.

Cheers,

Pete


Your fix worked for me as well - Thanks a lot!

(reply written 4 months ago by Alejandro Reyes)
Kasper Daniel Hansen (United States) wrote, 4 months ago:
This should not happen, but clearly it does. We could use (a lot) more information about what is going on, including the length of the BSseq object and the ranges. Could you perhaps post this as an issue on our GitHub page? What is the backend of the BSseq object?

Best,
Kasper
I noticed this too but haven't had time to fix it. I have an inkling of what's causing this, and I'll try to fix it over the weekend.
(reply written 4 months ago by Peter Hickey)

Thanks Peter and Kasper!

Peter, if you need a reproducible example (object + code) as Kasper suggested, let me know and I would be happy to provide one.


Alejandro

(reply written 4 months ago by Alejandro Reyes)
Hi Alejandro,

Reproducible code and data would be appreciated. No need for it to be the full object, just enough for the issue to be a noticeable drag on performance.

Cheers,
Pete
(reply written 4 months ago by Peter Hickey)
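
One possible shape for such a small reproducible example is sketched below, using simulated counts rather than the original data. Everything in it is an assumption made for illustration: the sizes, the single chromosome "chr1", the 1 kb regions, and the output file name are invented, and the thread does not say whether simulated data of this size triggers the slowdown. The bsseq part only runs if bsseq, GenomicRanges, and IRanges are installed.

```r
set.seed(1)

n_loci    <- 10000L   # hypothetical size; increase until the slowdown is visible
n_samples <- 4L

# Simulated CpG positions on one chromosome, sorted and unique.
pos <- sort(sample.int(1e6L, n_loci))

# Coverage and methylated read counts; M never exceeds Cov by construction.
Cov <- matrix(rpois(n_loci * n_samples, lambda = 10), ncol = n_samples)
M   <- matrix(rbinom(n_loci * n_samples, size = Cov, prob = 0.7),
              ncol = n_samples)
sample_names <- paste0("s", seq_len(n_samples))

# Non-overlapping 1 kb regions tiling the simulated chromosome.
region_start <- seq(1L, 1e6L, by = 1000L)
region_end   <- region_start + 999L

have_pkgs <- requireNamespace("bsseq", quietly = TRUE) &&
  requireNamespace("GenomicRanges", quietly = TRUE) &&
  requireNamespace("IRanges", quietly = TRUE)

if (have_pkgs) {
  toy <- bsseq::BSseq(chr = rep("chr1", n_loci), pos = pos,
                      M = M, Cov = Cov, sampleNames = sample_names)
  regions <- GenomicRanges::GRanges(
    "chr1", IRanges::IRanges(region_start, region_end)
  )

  # Same call as in the question, timed on the toy object.
  print(system.time(
    meth <- bsseq::getMeth(toy, regions, type = "raw", what = "perRegion")
  ))

  # Save object and regions together so they can be attached to an issue.
  saveRDS(list(bsseq = toy, regions = regions),
          file.path(tempdir(), "getMeth_example.rds"))
}
```

With real data, the equivalent step would be to take a subset of rows from the existing object, for example `wgbs[seq_len(n_loci), ]`, and keep only the ranges that overlap it.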