Unusually long running times for getMeth function since latest release
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 5 months ago
Novartis Institutes for BioMedical Rese…

Hello, 

I have been using the getMeth function from the bsseq package, but I have not been able to get it to complete in the current release. For example, the same line of code that took a bit more than a minute in the old release (see below) has been running in the current release for about an hour without completing. I noticed that the code of the new release uses the DelayedArray package, so I suspect the problem might be there?


> system.time( meth <- getMeth( wgbs, ranges, type="raw", what="perRegion" ) )
   user  system elapsed
 69.533   0.828  70.448


Old Session Info:
 

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] bsseq_1.10.0               limma_3.30.13
 [3] SummarizedExperiment_1.4.0 Biobase_2.34.0
 [5] GenomicRanges_1.26.2       GenomeInfoDb_1.10.3
 [7] IRanges_2.8.1              S4Vectors_0.12.1
 [9] BiocGenerics_0.20.0        BiocInstaller_1.24.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10       XVector_0.14.0     zlibbioc_1.20.0    munsell_0.4.3
 [5] colorspace_1.3-2   lattice_0.20-34    plyr_1.8.4         tools_3.3.2
 [9] grid_3.3.2         data.table_1.10.4  R.oo_1.21.0        gtools_3.5.0
[13] matrixStats_0.51.0 permute_0.9-4      Matrix_1.2-8       R.utils_2.5.0
[17] bitops_1.0-6       RCurl_1.95-4.8     compiler_3.3.2     R.methodsS3_1.7.1
[21] scales_0.4.1       locfit_1.5-9.1

 

Release sessionInfo():

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago)

Matrix products: default
BLAS: /data/aryee/areyes/software/lib64/R/lib/libRblas.so
LAPACK: /data/aryee/areyes/software/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] bsseq_1.12.1               SummarizedExperiment_1.6.3
 [3] DelayedArray_0.2.7         matrixStats_0.52.2
 [5] Biobase_2.36.2             GenomicRanges_1.28.3
 [7] GenomeInfoDb_1.12.2        IRanges_2.10.2
 [9] S4Vectors_0.14.3           BiocGenerics_0.22.0
[11] BiocInstaller_1.26.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11            XVector_0.16.0          zlibbioc_1.22.0
 [4] munsell_0.4.3           colorspace_1.3-2        lattice_0.20-35
 [7] plyr_1.8.4              tools_3.4.0             grid_3.4.0
[10] data.table_1.10.4       R.oo_1.21.0             gtools_3.5.0
[13] permute_0.9-4           Matrix_1.2-10           GenomeInfoDbData_0.99.0
[16] R.utils_2.5.0           bitops_1.0-6            RCurl_1.95-4.8
[19] limma_3.32.2            compiler_3.4.0          R.methodsS3_1.7.1
[22] scales_0.4.1            locfit_1.5-9.1

 

bsseq delayedarray • 2.0k views
Peter Hickey ▴ 740
@petehaitch
Last seen 7 weeks ago
WEHI, Melbourne, Australia

Thanks for your patience, Alejandro. I've committed a fix; it's available from https://github.com/PeteHaitch/bsseq. If you're able to, would you please give it a try to confirm that it fixes the performance regression? It requires Bioc-devel and can then be installed with BiocInstaller::biocLite("PeteHaitch/bsseq"). I'll coordinate with Kasper to get this merged into Bioc-devel and backported to Bioc-release. However, this might take a few days as Kasper and I are both travelling.

Cheers,

Pete


Your fix worked for me as well - Thanks a lot!

@kasper-daniel-hansen-2979
Last seen 18 months ago
United States
This should not happen, but clearly it does. We could use (a lot) more information about what is going on, including the length of the BSseq object and the ranges. Could you perhaps post this as an issue on our GitHub page? What is the backend of the BSseq object?

Best,
Kasper
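
For anyone reading along, the information requested above (size of the BSseq object, number of ranges, and the storage backend of the assays) could be collected roughly as follows. This is only a sketch: the `wgbs` and `ranges` objects here are small simulated stand-ins, not the data from the original post, and the `info` list is just an illustrative name. Checking the class of the "M" assay is one assumed way to see whether the data are held in an ordinary matrix or a DelayedArray-based object.

```r
library(bsseq)
library(GenomicRanges)

# Simulated stand-in data (assumption): replace with your own objects.
set.seed(1)
n_loci <- 1000L
n_samples <- 3L
pos <- sort(sample.int(1e6, n_loci))
Cov <- matrix(rpois(n_loci * n_samples, lambda = 10), ncol = n_samples)
M <- matrix(rbinom(n_loci * n_samples, size = Cov, prob = 0.7),
            ncol = n_samples)
wgbs <- BSseq(chr = rep("chr1", n_loci), pos = pos, M = M, Cov = Cov,
              sampleNames = paste0("s", seq_len(n_samples)))
ranges <- GRanges("chr1",
                  IRanges(start = seq(1, 990001, by = 10000), width = 5000))

# Summary worth including in a bug report.
info <- list(
  n_loci       = length(wgbs),                 # number of methylation loci
  n_samples    = ncol(wgbs),                   # number of samples
  n_ranges     = length(ranges),               # number of query regions
  range_widths = summary(width(ranges)),       # distribution of region widths
  assay_class  = class(assay(wgbs, "M")),      # matrix vs. DelayedArray-based
  object_size  = format(object.size(wgbs), units = "MB")
)
str(info)
```

Pasting the output of `str(info)` alongside `sessionInfo()` gives the maintainers a quick picture of the scale of the problem.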
I noticed this too but haven't had time to fix it. I have an inkling of what's causing this and I'll try to fix it over the weekend.

Thanks Peter and Kasper!

Peter, if you need a reproducible example (object + code) as Kasper suggested, let me know and I would be happy to provide one.


Alejandro

Hi Alejandro,

Reproducible code and data would be appreciated. No need for it to be the full object, just enough for the issue to be a noticeable drag on performance.

Cheers,
Pete
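
One possible way to prepare such a reduced example is sketched below. It is not the procedure anyone in this thread actually used: the data are simulated stand-ins for `wgbs` and `ranges`, the cut-off of 50 ranges is arbitrary, and the names `small_wgbs`, `small_ranges`, and `out_file` are made up for illustration. The idea is to keep a handful of ranges, keep only the loci that overlap them, confirm the timing is still informative, and save both objects to a single file.

```r
library(bsseq)
library(GenomicRanges)

# Simulated stand-in data (assumption): replace with your own objects.
set.seed(42)
n_loci <- 5000L
n_samples <- 2L
pos <- sort(sample.int(1e6, n_loci))
Cov <- matrix(rpois(n_loci * n_samples, lambda = 10), ncol = n_samples)
M <- matrix(rbinom(n_loci * n_samples, size = Cov, prob = 0.7),
            ncol = n_samples)
wgbs <- BSseq(chr = rep("chr1", n_loci), pos = pos, M = M, Cov = Cov,
              sampleNames = paste0("s", seq_len(n_samples)))
ranges <- GRanges("chr1",
                  IRanges(start = seq(1, 990001, by = 10000), width = 5000))

# Keep a small number of ranges and only the loci that overlap them.
small_ranges <- ranges[seq_len(min(50L, length(ranges)))]
small_wgbs <- subsetByOverlaps(wgbs, small_ranges)

# Time the same call as in the original post on the reduced objects.
timing <- system.time(
  meth <- getMeth(small_wgbs, small_ranges, type = "raw", what = "perRegion")
)
print(timing)

# Save both objects together so they can be attached to an issue.
out_file <- file.path(tempdir(), "getMeth_reprex.rds")
saveRDS(list(wgbs = small_wgbs, ranges = small_ranges), out_file)
```

If the reduced objects no longer show the slowdown, increasing the number of ranges kept until the difference between releases is visible should be enough. Note that an object backed by an on-disk file may not be portable via `saveRDS()` alone.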