rtracklayer, ucscTableQuery -> wgEncodeSydhTfbs, stalling, WHY?
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Does anyone have a clue why the following code stalls?

# Just testing functionality of the package
> session <- browserSession("UCSC")
> query <- ucscTableQuery(session, "wgEncodeSydhTfbs", GRangesForUCSCGenome("hg19", "chr1", IRanges(100, 101)))
> query
Get track 'wgEncodeSydhTfbs' within hg19:chr1:100-101
> getTable(query)

44 min and still running. This can't be right? Does anyone have a workaround?

-- output of sessionInfo():

R version 2.15.0 (2012-03-30)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rtracklayer_1.16.3 GenomicRanges_1.8.13 IRanges_1.14.4
[4] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] Biostrings_2.24.1 bitops_1.0-4.1 BSgenome_1.24.0 RCurl_1.91-1
[5] Rsamtools_1.8.6 stats4_2.15.0 tools_2.15.0 XML_3.9-4
[9] zlibbioc_1.2.0

-- Sent via the guest posting facility at bioconductor.org.
On Wed, Sep 19, 2012 at 8:54 AM, Fenton Christopher Graham <christopher.fenton@uit.no> wrote:
> Does anyone have a clue why the following code stalls.
>
> >session <- browserSession("UCSC")
> >query <- ucscTableQuery(session, "wgEncodeSydhTfbs",
> GRangesForUCSCGenome("hg19","chr1", IRanges(100, 101)))
> getTable(query)
>
> 44 min and still running.
> This can't be right ?
>
> Anyone have a work around?

This is an interesting little debugging problem. If you run with something like options(error=recover) and interrupt after a little while, you'll see a complicated traceback and you can poke around in the stack.

Basically, the query you have issued involves 781 tables. A checking operation is applied to each of them. If you really want this computation, you may have to wait.

A little more detail:

Browse[3]> object
Get track 'wgEncodeSydhTfbs' within hg19:chr1:100-101
Browse[3]> tableName(object)
NULL

This sends us into

4: tableNames(object)
5: tableNames(object)
6: .local(object, ...)
7: sapply(tables, checkOutput)
8: sapply(tables, checkOutput)
9: lapply(X = X, FUN = FUN, ...)
10: FUN(c("wgEncodeSydhTfbsGm12878Bhlhe40cIggmusPk", "wgEncodeSydhTfbsGm12878Bh

with a vector of 781 tables to be checked. Perhaps you can focus your query, or retrieve the specific data you want for local import to R.
Thanks to Fenton for the report and Vince for looking into this.

The problem, as Vince indicated, is that there are a huge number of tables for this track. You'll want to pass the table argument to select the one you actually want. By default, rtracklayer chooses the first table. There were some inefficiencies that caused the query to hang, and I've fixed those in devel. I've also made BED parsing more robust to empty inputs, which is what we get with the query range below.

Michael

On Wed, Sep 19, 2012 at 7:44 AM, Vincent Carey <stvjc@channing.harvard.edu> wrote:
> On Wed, Sep 19, 2012 at 8:54 AM, Fenton Christopher Graham <christopher.fenton@uit.no> wrote:
> > >query <- ucscTableQuery(session, "wgEncodeSydhTfbs",
> > GRangesForUCSCGenome("hg19","chr1", IRanges(100, 101)))
> > getTable(query)
> >
> > 44 min and still running.
>
> Basically, the query you have issued involves 781 tables. A checking
> operation is applied to each of them. If you really want this computation,
> you may have to wait.
>
> perhaps you can focus your query, or retrieve the specific data you want
> for local import to R
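
For anyone landing here later, the advice in this thread to pass the table argument might look roughly like the sketch below. The helper name fetch_one_table and its parameters are made up for illustration and are not part of rtracklayer; the table name is simply the first one visible in the traceback above. The call shape follows the rtracklayer 1.16-era usage shown in the question, so argument names may differ in newer releases, and on 1.16.3 an empty result for this tiny range may still trip the BED parser until the devel fix is released.

```r
# Hypothetical helper (not part of rtracklayer): query ONE named table of a
# UCSC track rather than leaving the table unset, which in this thread led to
# all 781 tables of the track being checked.
# Assumes the rtracklayer and IRanges packages are installed and that UCSC is
# reachable when the function is actually called.
fetch_one_table <- function(track, table, genome, chrom, start, end) {
  session <- rtracklayer::browserSession("UCSC")
  range <- rtracklayer::GRangesForUCSCGenome(genome, chrom,
                                             IRanges::IRanges(start, end))
  # Same positional arguments as in the original question, plus 'table'.
  query <- rtracklayer::ucscTableQuery(session, track, range, table = table)
  rtracklayer::getTable(query)
}

# Example call (contacts UCSC, so it is left commented out here):
# fetch_one_table("wgEncodeSydhTfbs",
#                 "wgEncodeSydhTfbsGm12878Bhlhe40cIggmusPk",
#                 "hg19", "chr1", 100, 101)
```

Wrapping the steps in a function is only a convenience; the same three calls can be typed at the prompt with table = "..." added to the ucscTableQuery() call from the question.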