rtracklayer::import()-ing .bed.gz files from a URL
Peter Hickey ▴ 470
@petehaitch
Walter and Eliza Hall Institute of Medi…

Should I be able to import a gzipped BED file from a URL? I'm unsure if this is user error or a bug in rtracklayer::import(). Any advice is appreciated:

> suppressPackageStartupMessages(library(rtracklayer))
# Directly import()-ing fails
> import(BEDFile("http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeMapability/wgEncodeDacMapabilityConsensusExcludable.bed.gz"))
Error in pushBack(line, con) :
  can only push back on text-mode connections

# Downloading and then importing works
> a <- tempdir()
> download.file("http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeMapability/wgEncodeDacMapabilityConsensusExcludable.bed.gz", destfile = file.path(a, "wgEncodeDacMapabilityConsensusExcludable.bed.gz"))
trying URL 'http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeMapability/wgEncodeDacMapabilityConsensusExcludable.bed.gz'
Content type 'application/x-gzip' length 4731 bytes
==================================================
downloaded 4731 bytes

> import(file.path(a, "wgEncodeDacMapabilityConsensusExcludable.bed.gz"))
GRanges object with 411 ranges and 2 metadata columns:
        seqnames               ranges strand |                    name
           <Rle>            <IRanges>  <Rle> |             <character>
    [1]     chr1   [ 564450,  570371]      * | High_Mappability_island
    [2]     chr1   [ 724137,  727043]      * |        Satellite_repeat
    [3]     chr1   [ 825007,  825115]      * |                BSR/Beta
    [4]     chr1   [2583335, 2634374]      * |  Low_mappability_island
    [5]     chr1   [4363065, 4363242]      * |                (CATTC)n
    ...      ...                  ...    ... .                     ...
  [407]     chrY [28555027, 28555353]      * |                    TAR1
  [408]     chrY [28784130, 28819695]      * |        Satellite_repeat
  [409]     chrY [58819368, 58917648]      * |                (CATTC)n
  [410]     chrY [58971914, 58997782]      * |                (CATTC)n
  [411]     chrY [59361268, 59362785]      * |                    TAR1
            score
        <numeric>
    [1]      1000
    [2]      1000
    [3]      1000
    [4]      1000
    [5]      1000
    ...       ...
  [407]      1000
  [408]      1000
  [409]      1000
  [410]      1000
  [411]      1000
  -------
  seqinfo: 25 sequences from an unspecified genome; no seqlengths

> sessionInfo()
R Under development (unstable) (2016-03-11 r70310)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.3 (El Capitan)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] rtracklayer_1.31.7    GenomicRanges_1.23.24 GenomeInfoDb_1.7.6
[4] IRanges_2.5.40        S4Vectors_0.9.43      BiocGenerics_0.17.3
[7] repete_0.0.0.9002     devtools_1.10.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3                 XVector_0.11.7
 [3] magrittr_1.5                zlibbioc_1.17.1
 [5] GenomicAlignments_1.7.20    BiocParallel_1.5.20
 [7] stringr_1.0.0               tools_3.3.0
 [9] SummarizedExperiment_1.1.22 Biobase_2.31.3
[11] digest_0.6.9                pryr_0.1.2
[13] bitops_1.0-6                codetools_0.2-14
[15] RCurl_1.95-4.8              memoise_1.0.0
[17] stringi_1.0-1               Biostrings_2.39.12
[19] Rsamtools_1.23.5            XML_3.98-1.4
rtracklayer import bed • 1.3k views
@michael-lawrence-3846
United States

It's true that pushBack() will not work with gzcon(). That's because a gzcon() connection is always binary. I'm not sure why it does not have an option for text mode. I will raise this issue with R core.
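
The limitation can be reproduced without rtracklayer or a network connection. The following is a minimal base R sketch, assuming a small two-line BED-like file written to a temporary path (the file contents and variable names are illustrative, not taken from the original post). It contrasts a gzcon() connection, which is binary, with gzfile() opened in text mode.

```r
# Write a small gzipped text file to a temporary location (illustrative data).
path <- tempfile(fileext = ".bed.gz")
out <- gzfile(path, "wt")
writeLines(c("chr1\t10\t20", "chr1\t30\t40"), out)
close(out)

# gzcon() wraps a binary connection and is itself binary by default,
# so pushBack() is expected to signal an error here.
bin_con <- gzcon(file(path, "rb"))
bin_err <- tryCatch({
  pushBack("chr1\t0\t5", bin_con)
  NULL
}, error = function(e) conditionMessage(e))
close(bin_con)

# gzfile() opened in text mode ("rt") accepts pushBack(); the pushed-back
# line is returned first by the next read.
txt_con <- gzfile(path, "rt")
pushBack("chr1\t0\t5", txt_con)
txt_lines <- readLines(txt_con)
close(txt_con)
```

If this behaves as described, bin_err holds an error message and txt_lines begins with the pushed-back line followed by the two lines from the file. This may be why importing the downloaded local file in the question succeeds while the URL-based import does not.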

Thanks, Michael.

I hacked R to make this work. Hopefully the change will get into R 3.3, in which case this should work after the April/May Bioconductor release.
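
For reference, the gzcon() documentation in later R releases lists a `text` argument that marks the connection as text-oriented so that pushBack() is accepted. Whether that is exactly the change mentioned here is an assumption. The sketch below only exercises the argument on a local temporary file with illustrative contents, and it needs an R version whose gzcon() has `text`.

```r
# Write a small gzipped text file to a temporary location (illustrative data).
path <- tempfile(fileext = ".bed.gz")
out <- gzfile(path, "wt")
writeLines(c("chr1\t10\t20", "chr1\t30\t40"), out)
close(out)

# Assumption: this R version's gzcon() accepts `text = TRUE`.
# The underlying connection is still opened in binary mode ("rb").
con <- gzcon(file(path, "rb"), text = TRUE)
pushBack("chr1\t0\t5", con)   # accepted because the connection is text-oriented
lines_read <- readLines(con)  # pushed-back line first, then the file contents
close(con)
```

On an older R without the `text` argument, the gzcon() call itself fails with an unused-argument error, in which case downloading the file first, as in the question, remains the fallback.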

Cheers, will give it a go then
