Hi everybody.
I have spotted some behaviour in the rtracklayer package that looks
strange (at least from a newbie's point of view). Here is a small
example that reproduces it:
library(rtracklayer)
s <- browserSession()
genome(s) <- 'hg19'
track <- 'wgEncodeBroadHistone'
table.name <- 'wgEncodeBroadHistoneGm12878CtcfStdPk'
q <- ucscTableQuery(s, track=track, table=table.name)
ex1 <- getTable(q)
ex2 <- track(q)
ex3 <- track(q, asRangedData=FALSE)
Then, I show the contents for the first element of the three result
datasets (data.frame, RangedData and GRanges, respectively):
> ex1[1,]
  bin chrom chromStart  chromEnd name score strand signalValue pValue qValue
1   3  chr1  150941733 151007265    .   297      .     2.98199     13     -1
> ex2[1,]
UCSC track 'wgEncodeBroadHistoneGm12878CtcfStdPk'
UCSCData with 1 row and 3 value columns across 93 spaces
     space                 ranges |        name     score   strand
  <factor>              <IRanges> | <character> <numeric> <factor>
1     chr1 [150941734, 151007265] |          NA       297        *
> ex3[1]
GRanges with 1 range and 2 metadata columns:
      seqnames                 ranges strand |        name     score
         <Rle>              <IRanges>  <Rle> | <character> <numeric>
  [1]     chr1 [150941734, 151007265]      * |        <NA>       297
  ---
  seqlengths:
             chr1           chr2 ... chrUn_gl000249
        249250621      243199373 ...          38502
I have noticed that the start position of each range is one base
higher in the ranges-based objects (RangedData and GRanges) than in the
original table, while the end position is unchanged. I don't know
whether this is an error inside the track() function or something I am
missing. The offset occurs for every element, not only for the first
one:
> all(start(ex3) == ex1$chromStart + 1)
[1] TRUE
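One possible explanation: UCSC tables store starts as 0-based, half-open intervals, while IRanges/GRanges use 1-based, closed intervals. I have not confirmed that track() performs this conversion, so treat it as an assumption. Below is a minimal base-R sketch using only the first row shown above; `to_one_based` is a hypothetical helper of mine, not an rtracklayer function.

```r
# Values copied from the first row of ex1 (the UCSC table).
ucsc <- data.frame(chrom = "chr1",
                   chromStart = 150941733,
                   chromEnd = 151007265,
                   stringsAsFactors = FALSE)

# Hypothetical helper (not part of rtracklayer): convert a 0-based,
# half-open interval [chromStart, chromEnd) into a 1-based, closed
# interval [start, end]. Only the start shifts; the end stays the same.
to_one_based <- function(df) {
  data.frame(chrom = df$chrom,
             start = df$chromStart + 1,
             end = df$chromEnd,
             stringsAsFactors = FALSE)
}

converted <- to_one_based(ucsc)

# The feature length is the same under both conventions.
width_ucsc <- ucsc$chromEnd - ucsc$chromStart
width_one_based <- converted$end - converted$start + 1
stopifnot(width_ucsc == width_one_based)
```

If this is what happens, the converted start and end should match the range printed for ex3[1], and the end coordinates should be identical between ex1 and ex3.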
My sessionInfo:
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtracklayer_1.18.2 GenomicRanges_1.10.5 IRanges_1.16.4
[4] BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] Biostrings_2.26.2 bitops_1.0-5
[3] BSgenome_1.26.1 BSgenome.Hsapiens.UCSC.hg19_1.3.19
[5] parallel_2.15.2 RCurl_1.95-3
[7] Rsamtools_1.10.2 stats4_2.15.2
[9] tcltk_2.15.2 tools_2.15.2
[11] XML_3.95-0.1 zlibbioc_1.4.0
Any hint will be much appreciated. It's not a big problem, but quite
interesting.
Regards,
Gus