[BigWigFile] import as RLE vs default and length comparaison
Entering edit mode
Lenny186 • 0
Last seen 6 weeks ago


I'm trying to understand why there's a length difference in the the total sum of width between these two code :

    ah= AnnotationHub()
    H3K27me3_r <- query(ah, c("H3K27me3","E003","fc.signal"))[["AH32033"]]
    gr2 <- GRanges(seqnames ="chr2", ranges=IRanges(start=start(Hsapiens$chr2), end=end(Hsapiens$chr2)))

    #first measure
    H3K27me3_r_gr_byrle <- import(H3K27me3_r, which = gr2, as = "Rle")

    #Second measure
    H3K27me3_r_gr <- import(H3K27me3_r, which = gr2)

From my understanding : the first exemple; importing as RLE must generate a list of value vs length and will aggregate repeated values in the length tags. Thus, sum(length) must reflect the total of the score occurence (in relation with the IRanges).

The second example is not importing as RLE, thus, i will have a Grange with the IRanges and the respectives scores. In conclusion, I suppose that both measure must be the same, but it's not the case :

1st example ; sum(runLength(H3K27me3_r_gr_byrle$chr2)) = 243199373
2nd example : sum(width(H3K27me3_r_gr))= 243185735

Why this slight difference ?

Here's my sessionInfo()

R version 4.1.0 (2021-05-18) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale: [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C

attached base packages: [1] stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 BSgenome.Hsapiens.UCSC.hg19_1.4.3 GenomicFeatures_1.46.5
AnnotationDbi_1.56.2 [5] Biobase_2.54.0
Rsamtools_2.10.0 BSgenome_1.62.0
rtracklayer_1.54.0 [9] GenomicRanges_1.46.1
Biostrings_2.62.0 GenomeInfoDb_1.30.1
XVector_0.34.0 [13] IRanges_2.28.0
S4Vectors_0.32.4 AnnotationHub_3.2.2
BiocFileCache_2.2.1 [17] dbplyr_2.2.1

loaded via a namespace (and not attached): [1] MatrixGenerics_1.6.0
httr_1.4.3 bit64_4.0.5
shiny_1.7.2 assertthat_0.2.1
interactiveDisplayBase_1.32.0 [7] BiocManager_1.30.18
blob_1.2.3 GenomeInfoDbData_1.2.7 yaml_2.3.5 progress_1.2.2 BiocVersion_3.14.0 [13] pillar_1.8.0 RSQLite_2.2.15
lattice_0.20-45 glue_1.6.2
digest_0.6.29 promises_1.2.0.1 [19] htmltools_0.5.3 httpuv_1.6.5
Matrix_1.4-1 XML_3.99-0.10
pkgconfig_2.0.3 biomaRt_2.50.3 [25] zlibbioc_1.40.0 purrr_0.3.4
xtable_1.8-4 later_1.3.0
BiocParallel_1.28.3 tibble_3.1.8 [31] KEGGREST_1.34.0 generics_0.1.3
ellipsis_0.3.2 cachem_1.0.6
SummarizedExperiment_1.24.0 cli_3.3.0 [37] magrittr_2.0.3 crayon_1.5.1 mime_0.12 memoise_2.0.1 fansi_1.0.3 xml2_1.3.3 [43] tools_4.1.0 prettyunits_1.1.1
hms_1.1.1 BiocIO_1.4.0
lifecycle_1.0.1 matrixStats_0.62.0 [49] stringr_1.4.0 DelayedArray_0.20.0
compiler_4.1.0 rlang_1.0.4 grid_4.1.0 RCurl_1.98-1.8 [55] rjson_0.2.21
rappdirs_0.3.3 bitops_1.0-7
restfulr_0.0.15 DBI_1.1.3 curl_4.3.2 [61] R6_2.5.1 GenomicAlignments_1.30.0
dplyr_1.0.9 fastmap_1.1.0 bit_4.0.4 utf8_1.2.2 [67] filelock_1.0.2
stringi_1.7.6 parallel_4.1.0 Rcpp_1.0.9 vctrs_0.4.1 png_0.1-7 [73] tidyselect_1.1.2

Thanks a lot. Lenny

BSgenome.Hsapiens.UCSC.hg19 rtracklayer AnnotationHub • 180 views
Entering edit mode
Last seen 3 hours ago
United States

The Rle will include everything in chr2, whereas the GRanges will not.

> all.equal(width(H3K27me3_r_gr),runLength( H3K27me3_r_gr_byrle$chr2)) 
[1] "Numeric: lengths (7618643, 7618644) differ"
> all.equal(width(H3K27me3_r_gr),runLength( H3K27me3_r_gr_byrle$chr2)[-7618644]) 
[1] TRUE
> tail(runLength( H3K27me3_r_gr_byrle$chr2))
[1]   164   114   669    27    74 13638
> tail(width(H3K27me3_r_gr))
[1] 114 164 114 669  27  74
> end(gr2) - tail(end(H3K27me3_r_gr), 1)
[1] 13638
Entering edit mode

Oh ! these are the 13638 that appears 0 times ! thanks a lot it helps me to understand better RLE :)


Login before adding your answer.

Traffic: 534 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6