coverage() output SimpleRleList cannot be converted to GRanges
1
1
Entering edit mode
@xiaotongyao23-14986
Last seen 3.3 years ago

Converting the SimpleRleList object to GRanges now does not work. Please advise the quickest fix. It probably has something to do with the seqlengths of the objects and the incompatibility between the new IRanges and GenomicRanges packages.

> > segs
GRanges object with 1794 ranges and 7 metadata columns:
seqnames                 ranges strand |        cn out.degree
<Rle>              <IRanges>  <Rle> | <numeric>  <numeric>
[1]        1   [       1,  6525379]      + |         2          1
[2]        1   [ 8734021,  9010795]      + |         2          1
[3]        1   [20098398, 23389898]      + |         1          1
[4]        1   [48284513, 51848768]      + |         0          0
[5]        1   [53472006, 58599187]      + |         1          1
...      ...                    ...    ... .       ...        ...
[1790]        X [115632002, 118016061]      - |         1          1
[1791]        X [ 75464885, 100132592]      + |         1          1
[1792]        X [118016062, 152851944]      - |         2          2
[1793]        X [ 57391044, 118016061]      - |         1          1
[1794]        X [ 57388373, 100132592]      + |         1          1
in.degree    degree     loose   tile.id  terminal
<numeric> <numeric> <logical> <numeric> <logical>
[1]         0         1         0         1         1
[2]         1         2         0       468         0
[3]         1         2         0       543         0
[4]         0         0         0       554         1
[5]         1         2         0       565         0
...       ...       ...       ...       ...       ...
[1790]         1         2         0       406         0
[1791]         1         2         0       405         0
[1792]         2         4         0       403         0
[1793]         1         2         0       408         0
[1794]         1         2         0       407         0
-------
seqinfo: 85 sequences from an unspecified genome
> cov = coverage(segs, weight="cn")
> cov
RleList of length 85
$1 numeric-Rle of length 249250621 with 85 runs Lengths: 6525379 43913 1389 107 ... 4478240 1542 2487237 Values : 8 4 6 8 ... 12 10 12$2
numeric-Rle of length 243199373 with 60 runs
Lengths: 24802397   115824 13134558  2411221 ...  8163708     4837 54777767
Values :        6        8        6        8 ...        6        4        6

$3 numeric-Rle of length 198022430 with 42 runs Lengths: 9349592 97052 271629 406624 ... 1285769 53 2771101 Values : 6 10 6 12 ... 12 10 12$4
numeric-Rle of length 191154276 with 41 runs
Lengths:  7327180     9290  6165343      459 ...   144997        1   469875
Values :        6        0        6        4 ...        6        8        4

$5 numeric-Rle of length 180915260 with 27 runs Lengths: 11509549 12258500 1738 6693852 ... 54399 1 484858 Values : 14 12 10 12 ... 4 16 10 ... <80 more elements> > as(cov, "GRanges") Error in .normargSeqlengths(value, seqnames(x)) : length of supplied 'seqlengths' must equal the number of sequences Enter a frame number, or 0 to exit 1: as(cov, "GRanges") 2: asMethod(object) 3: seqlengths<-(*tmp*, value = c(249250621, 243199373, 198022430, 19115427 4: seqlengths<-(*tmp*, value = c(249250621, 243199373, 198022430, 19115427 5: seqlengths<-(*tmp*, value = c(249250621, 243199373, 198022430, 19115427 6: seqlengths<-(*tmp*, value = c(249250621, 243199373, 198022430, 19115427 7: .normargSeqlengths(value, seqnames(x)) Selection: 0 > class(cov) [1] "SimpleRleList" attr(,"package") [1] "IRanges" > seqlengths(segs) 1 2 3 4 5 6 7 249250621 243199373 198022430 191154276 180915260 171115067 159138663 X 8 9 10 11 12 13 155270560 146364022 141213431 135534747 135006516 133851895 115169878 14 15 16 17 18 20 Y 107349540 102531392 90354753 81195210 78077248 63025520 59373566 19 22 21 M GL000195.1 GL000220.1 GL000192.1 59128983 51304566 48129895 16571 NA NA NA GL000225.1 GL000194.1 GL000193.1 GL000200.1 GL000222.1 GL000212.1 GL000223.1 NA NA NA NA NA NA NA GL000224.1 GL000219.1 GL000205.1 GL000215.1 GL000216.1 GL000217.1 GL000199.1 NA NA NA NA NA NA NA GL000211.1 GL000213.1 GL000218.1 GL000209.1 GL000221.1 GL000214.1 GL000228.1 NA NA NA NA NA NA NA GL000227.1 GL000191.1 GL000208.1 GL000198.1 GL000204.1 GL000233.1 GL000237.1 NA NA NA NA NA NA NA GL000230.1 GL000242.1 GL000243.1 GL000241.1 GL000236.1 GL000240.1 GL000206.1 NA NA NA NA NA NA NA GL000232.1 GL000234.1 GL000202.1 GL000238.1 GL000244.1 GL000248.1 GL000196.1 NA NA NA NA NA NA NA GL000249.1 GL000246.1 GL000203.1 GL000197.1 GL000245.1 GL000247.1 GL000201.1 NA NA NA NA NA NA NA GL000235.1 GL000239.1 GL000210.1 GL000231.1 GL000229.1 MT GL000226.1 NA NA NA NA NA NA NA GL000207.1 NA > seqlengths(cov) 1 2 3 4 5 6 7 NA NA NA NA NA NA NA X 8 9 10 11 12 13 NA NA NA NA NA NA NA 14 15 16 17 18 20 Y NA NA NA NA NA NA NA 19 22 21 M GL000195.1 GL000220.1 GL000192.1 NA NA NA NA NA NA NA GL000225.1 GL000194.1 GL000193.1 GL000200.1 GL000222.1 GL000212.1 GL000223.1 NA NA NA NA NA NA NA GL000224.1 GL000219.1 GL000205.1 GL000215.1 GL000216.1 GL000217.1 GL000199.1 NA NA NA NA NA NA NA GL000211.1 GL000213.1 GL000218.1 GL000209.1 GL000221.1 GL000214.1 GL000228.1 NA NA NA NA NA NA NA GL000227.1 GL000191.1 GL000208.1 GL000198.1 GL000204.1 GL000233.1 GL000237.1 NA NA NA NA NA NA NA GL000230.1 GL000242.1 GL000243.1 GL000241.1 GL000236.1 GL000240.1 GL000206.1 NA NA NA NA NA NA NA GL000232.1 GL000234.1 GL000202.1 GL000238.1 GL000244.1 GL000248.1 GL000196.1 NA NA NA NA NA NA NA GL000249.1 GL000246.1 GL000203.1 GL000197.1 GL000245.1 GL000247.1 GL000201.1 NA NA NA NA NA NA NA GL000235.1 GL000239.1 GL000210.1 GL000231.1 GL000229.1 MT GL000226.1 NA NA NA NA NA NA NA GL000207.1 NA > sessionInfo() R version 3.4.1 (2017-06-30) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS: /nfs/sw/R/R-3.4.1/lib64/R/lib/libRblas.so LAPACK: /nfs/sw/R/R-3.4.1/lib64/R/lib/libRlapack.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] gGnome_0.1 RColorBrewer_1.1-2 rtracklayer_1.38.3 [4] gUtils_0.2.0 data.table_1.10.4-3 GenomicRanges_1.30.1 [7] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0 [10] BiocGenerics_0.24.0 testthat_2.0.0 devtools_1.13.4 loaded via a namespace (and not attached): [1] Rcpp_0.12.15 lattice_0.20-35 [3] Rsamtools_1.30.0 Biostrings_2.46.0 [5] rprojroot_1.3-2 assertthat_0.2.0 [7] digest_0.6.15 slam_0.1-42 [9] R6_2.2.2 backports_1.1.2 [11] RSQLite_2.0 BiocInstaller_1.28.0 [13] httr_1.3.1 pillar_1.1.0 [15] zlibbioc_1.24.0 rlang_0.1.6 [17] GenomicFeatures_1.28.5 curl_3.1 [19] blob_1.1.0 Matrix_1.2-12 [21] desc_1.1.1 BiocParallel_1.12.0 [23] stringr_1.2.0 gTrack_0.1.0 [25] igraph_1.1.2 RCurl_1.95-4.8 [27] bit_1.1-12 biomaRt_2.32.1 [29] DelayedArray_0.4.1 compiler_3.4.1 [31] pkgconfig_2.0.1 Rcplex_0.3-3 [33] SummarizedExperiment_1.8.1 tibble_1.4.2 [35] GenomeInfoDbData_1.0.0 roxygen2_6.0.1 [37] matrixStats_0.53.1 XML_3.98-1.9 [39] crayon_1.3.4 withr_2.1.1 [41] GenomicAlignments_1.14.1 bitops_1.0-6 [43] commonmark_1.4 grid_3.4.1 [45] jsonlite_1.5 DBI_0.7 [47] git2r_0.21.0 magrittr_1.5 [49] cli_1.0.0 stringi_1.1.6 [51] XVector_0.18.0 xml2_1.1.1 [53] tools_3.4.1 bit64_0.9-7 [55] BSgenome_1.44.2 Biobase_2.38.0 [57] AnnotationDbi_1.40.0 memoise_1.1.0 [59] VariantAnnotation_1.24.5  bug genomicranges granges rlelist coverage • 1.2k views ADD COMMENT 2 Entering edit mode @herve-pages-1542 Last seen 4 hours ago Seattle, WA, United States Hi Xiaotong, Thanks for reporting this. Not sure what "incompatibility between the new IRanges and GenomicRanges packages" you are talking about. To me, it looks like this happens when some list elements in the RleList object representing the coverage have length 0: library(GenomicRanges) cov0 <- RleList(chr1=Rle(11:15), chr2=Rle(integer())) cov0 # RleList of length 2 #$chr1
# integer-Rle of length 5 with 5 runs
#   Lengths:  1  1  1  1  1
#   Values : 11 12 13 14 15
#
# \$chr2
# integer-Rle of length 0 with 0 runs
#   Lengths:
#   Values :

as(cov0, "GRanges")
# Error in .normargSeqlengths(value, seqnames(x)) :
#   length of supplied 'seqlengths' must equal the number of sequences


Getting an RleList object with zero-length list elements in turn happens when some underlying sequences in the original GRanges object don't receive any coverage AND have their length set to NA (or 0, but that would be very unusual).

This should be fixed in GenomicRanges 1.30.2 (https://github.com/Bioconductor/GenomicRanges/commit/7fc25cb0a2cbde158f2f5c5e8a6aca005aedf772), which should become available via biocLite() in the next 48 hours.

Cheers,

H.

0
Entering edit mode

Awesome, thanks!

The "incompatibility between the new IRanges and GenomicRanges packages" was refering to the recent changes in both packages like the default maxgap=-1 argument in findOverlaps and the suggested switch from ranges to overlapRanges method on Hits objects. I later read through the NEWS for 1.30.1 and see you guys have already resolved it. So no problem there since then.