The metadata from rowRanges is missing in the MAP function of GenomicFiles.
Qiang ▴ 80
@qiang-9580

Hi all,

I want to pass the metadata column ref from rowRanges to the MAP function when using reduceByFile from GenomicFiles.

library(GenomicFiles)

fls <- tempfile()  # placeholder path; the MAP below never reads the file
gr <- GRanges("chr1", IRanges(c(1, 10), width = 10))
gr$ref <- c("a", "b")
gf <- GenomicFiles(files=fls, rowRanges=gr)

MAP <- function(range, file, ...){
    range
}

reduceByFile(gf, MAP = MAP)
## the same output
## reduceByFile(gr, fls, MAP = MAP)

Here is the output:

 [[1]]
 [[1]][[1]]
 GRanges object with 1 range and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
   [1]     chr1      1-10      *
   -------
   seqinfo: 1 sequence from an unspecified genome; no seqlengths

 [[1]][[2]]
 GRanges object with 1 range and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
   [1]     chr1     10-19      *
   -------
   seqinfo: 1 sequence from an unspecified genome; no seqlengths

The metadata column ref is missing from the range that is passed to the MAP function. How can I pass each range's metadata to the MAP function?

Thanks, Qiang

> sessionInfo()
 R version 3.6.0 (2019-04-26)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: CentOS release 6.4 (Final)

 Matrix products: default
 BLAS:   /home/qhu/usr/R-3.6/lib64/R/lib/libRblas.so
 LAPACK: /home/qhu/usr/R-3.6/lib64/R/lib/libRlapack.so

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats4    parallel  stats     graphics  grDevices utils     datasets
 [8] methods   base

 other attached packages:
  [1] GenomicFiles_1.19.0         rtracklayer_1.43.3
  [3] Rsamtools_1.99.6            Biostrings_2.51.5
  [5] XVector_0.23.2              SummarizedExperiment_1.13.0
  [7] DelayedArray_0.9.9          BiocParallel_1.17.19
  [9] matrixStats_0.54.0          Biobase_2.43.1
 [11] GenomicRanges_1.35.1        GenomeInfoDb_1.19.3
 [13] IRanges_2.17.5              S4Vectors_0.21.23
 [15] BiocGenerics_0.29.2

 loaded via a namespace (and not attached):
  [1] Rcpp_1.0.1                compiler_3.6.0
  [3] prettyunits_1.0.2         GenomicFeatures_1.35.11
  [5] bitops_1.0-6              tools_3.6.0
  [7] zlibbioc_1.29.0           progress_1.2.0
  [9] biomaRt_2.39.3            digest_0.6.18
 [11] bit_1.1-14                BSgenome_1.51.0
 [13] RSQLite_2.1.1             memoise_1.1.0
 [15] lattice_0.20-38           pkgconfig_2.0.2
 [17] rlang_0.3.4               Matrix_1.2-17
 [19] DBI_1.0.0                 GenomeInfoDbData_1.2.1
 [21] httr_1.4.0                stringr_1.4.0
 [23] hms_0.4.2                 bit64_0.9-7
 [25] grid_3.6.0                R6_2.4.0
 [27] AnnotationDbi_1.45.1      XML_3.98-1.19
 [29] magrittr_1.5              blob_1.1.1
 [31] GenomicAlignments_1.19.1  assertthat_0.2.1
 [33] stringi_1.4.3             RCurl_1.95-4.12
 [35] VariantAnnotation_1.29.25 crayon_1.3.4

@martin-morgan-1513

Thanks for the nice reproducible example. Internally, the method coerces the GRanges to a GRangesList, but does so in a way that does not preserve metadata. A workaround would be to do the coercion ahead of time, e.g.,

grl <- splitAsList(gr, seq_along(gr))
reduceByFile(grl, fls, MAP)
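
To check that the pre-split list really carries the metadata, here is a minimal sketch that rebuilds gr from the question and inspects each list element directly. It assumes only that GenomicRanges is installed; the MAP shown here is a hypothetical variant that returns the ref value rather than the range, and it is called by hand instead of through reduceByFile.

```r
library(GenomicRanges)

# Same ranges and metadata column as in the question.
gr <- GRanges("chr1", IRanges(c(1, 10), width = 10))
gr$ref <- c("a", "b")

# One list element per range; each element is a length-1 GRanges
# that keeps its own metadata columns.
grl <- splitAsList(gr, seq_along(gr))

# Hypothetical MAP for illustration: return the ref value of the range it receives.
MAP <- function(range, file, ...) {
    range$ref
}

# Call MAP by hand on each element to see what it would receive.
# The file argument is a dummy string because this MAP never reads it.
refs <- vapply(seq_along(grl), function(i) MAP(grl[[i]], "unused"), character(1))
refs
```

If refs contains both values from the ref column, the same grl can be handed to reduceByFile together with the file paths from the question, and MAP should then see the metadata on each range.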

The workaround works. Thanks!
