How to convert a GRangesList object into a GRanges object?
1
4
Entering edit mode
richierocks ▴ 40
@richierocks-9994
Last seen 7.6 years ago

I can go from a GRanges object to a GRangesList object by splitting it. For example:

(gr <- GRanges(
  seqnames = paste0("chr", c(1, 2, 3, 4, 1, 2, 3, 1)),
  ranges  = IRanges(c(1:8), c(16:9)),
  strand = rep.int(c("+", "-"), 4)
))
## GRanges object with 8 ranges and 0 metadata columns:
##       seqnames    ranges strand
##          <Rle> <IRanges>  <Rle>
##   [1]     chr1   [1, 16]      +
##   [2]     chr2   [2, 15]      -
##   [3]     chr3   [3, 14]      +
##   [4]     chr4   [4, 13]      -
##   [5]     chr1   [5, 12]      +
##   [6]     chr2   [6, 11]      -
##   [7]     chr3   [7, 10]      +
##   [8]     chr1   [8,  9]      -
##   -------
##   seqinfo: 4 sequences from an unspecified genome; no seqlengths
(grl <- split(gr, seqnames(gr)))
## GRangesList object of length 4:
## $chr1
## GRanges object with 3 ranges and 0 metadata columns:
##       seqnames    ranges strand
##          <Rle> <IRanges>  <Rle>
##   [1]     chr1   [1, 16]      +
##   [2]     chr1   [5, 12]      +
##   [3]     chr1   [8,  9]      -
##
## $chr2
## GRanges object with 2 ranges and 0 metadata columns:
##       seqnames  ranges strand
##   [1]     chr2 [2, 15]      -
##   [2]     chr2 [6, 11]      -
##
## $chr3
## GRanges object with 2 ranges and 0 metadata columns:
##       seqnames  ranges strand
##   [1]     chr3 [3, 14]      +
##   [2]     chr3 [7, 10]      +
##
## ...
## <1 more element>
## -------
## seqinfo: 4 sequences from an unspecified genome; no seqlengths

(See also Converting single GRanges object as per chromosome GRangeList.)

How do I go concatenate the elements of a GRangesList object into a GRanges object?

I want to do:

do.call(rbind, grl)

But this throws an error

## Error in rbind2(..1) : no method for coercing this S4 class to a vector

Or possibly:

do.call(c, grl)

But this leaves grl unchanged.

genomicranges grangeslist granges s4 • 11k views
ADD COMMENT
0
Entering edit mode

Using do.call and c works, but you have to help R find the correct method method of c to use.

do.call(getMethod(c, "GenomicRanges"), grl)
## GRanges object with 8 ranges and 0 metadata columns:
##       seqnames    ranges strand
##          <Rle> <IRanges>  <Rle>
##   [1]     chr1   [1, 16]      +
##   [2]     chr1   [5, 12]      +
##   [3]     chr1   [8,  9]      -
##   [4]     chr2   [2, 15]      -
##   [5]     chr2   [6, 11]      -
##   [6]     chr3   [3, 14]      +
##   [7]     chr3   [7, 10]      +
##   [8]     chr4   [4, 13]      -
##   -------
##   seqinfo: 4 sequences from an unspecified genome; no seqlengths
ADD REPLY
0
Entering edit mode

I get

Error in h(simpleError(msg, call)) : error in evaluating the argument 'what' in selecting a method for function 'do.call': no method found for function 'c' and signature GenomicRanges

When I run this

ADD REPLY
0
Entering edit mode

That's because there's no c() method defined for GenomicRanges objects (maybe there was one 6 years ago when richierocks posted their comment):

> library(GenomicRanges)
> getMethod(c, "GenomicRanges")
Error in getMethod(c, "GenomicRanges") : 
  no method found for function 'c' and signature GenomicRanges

However there's one defined for Vector objects:

> selectMethod(c, "GenomicRanges")
Method Definition:

function (x, ...) 
{
    .local <- function (x, ..., ignore.mcols = FALSE, recursive = FALSE) 
    {
        if (!identical(recursive, FALSE)) 
            stop(wmsg("\"c\" method for Vector objects ", "does not support the 'recursive' argument"))
        bindROWS(x, list(...), ignore.mcols = ignore.mcols)
    }
    .local(x, ...)
}
<bytecode: 0x56340502f7c0>
<environment: namespace:S4Vectors>

Signatures:
        x              
target  "GenomicRanges"
defined "Vector"

And because the method dispatch mechanism in R uses inheritance to find the appropriate method, there's no need to use getMethod() or selectMethod():

> do.call(c, grl)
$gr1
GRanges object with 1 range and 2 metadata columns:
      seqnames    ranges strand |     score        GC
         <Rle> <IRanges>  <Rle> | <integer> <numeric>
  [1]   Chrom2       3-6      + |         5      0.45
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

$gr2
GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |     score        GC
         <Rle> <IRanges>  <Rle> | <integer> <numeric>
  [1]   Chrom1       7-9      + |         3       0.3
  [2]   Chrom1     13-15      - |         4       0.5
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

$gr3
GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |     score        GC
         <Rle> <IRanges>  <Rle> | <integer> <numeric>
  [1]   Chrom1       1-3      - |         6       0.4
  [2]   Chrom2       4-9      - |         2       0.1
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

You should in fact _never_ use getMethod() or selectMethod() in your code. Just call the generic function (c() in this case) and let the method dispatch mechanism find the right method for you.

This is with GenomicRanges 1.46.1 which is the most current version of GenomicRanges in BioC 3.14.

Anyways, why are you trying to do this when Martin Morgan provided the best way to do this below? Using unlist() is _the_ recommended way and is much more efficient than do.call(c, ...).

H.

ADD REPLY
11
Entering edit mode
@martin-morgan-1513
Last seen 6 weeks ago
United States

Unlist the list

> unlist(grl)
GRanges object with 8 ranges and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
  chr1     chr1   [1, 16]      +
  chr1     chr1   [5, 12]      +
  chr1     chr1   [8,  9]      -
  chr2     chr2   [2, 15]      -
  chr2     chr2   [6, 11]      -
  chr3     chr3   [3, 14]      +
  chr3     chr3   [7, 10]      +
  chr4     chr4   [4, 13]      -
  -------
  seqinfo: 4 sequences from an unspecified genome; no seqlengths

In the S4Vectors world, this unlist operation is very fast, because the 'list' is actually a single instance with a 'partitioning' vector describing how the elements are supposed to be grouped; split() adds the partitioning vector, unlist() removes it. relist() takes the partitioning of the second argument and adds it to the first.

 

ADD COMMENT

Login before adding your answer.

Traffic: 734 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6