How to properly sort GRanges?
3
1
Entering edit mode
enricoferrero ▴ 660
@enricoferrero-6037
Last seen 2.3 years ago
Switzerland

Hi,

I have a few GRanges that I need to sort based on their chromosomes/seqnames, start and end coordinates of the intervals.

I used to be able to do it in this way:

sort(gr, by = ~ seqnames + start + end)

But now I get this error:

Error in get(nm, parent, mode = "function") :
  object 'seqnames' of mode 'function' was not found

Is there anything wrong with my code?

What's the best way to reliably sort multiple GRanges objects in the same way (similar to bedtools sort or bedops sort-bed)?

Thanks!

granges genomicranges sort seqnames order • 16k views
ADD COMMENT
8
Entering edit mode
@danielsilvestre-6769
Last seen 8.6 years ago
Brazil

Firstly, verify that seqlevels are sorted:

gr <- sortSeqlevels(gr)

Then, just sort your GRanges object:

gr <- sort(gr)

Simple as that. You should take some time follow through the GenomicRanges vignettes. There are a lot of quite useful tricks inside

ADD COMMENT
2
Entering edit mode

Yep. By default sort() will sort a GRanges object by seqnames, strand, start, and end. If you want the strand to be ignored, use ignore.strand=TRUE:

gr <- GRanges("chr1", IRanges(c(4, 10), c(18, 15)), strand=c("-", "+"))

sort(gr)
# GRanges object with 2 ranges and 0 metadata columns:
#      seqnames    ranges strand
#         <Rle> <IRanges>  <Rle>
#  [1]     chr1  [10, 15]      +
#  [2]     chr1  [ 4, 18]      -
#  -------
#  seqinfo: 1 sequence from an unspecified genome; no seqlengths

sort(gr, ignore.strand=TRUE)
# GRanges object with 2 ranges and 0 metadata columns:
#      seqnames    ranges strand
#         <Rle> <IRanges>  <Rle>
#  [1]     chr1  [ 4, 18]      -
#  [2]     chr1  [10, 15]      +
#  -------
#  seqinfo: 1 sequence from an unspecified genome; no seqlengths

H.

ADD REPLY
3
Entering edit mode
@james-w-macdonald-5106
Last seen 9 hours ago
United States

As long as your seqlevels are ordered correctly, sort() should do it.

> z <- GRanges(c("chr3","chr4","chr1"), IRanges(c(3,4,5), c(6,7,8)))
> seqlevels(z) <- sort(seqlevels(z))
> z
GRanges object with 3 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr3    [3, 6]      *
  [2]     chr4    [4, 7]      *
  [3]     chr1    [5, 8]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths
> sort(z)
GRanges object with 3 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1    [5, 8]      *
  [2]     chr3    [3, 6]      *
  [3]     chr4    [4, 7]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths

 

ADD COMMENT
3
Entering edit mode
yinghua ▴ 30
@yinghua-7593
Last seen 8.9 years ago
United States
The error message "object 'seqnames' of mode 'function' was not found" is probably due to the broken sort function. See GRanges manual page 17,

 

## TODO: Broken. Please fix!

#sort(gr, by = ~ score)

 

Clearly, sort by mcols was known as broken.

ADD COMMENT

Login before adding your answer.

Traffic: 618 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6