Question: How to properly sort GRanges?
0
4.1 years ago by
enricoferrero570
Switzerland
enricoferrero570 wrote:

Hi,

I have a few GRanges that I need to sort based on their chromosomes/seqnames, start and end coordinates of the intervals.

I used to be able to do it in this way:

sort(gr, by = ~ seqnames + start + end)

But now I get this error:

Error in get(nm, parent, mode = "function") :
  object 'seqnames' of mode 'function' was not found

Is there anything wrong with my code?

What's the best way to reliably sort multiple GRanges objects in the same way (similar to bedtools sort or bedops sort-bed)?

Thanks!

modified 3.7 years ago by yinghua30 • written 4.1 years ago by enricoferrero570
4
4.1 years ago by
Brazil
daniel.silvestre60 wrote:

First, make sure the seqlevels are sorted:

gr <- sortSeqlevels(gr)

Then, just sort your GRanges object:

gr <- sort(gr)

Simple as that. It is worth taking some time to work through the GenomicRanges vignettes; they contain a lot of useful tricks.
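To sort several GRanges objects in the same way, the two steps above can be wrapped in a small helper and applied to a list. This is only a sketch: sort_granges and the example objects are hypothetical names, and ignoring strand is an assumption made here to mimic a position-only sort such as bedtools sort.

```r
library(GenomicRanges)

# Hypothetical helper: put seqlevels in natural order, then sort the ranges.
# ignore.strand = TRUE is an assumption; set it to FALSE to sort by strand too.
sort_granges <- function(gr, ignore.strand = TRUE) {
  gr <- sortSeqlevels(gr)
  sort(gr, ignore.strand = ignore.strand)
}

# Made-up example data
gr_list <- list(
  a = GRanges(c("chr10", "chr2", "chr1"), IRanges(c(1, 5, 3), width = 10)),
  b = GRanges(c("chrX", "chr2", "chr2"), IRanges(c(7, 20, 4), width = 5))
)

# Apply the same sorting rule to every object
sorted <- lapply(gr_list, sort_granges)
```

Because every object goes through the same function, the resulting order should be consistent across the whole list.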

modified 4.1 years ago • written 4.1 years ago by daniel.silvestre60
1

Yep. By default sort() will sort a GRanges object by seqnames, strand, start, and end. If you want the strand to be ignored, use ignore.strand=TRUE:

gr <- GRanges("chr1", IRanges(c(4, 10), c(18, 15)), strand=c("-", "+"))

sort(gr)
# GRanges object with 2 ranges and 0 metadata columns:
#      seqnames    ranges strand
#         <Rle> <IRanges>  <Rle>
#  [1]     chr1  [10, 15]      +
#  [2]     chr1  [ 4, 18]      -
#  -------
#  seqinfo: 1 sequence from an unspecified genome; no seqlengths

sort(gr, ignore.strand=TRUE)
# GRanges object with 2 ranges and 0 metadata columns:
#      seqnames    ranges strand
#         <Rle> <IRanges>  <Rle>
#  [1]     chr1  [ 4, 18]      -
#  [2]     chr1  [10, 15]      +
#  -------
#  seqinfo: 1 sequence from an unspecified genome; no seqlengths

H.

modified 3.7 years ago • written 3.7 years ago by Hervé Pagès ♦♦ 13k
3
4.1 years ago by
United States
James W. MacDonald48k wrote:

As long as your seqlevels are ordered correctly, sort() should do it.

> z <- GRanges(c("chr3","chr4","chr1"), IRanges(c(3,4,5), c(6,7,8)))
> seqlevels(z) <- sort(seqlevels(z))
> z
GRanges object with 3 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr3    [3, 6]      *
  [2]     chr4    [4, 7]      *
  [3]     chr1    [5, 8]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths
> sort(z)
GRanges object with 3 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1    [5, 8]      *
  [2]     chr3    [3, 6]      *
  [3]     chr4    [4, 7]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths
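One caveat with sort(seqlevels(z)): it orders the names as plain strings, which works for the example above but would typically place chr10 before chr2. The sketch below compares it with sortSeqlevels(), which is intended to put chromosome names in their natural order. The chromosome names are made up for illustration.

```r
library(GenomicRanges)

# Made-up seqlevels, deliberately out of order
lv <- c("chr10", "chr2", "chrX", "chr1")

# Plain string sort: compares character by character
alphabetical <- sort(lv)

# Natural chromosome order as defined by sortSeqlevels()
natural <- sortSeqlevels(lv)

print(alphabetical)
print(natural)
```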

 

written 4.1 years ago by James W. MacDonald48k
3
3.7 years ago by
yinghua30
United States
yinghua30 wrote:
The error message "object 'seqnames' of mode 'function' was not found" is probably caused by a broken sort function. See page 17 of the GRanges manual:

## TODO: Broken. Please fix!
#sort(gr, by = ~ score)

Clearly, sorting by metadata columns (mcols) with the by argument was already known to be broken.
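If the formula interface fails, the same ordering can be spelled out with order() on the accessor functions. The following is a sketch under the assumption that strand should be ignored; chrom_rank and the example ranges are illustrative names, not part of the original question.

```r
library(GenomicRanges)

# Made-up example data
gr <- GRanges(c("chr2", "chr10", "chr1", "chr2"),
              IRanges(start = c(5, 1, 7, 5), end = c(9, 4, 8, 6)))

# Put seqlevels in natural order so they can serve as the chromosome ranking
gr <- sortSeqlevels(gr)

# Rank each range by the position of its chromosome in seqlevels(gr)
chrom_rank <- match(as.character(seqnames(gr)), seqlevels(gr))

# Order by chromosome, then start, then end (strand is not considered)
ord <- order(chrom_rank, start(gr), end(gr))
gr_sorted <- gr[ord]
```

Extra keys, such as a metadata column, can be appended as further arguments to order().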

written 3.7 years ago by yinghua30