Question

How to change seqlevels in a GRanges object

0

Entering edit mode

Haiying.Kong ▴ 110

@haiyingkong-9254

Last seen 5.0 years ago

Germany

I need to get rid of "chr" in seqlevels.

I tried many things, including:

renameSeqlevels( gr, gsub("chr", "", seqlevels(gr)))

none of them worked.

GRanges • 9.7k views

ADD COMMENT • link updated 6.8 years ago by Ou, Jianhong ★ 1.3k • written 6.8 years ago by Haiying.Kong ▴ 110

score 1 · Answer 1 · 2017-06-30

1

Entering edit mode

Ou, Jianhong ★ 1.3k

@ou-jianhong-4539

Last seen 3 days ago

United States

Did you tried seqlevelsStyle?

ADD COMMENT • link 6.8 years ago Ou, Jianhong ★ 1.3k

0

Entering edit mode

In other words,

seqlevelsStyle(gr) <- "NCBI"

ADD REPLY • link 6.8 years ago Michael Lawrence ★ 11k

0

Entering edit mode

I tried a second ago, and got:

> anno.database
GRanges object with 58153 ranges and 1 metadata column:
seqnames ranges strand | gene_name
<Rle> <IRanges> <Rle> | <character>
DDX11L1 chr1 [11869, 14412] + | DDX11L1
WASH7P chr1 [14363, 29806] - | WASH7P
MIR1302-10 chr1 [29554, 31109] + | MIR1302-10
FAM138A chr1 [34554, 36081] - | FAM138A
OR4G4P chr1 [52473, 54936] + | OR4G4P
... ... ... ... . ...
CYCSP49 chrY [28695572, 28695890] + | CYCSP49
SLC25A15P1 chrY [28732789, 28737748] - | SLC25A15P1
PARP4P1 chrY [28740998, 28780799] - | PARP4P1
FAM58CP chrY [28772667, 28773306] - | FAM58CP
CTBP2P1 chrY [59001391, 59001635] + | CTBP2P1
-------
seqinfo: 24 sequences from GRCh37 genome
> seqlevelsStyle(anno.database) <- "NCBI"
Error in `levels<-.factor`(`*tmp*`, value = c("1", "2", "3", "4", "5", :
number of levels differs

The same error I got when I tried to give values to seqlevels(anno.database) or seqnames(seqinfo(anno.database))

ADD REPLY • link 6.8 years ago Haiying.Kong ▴ 110

0

Entering edit mode

I found what the problem is.

The original GRangres object is:

anno.database = toGRanges(EnsDb.Hsapiens.v75, feature="gene")

if I run

seqlevelsStyle(anno.database) <- "NCBI"

This works.

But I deleted some rows in the object to keep only Chr1:22, X, Y. If I run seqlevelsStyle after deleting process, it gives that error message in my previous message.

Of course, I can do filtering after changing seqlevels, but curious why the second method should not work? I remember I had problems that stemmed from the same point here. I feel that when I filter rows of GRanges object, I need to make changes on some other information.

Here is how I filter rows for a GRanges object:

idx = which(seqnames(anno.database) %in% c(as.character(1:22),"X","Y"))
anno.database = anno.database[idx, ]
anno.database@seqinfo = seqinfo(anno.database)[c(as.character(1:22),"X","Y"), ]

ADD REPLY • link 6.8 years ago Haiying.Kong ▴ 110

0

Entering edit mode

GRanges are 'vectors' with single-bracket subsetting. Also, it's not obvious where toGRanges() comes from, it is not necessary to use which(), and one should never directly access slots via @. Thus

> gr = genes(EnsDb.Hsapiens.v75)
> gr1 = gr[seqnames(gr) %in% c(1:22, "X", "Y")]

then maybe

> seqlevels(gr1) = as.character(unique(seqnames(gr1)))

or in one line (maybe some consideration needs to be given to pruning.mode for GRangesLists).

keepSeqlevels(gr, c(1:22, "X", "Y"), pruning.mode="coarse")

ADD REPLY • link 6.8 years ago Martin Morgan 25k

0

Entering edit mode

Or even shorter:

keepStandardChromosomes(gr, pruning.mode="coarse")

ADD REPLY • link 6.8 years ago Michael Lawrence ★ 11k