Search
Question: How to change seqlevels in a GRanges object
0
gravatar for Haiying.Kong
4 months ago by
Haiying.Kong90
Germany
Haiying.Kong90 wrote:

I need to get rid of "chr" in seqlevels.

I tried many things, including:

renameSeqlevels( gr, gsub("chr", "", seqlevels(gr)))

none of them worked.

ADD COMMENTlink modified 4 months ago by Ou, Jianhong1000 • written 4 months ago by Haiying.Kong90
0
gravatar for Ou, Jianhong
4 months ago by
Ou, Jianhong1000
United States
Ou, Jianhong1000 wrote:

Did you tried seqlevelsStyle?

ADD COMMENTlink written 4 months ago by Ou, Jianhong1000

In other words,

seqlevelsStyle(gr) <- "NCBI"

 

ADD REPLYlink written 4 months ago by Michael Lawrence9.8k

I tried a second ago, and got:

> anno.database
GRanges object with 58153 ranges and 1 metadata column:
             seqnames               ranges strand |   gene_name
                <Rle>            <IRanges>  <Rle> | <character>
     DDX11L1     chr1       [11869, 14412]      + |     DDX11L1
      WASH7P     chr1       [14363, 29806]      - |      WASH7P
  MIR1302-10     chr1       [29554, 31109]      + |  MIR1302-10
     FAM138A     chr1       [34554, 36081]      - |     FAM138A
      OR4G4P     chr1       [52473, 54936]      + |      OR4G4P
         ...      ...                  ...    ... .         ...
     CYCSP49     chrY [28695572, 28695890]      + |     CYCSP49
  SLC25A15P1     chrY [28732789, 28737748]      - |  SLC25A15P1
     PARP4P1     chrY [28740998, 28780799]      - |     PARP4P1
     FAM58CP     chrY [28772667, 28773306]      - |     FAM58CP
     CTBP2P1     chrY [59001391, 59001635]      + |     CTBP2P1
  -------
  seqinfo: 24 sequences from GRCh37 genome
> seqlevelsStyle(anno.database) <- "NCBI"
Error in `levels<-.factor`(`*tmp*`, value = c("1", "2", "3", "4", "5",  :
  number of levels differs

  The same error I got when I tried to give values to seqlevels(anno.database) or seqnames(seqinfo(anno.database))

 

 

ADD REPLYlink written 4 months ago by Haiying.Kong90

I found what the problem is.

The original GRangres object is:

anno.database = toGRanges(EnsDb.Hsapiens.v75, feature="gene")

if I run 

seqlevelsStyle(anno.database) <- "NCBI"

This works.

But I deleted some rows in the object to keep only Chr1:22, X, Y. If I run seqlevelsStyle after deleting process, it gives that error message in my previous message.

Of course, I can do filtering after changing seqlevels, but curious why the second method should not work? I remember I had problems that stemmed from the same point here. I feel that when I filter rows of GRanges object, I need to make changes on some other information.

Here is how I filter rows for a GRanges object:

idx = which(seqnames(anno.database) %in% c(as.character(1:22),"X","Y"))
anno.database = anno.database[idx, ]
anno.database@seqinfo = seqinfo(anno.database)[c(as.character(1:22),"X","Y"), ]

 

 

ADD REPLYlink modified 4 months ago • written 4 months ago by Haiying.Kong90

GRanges are 'vectors' with single-bracket subsetting. Also, it's not obvious where toGRanges() comes from, it is not necessary to use which(), and one should never directly access slots via @. Thus

> gr = genes(EnsDb.Hsapiens.v75)
> gr1 = gr[seqnames(gr) %in% c(1:22, "X", "Y")]

then maybe

> seqlevels(gr1) = as.character(unique(seqnames(gr1)))

or in one line (maybe some consideration needs to be given to pruning.mode for GRangesLists).

keepSeqlevels(gr, c(1:22, "X", "Y"), pruning.mode="coarse")

 

ADD REPLYlink written 4 months ago by Martin Morgan ♦♦ 20k

Or even shorter:

keepStandardChromosomes(gr, pruning.mode="coarse")
ADD REPLYlink written 4 months ago by Michael Lawrence9.8k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 188 users visited in the last hour