Search
Question: How to change seqlevels in a GRanges object
0
gravatar for Haiying.Kong
12 months ago by
Haiying.Kong100
Germany
Haiying.Kong100 wrote:

I need to get rid of "chr" in seqlevels.

I tried many things, including:

renameSeqlevels( gr, gsub("chr", "", seqlevels(gr)))

none of them worked.

ADD COMMENTlink modified 12 months ago by Ou, Jianhong1.1k • written 12 months ago by Haiying.Kong100
0
gravatar for Ou, Jianhong
12 months ago by
Ou, Jianhong1.1k
United States
Ou, Jianhong1.1k wrote:

Did you tried seqlevelsStyle?

ADD COMMENTlink written 12 months ago by Ou, Jianhong1.1k

In other words,

seqlevelsStyle(gr) <- "NCBI"

 

ADD REPLYlink written 12 months ago by Michael Lawrence10k

I tried a second ago, and got:

> anno.database
GRanges object with 58153 ranges and 1 metadata column:
             seqnames               ranges strand |   gene_name
                <Rle>            <IRanges>  <Rle> | <character>
     DDX11L1     chr1       [11869, 14412]      + |     DDX11L1
      WASH7P     chr1       [14363, 29806]      - |      WASH7P
  MIR1302-10     chr1       [29554, 31109]      + |  MIR1302-10
     FAM138A     chr1       [34554, 36081]      - |     FAM138A
      OR4G4P     chr1       [52473, 54936]      + |      OR4G4P
         ...      ...                  ...    ... .         ...
     CYCSP49     chrY [28695572, 28695890]      + |     CYCSP49
  SLC25A15P1     chrY [28732789, 28737748]      - |  SLC25A15P1
     PARP4P1     chrY [28740998, 28780799]      - |     PARP4P1
     FAM58CP     chrY [28772667, 28773306]      - |     FAM58CP
     CTBP2P1     chrY [59001391, 59001635]      + |     CTBP2P1
  -------
  seqinfo: 24 sequences from GRCh37 genome
> seqlevelsStyle(anno.database) <- "NCBI"
Error in `levels<-.factor`(`*tmp*`, value = c("1", "2", "3", "4", "5",  :
  number of levels differs

  The same error I got when I tried to give values to seqlevels(anno.database) or seqnames(seqinfo(anno.database))

 

 

ADD REPLYlink written 12 months ago by Haiying.Kong100

I found what the problem is.

The original GRangres object is:

anno.database = toGRanges(EnsDb.Hsapiens.v75, feature="gene")

if I run 

seqlevelsStyle(anno.database) <- "NCBI"

This works.

But I deleted some rows in the object to keep only Chr1:22, X, Y. If I run seqlevelsStyle after deleting process, it gives that error message in my previous message.

Of course, I can do filtering after changing seqlevels, but curious why the second method should not work? I remember I had problems that stemmed from the same point here. I feel that when I filter rows of GRanges object, I need to make changes on some other information.

Here is how I filter rows for a GRanges object:

idx = which(seqnames(anno.database) %in% c(as.character(1:22),"X","Y"))
anno.database = anno.database[idx, ]
anno.database@seqinfo = seqinfo(anno.database)[c(as.character(1:22),"X","Y"), ]

 

 

ADD REPLYlink modified 12 months ago • written 12 months ago by Haiying.Kong100

GRanges are 'vectors' with single-bracket subsetting. Also, it's not obvious where toGRanges() comes from, it is not necessary to use which(), and one should never directly access slots via @. Thus

> gr = genes(EnsDb.Hsapiens.v75)
> gr1 = gr[seqnames(gr) %in% c(1:22, "X", "Y")]

then maybe

> seqlevels(gr1) = as.character(unique(seqnames(gr1)))

or in one line (maybe some consideration needs to be given to pruning.mode for GRangesLists).

keepSeqlevels(gr, c(1:22, "X", "Y"), pruning.mode="coarse")

 

ADD REPLYlink written 12 months ago by Martin Morgan ♦♦ 22k

Or even shorter:

keepStandardChromosomes(gr, pruning.mode="coarse")
ADD REPLYlink written 12 months ago by Michael Lawrence10k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 157 users visited in the last hour