I need to get rid of "chr" in seqlevels.
I tried many things, including:
renameSeqlevels( gr, gsub("chr", "", seqlevels(gr)))
none of them worked.
Did you tried seqlevelsStyle?
In other words,
seqlevelsStyle(gr) <- "NCBI"
I tried a second ago, and got:
GRanges object with 58153 ranges and 1 metadata column:
seqnames ranges strand | gene_name
<Rle> <IRanges> <Rle> | <character>
DDX11L1 chr1 [11869, 14412] + | DDX11L1
WASH7P chr1 [14363, 29806] - | WASH7P
MIR1302-10 chr1 [29554, 31109] + | MIR1302-10
FAM138A chr1 [34554, 36081] - | FAM138A
OR4G4P chr1 [52473, 54936] + | OR4G4P
... ... ... ... . ...
CYCSP49 chrY [28695572, 28695890] + | CYCSP49
SLC25A15P1 chrY [28732789, 28737748] - | SLC25A15P1
PARP4P1 chrY [28740998, 28780799] - | PARP4P1
FAM58CP chrY [28772667, 28773306] - | FAM58CP
CTBP2P1 chrY [59001391, 59001635] + | CTBP2P1
seqinfo: 24 sequences from GRCh37 genome
> seqlevelsStyle(anno.database) <- "NCBI"
Error in `levels<-.factor`(`*tmp*`, value = c("1", "2", "3", "4", "5", :
number of levels differs
The same error I got when I tried to give values to seqlevels(anno.database) or seqnames(seqinfo(anno.database))
I found what the problem is.
The original GRangres object is:
anno.database = toGRanges(EnsDb.Hsapiens.v75, feature="gene")
if I run
seqlevelsStyle(anno.database) <- "NCBI"
But I deleted some rows in the object to keep only Chr1:22, X, Y. If I run seqlevelsStyle after deleting process, it gives that error message in my previous message.
Of course, I can do filtering after changing seqlevels, but curious why the second method should not work? I remember I had problems that stemmed from the same point here. I feel that when I filter rows of GRanges object, I need to make changes on some other information.
Here is how I filter rows for a GRanges object:
idx = which(seqnames(anno.database) %in% c(as.character(1:22),"X","Y"))
anno.database = anno.database[idx, ]
anno.database@seqinfo = seqinfo(anno.database)[c(as.character(1:22),"X","Y"), ]
GRanges are 'vectors' with single-bracket subsetting. Also, it's not obvious where toGRanges() comes from, it is not necessary to use which(), and one should never directly access slots via @. Thus
> gr = genes(EnsDb.Hsapiens.v75)
> gr1 = gr[seqnames(gr) %in% c(1:22, "X", "Y")]
> seqlevels(gr1) = as.character(unique(seqnames(gr1)))
or in one line (maybe some consideration needs to be given to pruning.mode for GRangesLists).
keepSeqlevels(gr, c(1:22, "X", "Y"), pruning.mode="coarse")
Or even shorter: