convert data frame of Entrez IDs to Gene Symbols
Guido Hooiveld ★ 4.0k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …

I am struggling with what is likely a simple problem, but I can't get it to work. I would appreciate it if one of the more knowledgeable people could provide some code or directions to get it working.

Thanks, Guido

 

I have a data frame of human Entrez IDs that I would like to convert into Gene Symbols. NAs should preferably be removed. What is the best way to do this? The select() function only accepts a character vector (as expected).

 

> library(org.Hs.eg.db)

> myGenes[1:3,1:15]
     V1   V2    V3   V4   V5    V6   V7   V8   V9  V10  V11  V12  V13 V14  V15
1  1572 1571  4129  216  316   217   15 3620 7453  847 3028 1573 1576  38 3033
2 51478 3295  3294 3293 3284  3292 3283 2165 1586 1369   NA   NA   NA  NA   NA
3    10 6799 54600 6817 1544 54657   NA   NA   NA   NA   NA   NA   NA  NA   NA
> class(myGenes)
[1] "data.frame"
> anno.result <- select(org.Hs.eg.db, keys=myGenes, columns=c("SYMBOL"),keytype="ENTREZID")
Error in .testForValidKeys(x, keys, keytype, fks) :
  'keys' must be a character vector
>

 

annotation • 6.4k views

@Aaron and Leandro,

Thanks for your replies. Although both suggestions work, they don't fit my use case (which I evidently did not describe clearly enough): I would like to get back the same data frame/matrix, but with the IDs replaced by the corresponding Symbols. Both of your suggestions "only" return a conversion table (i.e. a data frame with just two columns: ID and Symbol).

@james-w-macdonald-5106
United States

It's just a matter of coercion, isn't it?

> myGenes <- data.frame(matrix(c(1572, 1571, 4129, NA, 216, 217, 15, 3620, 7453,NA), 2))
> myGenes
    X1   X2  X3   X4   X5
1 1572 4129 216   15 7453
2 1571   NA 217 3620   NA

> myGenes2 <- data.frame(matrix(mapIds(org.Hs.eg.db, as.character(unlist(myGenes)), "SYMBOL","ENTREZID"), 2))
'select()' returned 1:1 mapping between keys and columns
> myGenes2
      X1   X2      X3    X4   X5
1 CYP2F1 MAOB ALDH1A1 AANAT WARS
2 CYP2E1 NULL   ALDH2  IDO1 NULL

Or do I misunderstand what you want to end up with?


No, thanks; working nicely. For the archive: when reshaping the mapped results back into a matrix, be sure to use the correct number of rows (the same as in your input). :)
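One way to avoid hard-coding the row count is to take it from the input itself. The following is only a minimal sketch in base R: `id2symbol` is a hypothetical toy lookup vector standing in for the annotation database (its IDs and symbols are copied from the output shown above), so the reshaping step can be run without org.Hs.eg.db installed. With the real package, the lookup line would presumably be replaced by the `mapIds()` call from the answer.

```r
# Toy lookup (hypothetical stand-in for org.Hs.eg.db); values copied from the thread's output
id2symbol <- c("1572" = "CYP2F1", "1571" = "CYP2E1", "4129" = "MAOB",
               "216" = "ALDH1A1", "217" = "ALDH2", "15" = "AANAT",
               "3620" = "IDO1", "7453" = "WARS")

myGenes <- data.frame(matrix(c(1572, 1571, 4129, NA, 216, 217, 15, 3620, 7453, NA), 2))

# Flatten column by column, look up each key; NA keys come back as NA symbols
keys <- as.character(unlist(myGenes))
symbols <- unname(id2symbol[keys])

# Rebuild with the input's own row count and names instead of a literal 2
myGenes2 <- data.frame(matrix(symbols, nrow = nrow(myGenes),
                              dimnames = list(rownames(myGenes), colnames(myGenes))),
                       stringsAsFactors = FALSE)
```

Because the row count and the row/column names all come from `myGenes`, the same lines should work unchanged for an input of any shape.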

Aaron Lun ★ 28k
@alun
The city by the bay

Not the prettiest sight, but a coercion to a matrix followed by a coercion to a character vector should work:

anno.result <- select(org.Hs.eg.db, keys=as.character(as.matrix(myGenes)), 
    columns="SYMBOL", keytype="ENTREZID")
anno.result <- anno.result[!is.na(anno.result$ENTREZID),]

You could also use unlist instead of as.matrix, if you want to make the code a bit more exciting.
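As a quick check that the two coercions are interchangeable here, a minimal sketch in base R. It assumes an all-numeric data frame like the one in the question; the three columns below are rebuilt from the V1, V2 and V7 values printed there, and no annotation package is needed.

```r
# Small numeric data frame with an NA, using IDs from the question's printout
myGenes <- data.frame(V1 = c(1572, 51478, 10),
                      V2 = c(1571, 3295, 6799),
                      V7 = c(15, 3283, NA))

# Both routes flatten the data frame column by column
via.matrix <- as.character(as.matrix(myGenes))
via.unlist <- as.character(unlist(myGenes))

identical(via.matrix, via.unlist)
```

Both vectors list the values column by column, which is the order needed later when the results are reshaped back to the original dimensions.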


In response to your comment: if you want to preserve the original dimensions of myGenes, you can do something like this:

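# Flatten the data frame column by column into a character vector of keys
coerced <- as.character(as.matrix(myGenes))
anno.result <- select(org.Hs.eg.db, keys=coerced, columns="SYMBOL", keytype="ENTREZID")
# Pick one symbol per original element, then restore the original shape
new.symbols <- anno.result$SYMBOL[match(coerced, anno.result$ENTREZID)]
dim(new.symbols) <- dim(myGenes)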

The match() call maps each element of coerced back to its row in anno.result and returns only the first hit, so you get exactly one symbol per gene ID even if there is a one-to-many mapping for any ID.
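To see the first-hit behaviour in isolation, here is a minimal sketch in base R. The `anno.result` table below is a hand-made stand-in for what `select()` might return, and the symbols in it (`symA`, `symA_alt`, `symB`) are invented placeholders, not real annotations.

```r
# Keys as they would come out of the flattened data frame (note the repeat and the NA)
coerced <- c("10", "6799", NA, "10")

# Hypothetical conversion table with a one-to-many mapping for ID "10"
anno.result <- data.frame(ENTREZID = c("10", "10", "6799"),
                          SYMBOL   = c("symA", "symA_alt", "symB"),
                          stringsAsFactors = FALSE)

# match() returns the position of the FIRST matching row for each key
idx <- match(coerced, anno.result$ENTREZID)   # 1 3 NA 1
new.symbols <- anno.result$SYMBOL[idx]

# Reshape to a 2 x 2 matrix, filled column by column like the original
dim(new.symbols) <- c(2, 2)
```

The second symbol for ID "10" is never picked, and the NA key simply yields an NA symbol, so the result always has the same length as `coerced`.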

@leandromartins-12178
Brazil, Vitória da Conquista, Universid…

I like using the mygene package and its getGenes() function.

Here is my suggestion:

library(mygene)

# vector with gene IDs
genes <- c(1572, 1571, 4129, 216, 316, 217, 15, 3620, 7453, 847, 3028, 1573, 1576, 38, 3033)

# retrieve gene information
result <- getGenes(geneid = genes, fields = c("symbol"))

class(result)

# symbols from genes
result$symbol
