Get Gene IDs for a list of Gene Names in R
@bayram-sarilmaz-16272
Turkey

I have a large list of gene names, and I'd like to map each name to its corresponding gene ID. I tried the R package org.Hs.eg.db, but it returns more IDs than names, which makes it hard to line the results up with the input, especially when the list is long.

Example of an input file (7 gene names):

RPS6KB2
PSME4
PDE4DIP
APMAP
TNRC18
PPP1R26
NAA20

Ideal output would be (7 IDs):

6199
23198
9659
57136
84629
9858
51126

Current output (8 IDs !!):

6199
23198
9659
57136
27320 *undesired output ID*
84629
9858
51126

Any suggestions on how to solve this? How can I get rid of such multiple mappings?

This is the code I'm using:

library("org.Hs.eg.db")  # load the annotation package

input <- read.csv("myfile.csv", header = TRUE, sep = ",")  # read the input file

GeneCol <- as.character(input$Gene.name)  # column holding the gene names in my file

output <- unlist(mget(x = GeneCol, envir = org.Hs.egALIAS2EG, ifnotfound = NA))  # look up IDs by alias

write.csv(output, file = "GeneIDs.csv")  # write the list of IDs to a CSV file
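
For anyone reproducing the problem: the extra ID most likely appears because an alias lookup can return several IDs for one name, and unlist() then flattens everything into a single vector. The sketch below imitates that in base R with a toy environment instead of the real org.Hs.egALIAS2EG map. The table toy_alias2eg, the name NOT_A_GENE, and the assumption that 27320 is a second hit for APMAP are illustrative only.

```r
# Toy stand-in for an alias-to-ID map (hypothetical; not the real annotation data).
toy_alias2eg <- new.env()
assign("RPS6KB2", "6199", envir = toy_alias2eg)
assign("APMAP", c("57136", "27320"), envir = toy_alias2eg)  # assumed double hit
assign("NAA20", "51126", envir = toy_alias2eg)

genes <- c("RPS6KB2", "APMAP", "NAA20", "NOT_A_GENE")

# mget() returns one list element per name; an element may hold several IDs.
hits <- mget(genes, envir = toy_alias2eg, ifnotfound = NA)

# Flattening loses the one-to-one pairing: more values than names.
flat <- unlist(hits)

# Names with more than one ID, worth inspecting by hand.
ambiguous <- names(hits)[lengths(hits) > 1]

# Keep the first ID per name so the result lines up with the input.
first_id <- vapply(hits, function(ids) as.character(ids[1]), character(1))
```

Checking lengths(hits) before calling unlist() is a cheap way to see which names are responsible for the mismatch.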
@martin-morgan-1513
United States

Please see my (updated, in response to your code) answer on StackOverflow.
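
For readers who land here without the linked answer, one commonly used way to get exactly one ID per name is AnnotationDbi::mapIds(), which returns a vector with one entry per key. The sketch below is not necessarily what the linked answer recommends. It assumes the input names are official gene symbols (keytype "SYMBOL") rather than aliases, and the helper name symbols_to_entrez is made up for illustration.

```r
# Hypothetical helper: one Entrez Gene ID (or NA) per input symbol.
symbols_to_entrez <- function(symbols) {
  AnnotationDbi::mapIds(
    org.Hs.eg.db::org.Hs.eg.db,
    keys = symbols,
    column = "ENTREZID",
    keytype = "SYMBOL",   # assumption: inputs are official symbols, not aliases
    multiVals = "first"   # if a key still has several hits, keep the first
  )
}

# Run the example only if the annotation package is installed.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  genes <- c("RPS6KB2", "PSME4", "PDE4DIP", "APMAP", "TNRC18", "PPP1R26", "NAA20")
  ids <- symbols_to_entrez(genes)
  # Pair each input name with its ID so rows stay aligned with the input.
  out <- data.frame(Gene.name = genes, GeneID = unname(ids))
  print(out)
}
```

With keytype = "ALIAS" the same call still returns one value per key, but the first hit may not be the gene you meant, so ambiguous or unmatched (NA) names are worth checking by hand.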
