I have a very long list of gene names, and I'd like to map each name to its corresponding gene ID. I've tried using the R package
org.Hs.eg.db, but it returns more IDs than names, which makes it hard to line the results up with the input, especially when the list is long.
Example of an input file (7 gene names):
RPS6KB2 PSME4 PDE4DIP APMAP TNRC18 PPP1R26 NAA20
Ideal output would be (7 IDs):
6199 23198 9659 57136 84629 9858 51126
Current output (8 IDs !!):
6199 23198 9659 57136 27320 *undesired output ID* 84629 9858 51126
Any suggestions on how to solve this issue? How can I get rid of these multiple mappings?
This is the code I'm using:
library("org.Hs.eg.db")  # load the library
input <- read.csv("myfile.csv", TRUE, ",")  # read input file
GeneCol = as.character(input$Gene.name)  # access the column that has gene names in my file
output = unlist(mget(x = GeneCol, envir = org.Hs.egALIAS2EG, ifnotfound = NA))  # get IDs
write.csv(output, file = "GeneIDs.csv")  # write the list of IDs to a CSV file
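One possible direction, offered as a sketch rather than a definitive fix: the extra ID probably appears because `org.Hs.egALIAS2EG` maps aliases, and a single alias can belong to more than one gene, so `unlist()` flattens two IDs for one name. Assuming the names in the file are official gene symbols (an assumption, not something the post confirms), `AnnotationDbi::mapIds()` with `keytype = "SYMBOL"` and `multiVals = "first"` returns exactly one value per input name. The gene vector below is copied from the example above, and the output file name is reused from the original code.

```r
library(org.Hs.eg.db)  # also attaches AnnotationDbi, which provides mapIds()

# Gene names from the example input; in practice these would come from
# as.character(input$Gene.name) as in the original code.
genes <- c("RPS6KB2", "PSME4", "PDE4DIP", "APMAP", "TNRC18", "PPP1R26", "NAA20")

# Look the names up as official symbols (assumption) and keep only the first
# Entrez ID when a key has several matches. Names with no match come back as NA.
ids <- AnnotationDbi::mapIds(
  org.Hs.eg.db,
  keys      = genes,
  column    = "ENTREZID",
  keytype   = "SYMBOL",
  multiVals = "first"
)

# One row per input name, so names and IDs stay aligned.
result <- data.frame(Gene.name = genes, EntrezID = unname(ids),
                     stringsAsFactors = FALSE)
write.csv(result, file = "GeneIDs.csv", row.names = FALSE)
```

If some names in the list are aliases rather than official symbols, `keytype = "ALIAS"` with the same `multiVals = "first"` still gives one ID per name, but which of the competing IDs is kept is then arbitrary, so those cases are worth checking by hand (for example with `multiVals = "list"`).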