Using KEGGREST to convert NCBI gene IDs to KEGG orthology and pathway
@francescadefilippis-7043
Last seen 9.3 years ago
European Union

Hi! I have a list of gene IDs from NCBI (starting with >gi....). They come from a metagenome analysis, so they belong to many different bacterial genomes. I'd like to get the corresponding KEGG Orthology IDs (KO....) and KEGG pathways (ko....). Can the KEGGREST package do this? I found only examples that specify the organism name, but I have many different bacterial species... Is it possible to search across several species at once?

Thanks!

francesca

keggrest kegg keggorthology • 7.3k views
Dan Tenenbaum ★ 8.2k
@dan-tenenbaum-4256
Last seen 3.2 years ago
United States

Yes, here is a way to convert a GI number to a KEGG ID. The single GI number here could be replaced with a character vector. Once you have the KEGG IDs you can look up orthology and pathways:

> keggConv("genes", "ncbi-gi:15804211")
ncbi-gi:15804211 
     "ece:Z5100" 

# now find pathway:

> keggLink('pathway',  "ece:Z5100" )

      ece:Z5100 
"path:ece05130" 

 

Dan
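
The answer above shows the pathway lookup only; the orthology lookup the question asks about can be sketched the same way. This is a minimal sketch, not a tested recipe: it assumes the KEGG service still accepts the `ncbi-gi:` prefix as in the console output above (this may no longer hold), and that `keggLink()` accepts `"ko"` as a target database. The helper names `named_to_df` and `gi_to_ko_and_pathway` are hypothetical and not part of KEGGREST.

```r
# Convert a named character vector (names = source IDs, values = target IDs),
# the shape keggConv() and keggLink() return, into a long-format data frame.
named_to_df <- function(x, from, to) {
  nms <- names(x)
  if (is.null(nms)) nms <- rep(NA_character_, length(x))
  df <- data.frame(nms, as.character(unname(x)), stringsAsFactors = FALSE)
  names(df) <- c(from, to)
  df
}

# Hypothetical wrapper: GI numbers -> KEGG gene IDs -> KO IDs and pathways.
# Needs the KEGGREST package and network access; it is defined here, not run.
gi_to_ko_and_pathway <- function(gi_ids) {
  kegg_ids <- KEGGREST::keggConv("genes", gi_ids)
  genes <- unname(kegg_ids)
  list(
    genes   = named_to_df(kegg_ids, "gi", "kegg_gene"),
    ko      = named_to_df(KEGGREST::keggLink("ko", genes), "kegg_gene", "ko"),
    pathway = named_to_df(KEGGREST::keggLink("pathway", genes),
                          "kegg_gene", "pathway")
  )
}

# Example call (not run here):
# res <- gi_to_ko_and_pathway(c("ncbi-gi:222080100", "ncbi-gi:15804211"))
```

The three tables are kept separate because a gene can map to several KO entries or pathways, so each mapping is one row per pair rather than one row per gene.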

 

@james-w-macdonald-5106
Last seen 2 hours ago
United States

The IDs you have are not Gene IDs (which is what NCBI is calling Entrez Gene IDs these days), but are instead GeneInfo Identifiers (GI numbers), which are a different beast altogether. However, it appears that KEGGREST will accept GI numbers; you just have to use keggConv(). Fortunately for us, there is an example of how to do this in the help page for keggConv():

 

Examples:

     head(keggConv("eco", "ncbi-geneid")) ## conversion from NCBI GeneID to
                                    ## KEGG ID for E. coli genes
     head(keggConv("ncbi-geneid", "eco")) ## opposite direction
     head(keggConv("ncbi-gi", c("hsa:10458", "ece:Z5100"))) ## conversion from KEGG ID
                                                      ## to NCBI GI
     ## conversion from NCBI GeneID to KEGG ID when the organism code is not known:
     head(keggConv("genes", "ncbi-geneid:3113320"))

 

And evidently you can do more than one species:

> keggConv("genes", c("ncbi-gi:222080100", "ncbi-gi:15804211"))
ncbi-gi:222080100  ncbi-gi:15804211
      "hsa:10458"       "ece:Z5100"

So it looks like a two-step process: convert your GI numbers to KEGG IDs, and then do the lookup using keggGet(). You will then have to use e.g. lapply() to get out the exact data you want, but that is just a base R problem.
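
The keggGet() plus lapply() step could look roughly like the sketch below. It relies on two assumptions that should be checked against a real entry: that keggGet() returns at most 10 entries per call (hence the batching), and that the parsed `ORTHOLOGY` and `PATHWAY` fields of a gene entry are named character vectors whose names are the KO and pathway IDs. The function names `chunk_ids`, `extract_ko_pathway` and `fetch_ko_pathway` are hypothetical, not part of KEGGREST.

```r
# Split IDs into batches of at most `size` elements.
chunk_ids <- function(ids, size = 10) {
  split(ids, ceiling(seq_along(ids) / size))
}

# Pull the KO and pathway IDs out of one parsed KEGG gene entry.
# Assumption: the IDs are the names of the ORTHOLOGY and PATHWAY fields;
# a missing field yields character(0).
extract_ko_pathway <- function(entry) {
  ids_of <- function(field) {
    n <- names(field)
    if (is.null(n)) character(0) else n
  }
  list(
    entry   = unname(entry$ENTRY)[1],
    ko      = ids_of(entry$ORTHOLOGY),
    pathway = ids_of(entry$PATHWAY)
  )
}

# Fetch entries in batches and extract the fields of interest.
# Needs the KEGGREST package and network access; it is defined here, not run.
fetch_ko_pathway <- function(kegg_ids) {
  batches <- lapply(chunk_ids(kegg_ids), KEGGREST::keggGet)
  entries <- do.call(c, unname(batches))
  lapply(entries, extract_ko_pathway)
}

# Example call (not run here):
# info <- fetch_ko_pathway(c("hsa:10458", "ece:Z5100"))
```

Inspecting one entry with str(keggGet("ece:Z5100")[[1]]) before running this on a full list is a quick way to confirm the field layout assumed here.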
