How can I get a list of KEGG pathways and their genes?
malik.yousef ▴ 10
@malikyousef-16077

Hi

How can I get the list of genes for each KEGG pathway? I need a simple text table in which each row holds a KEGG pathway and the next column holds the list of genes for that pathway.

Best

Malik

keggrest • 11k views
@gordon-smyth
WEHI, Melbourne, Australia
> library(limma)
> library(org.Hs.eg.db)
> tab <- getGeneKEGGLinks(species="hsa")
> tab$Symbol <- mapIds(org.Hs.eg.db, tab$GeneID,
                       column="SYMBOL", keytype="ENTREZID")
> head(tab)
  GeneID     PathwayID Symbol
1  10327 path:hsa00010 AKR1A1
2    124 path:hsa00010  ADH1A
3    125 path:hsa00010  ADH1B
4    126 path:hsa00010  ADH1C
5    127 path:hsa00010   ADH4
6    128 path:hsa00010   ADH5

To get names of the pathways:

> head(getKEGGPathwayNames(species="hsa"))
      PathwayID                                                     Description
1 path:hsa00010             Glycolysis / Gluconeogenesis - Homo sapiens (human)
2 path:hsa00020                Citrate cycle (TCA cycle) - Homo sapiens (human)
3 path:hsa00030                Pentose phosphate pathway - Homo sapiens (human)
4 path:hsa00040 Pentose and glucuronate interconversions - Homo sapiens (human)
5 path:hsa00051          Fructose and mannose metabolism - Homo sapiens (human)
6 path:hsa00052                     Galactose metabolism - Homo sapiens (human)
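The table above has one row per gene, whereas the question asks for one row per pathway. A minimal sketch of that reshaping in base R follows. It uses a small hand-made data.frame standing in for `tab` (the rows are illustrative, not real query output), and the output file name is arbitrary. Note that `aggregate()` with a formula drops rows whose Symbol is NA.

```r
## Toy stand-in for the per-gene table 'tab' built above (illustrative rows only)
tab <- data.frame(
    GeneID    = c("10327", "124", "125", "1431"),
    PathwayID = c("path:hsa00010", "path:hsa00010", "path:hsa00010",
                  "path:hsa00020"),
    Symbol    = c("AKR1A1", "ADH1A", "ADH1B", "CS"),
    stringsAsFactors = FALSE
)

## Collapse to one row per pathway; genes become a comma-separated string
genes_by_pathway <- aggregate(Symbol ~ PathwayID, data = tab,
                              FUN = paste, collapse = ",")
names(genes_by_pathway)[2] <- "Genes"

## Write a tab-delimited text file (file name is an arbitrary choice)
out_file <- file.path(tempdir(), "kegg_pathway_genes.txt")
write.table(genes_by_pathway, out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

With the real `tab`, the same two calls should give one line per pathway; use `GeneID` instead of `Symbol` in the formula to list Entrez ids.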

 


Thanks for your reply.

How can I match the gene symbol to each GeneID, and also add the KEGG pathway name?


If you have two data.frames that share a column (here, PathwayID), it's trivial to match them up. See e.g. ?match or ?merge.
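As a rough illustration of that matching step, here is a sketch on two small hand-made data.frames that mimic the column names returned by getGeneKEGGLinks() and getKEGGPathwayNames() (the rows are made up for the example, not real query output).

```r
## Toy per-gene table and toy pathway-name table (illustrative rows only)
tab <- data.frame(
    GeneID    = c("10327", "124", "1431"),
    PathwayID = c("path:hsa00010", "path:hsa00010", "path:hsa00020"),
    Symbol    = c("AKR1A1", "ADH1A", "CS"),
    stringsAsFactors = FALSE
)
pathway_names <- data.frame(
    PathwayID   = c("path:hsa00010", "path:hsa00020"),
    Description = c("Glycolysis / Gluconeogenesis",
                    "Citrate cycle (TCA cycle)"),
    stringsAsFactors = FALSE
)

## match() returns, for each row of 'tab', the index of the row of
## 'pathway_names' with the same PathwayID (NA if there is none)
m <- match(tab$PathwayID, pathway_names$PathwayID)
tab$Description <- pathway_names$Description[m]

## merge() performs the same lookup as a join; all.x = TRUE keeps genes
## whose pathway has no entry in 'pathway_names'
merged <- merge(tab[, c("GeneID", "PathwayID", "Symbol")], pathway_names,
                by = "PathwayID", all.x = TRUE)
```

match() keeps the row order of `tab`, while merge() sorts by the key column by default, so the first is handy for adding a column in place.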

@martin-morgan-1513
United States

Here I use KEGGREST for KEGG information, and org.Hs.eg.db for symbol mapping. The tidyverse is convenient for working with data.frames.

library(KEGGREST)
library(org.Hs.eg.db)
library(tidyverse)     ## dplyr::select() vs. AnnotationDbi::select() !

These are the KEGG pathways and their Entrez gene ids

hsa_path_eg  <- keggLink("pathway", "hsa") %>% 
    tibble(pathway = ., eg = sub("hsa:", "", names(.)))

annotated with the SYMBOL and ENSEMBL identifiers associated with each Entrez id

hsa_kegg_anno <- hsa_path_eg %>%
    mutate(
        symbol = mapIds(org.Hs.eg.db, eg, "SYMBOL", "ENTREZID"),
        ensembl = mapIds(org.Hs.eg.db, eg, "ENSEMBL", "ENTREZID")
    )

This gives me

> hsa_kegg_anno
# A tibble: 29,424 x 4
   pathway       eg     symbol  ensembl        
   <chr>         <chr>  <chr>   <chr>          
 1 path:hsa00010 10327  AKR1A1  ENSG00000117448
 2 path:hsa00010 124    ADH1A   ENSG00000187758
 3 path:hsa00010 125    ADH1B   ENSG00000196616
 4 path:hsa00010 126    ADH1C   ENSG00000248144
 5 path:hsa00010 127    ADH4    ENSG00000198099
 6 path:hsa00010 128    ADH5    ENSG00000197894
 7 path:hsa00010 130    ADH6    ENSG00000172955
 8 path:hsa00010 130589 GALM    ENSG00000143891
 9 path:hsa00010 131    ADH7    ENSG00000196344
10 path:hsa00010 160287 LDHAL6A ENSG00000166800
# ... with 29,414 more rows

I can go back to KEGG for the pathway descriptions

hsa_pathways <- keggList("pathway", "hsa") %>% 
    tibble(pathway = names(.), description = .)

so

> hsa_pathways
# A tibble: 328 x 2
   pathway       description                                                   
   <chr>         <chr>                                                         
 1 path:hsa00010 Glycolysis / Gluconeogenesis - Homo sapiens (human)           
 2 path:hsa00020 Citrate cycle (TCA cycle) - Homo sapiens (human)              
 3 path:hsa00030 Pentose phosphate pathway - Homo sapiens (human)              
 4 path:hsa00040 Pentose and glucuronate interconversions - Homo sapiens (huma…
 5 path:hsa00051 Fructose and mannose metabolism - Homo sapiens (human)        
 6 path:hsa00052 Galactose metabolism - Homo sapiens (human)                   
 7 path:hsa00053 Ascorbate and aldarate metabolism - Homo sapiens (human)      
 8 path:hsa00061 Fatty acid biosynthesis - Homo sapiens (human)                
 9 path:hsa00062 Fatty acid elongation - Homo sapiens (human)                  
10 path:hsa00071 Fatty acid degradation - Homo sapiens (human)                 
# ... with 318 more rows

I could join these with the gene identifiers, if desired...

left_join(hsa_kegg_anno, hsa_pathways, by = "pathway")
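
A join like this only works if both tables spell the pathway identifier the same way. The output above shows a "path:" prefix in both, but that may not hold for every KEGG or KEGGREST release (an assumption worth checking with head() on each table); if the prefixes differ, every description comes back NA. A hedged sketch of normalizing the key first, in base R on toy data with a hypothetical helper `strip_prefix()`:

```r
## Toy tables in which the two sources disagree on the "path:" prefix
hsa_kegg_anno <- data.frame(
    pathway = c("path:hsa00010", "path:hsa00010", "path:hsa00020"),
    eg      = c("10327", "124", "1431"),
    stringsAsFactors = FALSE
)
hsa_pathways <- data.frame(
    pathway     = c("hsa00010", "hsa00020"),
    description = c("Glycolysis / Gluconeogenesis",
                    "Citrate cycle (TCA cycle)"),
    stringsAsFactors = FALSE
)

## Hypothetical helper: drop a leading "path:" if present, otherwise no-op
strip_prefix <- function(x) sub("^path:", "", x)

hsa_kegg_anno$pathway <- strip_prefix(hsa_kegg_anno$pathway)
hsa_pathways$pathway  <- strip_prefix(hsa_pathways$pathway)

## Left join on the normalized key
joined <- merge(hsa_kegg_anno, hsa_pathways, by = "pathway", all.x = TRUE)
```

The same `sub()` call can be used inside `mutate()` before `left_join()` when staying within the tidyverse.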