extracting the KEGG pathway for a set of genes
Bogdan ▴ 670
@bogdan-2367
Palo Alto, CA, USA

Dear all, could you please advise:

Given a set of gene names, what is the best way to extract the KEGG pathways associated with each gene?

Thank you,

-- bogdan

Pathways KEGG • 4.6k views
Kevin Blighe ★ 4.0k
@kevin
Republic of Ireland

Hi Bogdan,

I use KEGGprofile. The main function in KEGGprofile is find_enriched_pathway(), and it by default accepts a vector of Entrez gene IDs. However, the original annotation package used by KEGGprofile (KEGG.db) was deprecated in Bioconductor 3.12.

Another popular option is, of course, clusterProfiler: http://yulab-smu.top/clusterProfiler-book/chapter6.html

Kevin


Hi Kevin, For some strange reason KEGGprofile is not available for the latest version of R.

> install.packages("KEGGprofile")
Warning in install.packages :
  package ‘KEGGprofile’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

When I tried looking it up on CRAN:

> av <- available.packages(filters=list())
> av[av[, "Package"] == "KEGGprofile", ]
     Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository

I get only the empty result shown above. Please let me know how I can resolve this.


Please re-read Kevin's reply. The KEGG.db package has been deprecated because it used data from over 6 years ago (before the paywall for KEGG data was established). For whatever reason the maintainer for KEGGprofile has not updated their package to use alternative methods.

There is the KEGGREST package, but for simple queries it is pretty cumbersome. I tend to just get the mappings directly using functions from limma and then proceed from there. You don't say the species, so I will assume human:

> library(limma)
> z <- getGeneKEGGLinks("hsa")
> head(z)
  GeneID     PathwayID
1  10327 path:hsa00010
2    124 path:hsa00010
3    125 path:hsa00010
4    126 path:hsa00010
5    127 path:hsa00010
6    128 path:hsa00010

> zlst <- split(z[,2], z[,1])
> zlst[1:5]
$`10`
[1] "path:hsa00232" "path:hsa00983" "path:hsa01100" "path:hsa05204"

$`100`
[1] "path:hsa00230" "path:hsa01100" "path:hsa05340"

$`1000`
[1] "path:hsa04514" "path:hsa05412"

$`10000`
 [1] "path:hsa01521" "path:hsa01522" "path:hsa01524" "path:hsa04010"
 [5] "path:hsa04012" "path:hsa04014" "path:hsa04015" "path:hsa04022"
 [9] "path:hsa04024" "path:hsa04062" "path:hsa04066" "path:hsa04068"
[13] "path:hsa04071" "path:hsa04072" "path:hsa04140" "path:hsa04150"
[17] "path:hsa04151" "path:hsa04152" "path:hsa04210" "path:hsa04211"
[21] "path:hsa04213" "path:hsa04218" "path:hsa04261" "path:hsa04370"
[25] "path:hsa04371" "path:hsa04380" "path:hsa04510" "path:hsa04550"
[29] "path:hsa04611" "path:hsa04613" "path:hsa04620" "path:hsa04625"
[33] "path:hsa04630" "path:hsa04660" "path:hsa04662" "path:hsa04664"
[37] "path:hsa04666" "path:hsa04668" "path:hsa04722" "path:hsa04725"
[41] "path:hsa04728" "path:hsa04910" "path:hsa04914" "path:hsa04915"
[45] "path:hsa04917" "path:hsa04919" "path:hsa04920" "path:hsa04922"
[49] "path:hsa04923" "path:hsa04926" "path:hsa04929" "path:hsa04931"
[53] "path:hsa04932" "path:hsa04933" "path:hsa04935" "path:hsa04973"
[57] "path:hsa05010" "path:hsa05017" "path:hsa05131" "path:hsa05132"
[61] "path:hsa05135" "path:hsa05142" "path:hsa05145" "path:hsa05152"
[65] "path:hsa05160" "path:hsa05161" "path:hsa05162" "path:hsa05163"
[69] "path:hsa05164" "path:hsa05165" "path:hsa05166" "path:hsa05167"
[73] "path:hsa05168" "path:hsa05169" "path:hsa05170" "path:hsa05200"
[77] "path:hsa05205" "path:hsa05207" "path:hsa05208" "path:hsa05210"
[81] "path:hsa05211" "path:hsa05212" "path:hsa05213" "path:hsa05214"
[85] "path:hsa05215" "path:hsa05218" "path:hsa05220" "path:hsa05221"
[89] "path:hsa05222" "path:hsa05223" "path:hsa05224" "path:hsa05225"
[93] "path:hsa05226" "path:hsa05230" "path:hsa05231" "path:hsa05235"
[97] "path:hsa05415" "path:hsa05417" "path:hsa05418"

$`100008587`
[1] "path:hsa03008" "path:hsa03010"

# And if you need the pathway names

> zz <- getKEGGPathwayNames("hsa")
> head(zz)
      PathwayID                                                     Description
1 path:hsa00010             Glycolysis / Gluconeogenesis - Homo sapiens (human)
2 path:hsa00020                Citrate cycle (TCA cycle) - Homo sapiens (human)
3 path:hsa00030                Pentose phosphate pathway - Homo sapiens (human)
4 path:hsa00040 Pentose and glucuronate interconversions - Homo sapiens (human)
5 path:hsa00051          Fructose and mannose metabolism - Homo sapiens (human)
6 path:hsa00052                     Galactose metabolism - Homo sapiens (human)
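
To go from a set of genes to their pathways, the two tables above can be filtered and joined. The following is a minimal sketch rather than anything from limma itself: `pathways_for_genes` is a hypothetical helper, and the small `z` and `zz` tables are toy stand-ins built from the rows printed above so that the example runs offline. It assumes the genes are given as Entrez IDs (gene symbols would first have to be converted, for example with an annotation package) and that `PathwayID` is formatted the same way in both tables.

```r
# Toy stand-ins for the two limma results shown above (runs without network).
# With network access one would instead use:
#   z  <- limma::getGeneKEGGLinks("hsa")
#   zz <- limma::getKEGGPathwayNames("hsa")
z <- data.frame(
  GeneID    = c("10327", "124", "125", "10", "10"),
  PathwayID = c("path:hsa00010", "path:hsa00010", "path:hsa00010",
                "path:hsa00232", "path:hsa00983"),
  stringsAsFactors = FALSE
)
zz <- data.frame(
  PathwayID   = "path:hsa00010",
  Description = "Glycolysis / Gluconeogenesis - Homo sapiens (human)",
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of limma): pathways for a set of Entrez IDs.
pathways_for_genes <- function(genes, links, path_names) {
  # Keep only the gene-to-pathway links for the requested genes
  hits <- links[links$GeneID %in% as.character(genes), , drop = FALSE]
  # Left join: pathways missing from path_names are kept with Description = NA
  out <- merge(hits, path_names, by = "PathwayID", all.x = TRUE)
  out <- out[order(out$GeneID, out$PathwayID),
             c("GeneID", "PathwayID", "Description")]
  rownames(out) <- NULL
  out
}

res <- pathways_for_genes(c("124", "10"), z, zz)
print(res)
```

The result has one row per gene and pathway pair, so a gene that belongs to several pathways appears several times. `split(res$PathwayID, res$GeneID)` turns it back into a per-gene list like `zlst` above.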

Dear gentlemen, thank you for your replies. I have followed the part on assigning pathway names to pathway IDs.

However, given a gene, how do I find the pathways it is associated with?

Is there a way to use GMT files, for example, or any other resources?
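
On the GMT question, a minimal sketch in base R is given below. It assumes a standard tab-delimited GMT file in which each line holds a set name, a description, and then the gene identifiers. `read_gmt` and `genes_to_sets` are hypothetical helper names, and the file contents are made up for illustration only.

```r
# Hypothetical helper: read a GMT file into a named list (set name -> genes).
read_gmt <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                    # skip empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  sets <- lapply(fields, function(x) x[-(1:2)])    # drop name and description
  names(sets) <- vapply(fields, `[`, character(1), 1)
  sets
}

# Hypothetical helper: invert set -> genes into gene -> sets.
genes_to_sets <- function(sets) {
  long <- stack(sets)                              # columns: values (gene), ind (set)
  split(as.character(long$ind), long$values)
}

# Tiny made-up GMT file, written to a temporary location for demonstration
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c(
  "PATHWAY_A\tdescription A\tGENE1\tGENE2",
  "PATHWAY_B\tdescription B\tGENE2\tGENE3"
), gmt_file)

by_gene <- genes_to_sets(read_gmt(gmt_file))
print(by_gene[c("GENE2", "GENE3")])
```

The identifiers in the GMT file decide what the lookup keys are, so a file keyed by gene symbols can be queried by symbol directly, while one keyed by Entrez IDs cannot.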


You can also use the kegg_pathway_annotations function from the OmnipathR package: https://saezlab.github.io/OmnipathR/reference/kegg_pathway_annotations.html

See more KEGG related functions here: https://saezlab.github.io/OmnipathR/reference/, all prefixed with kegg_


Thanks a lot, gentlemen! With much appreciation :)
