I am analysing RNA-seq data from an organism that is not represented in the KEGG database. I carried out differential expression analysis with DESeq2 and then used BlastKOALA to assign KEGG Orthology identifiers (K numbers) to the genes that matched, for use in pathway analysis. Only a small proportion of the DE genes (20-30%) were assigned a K number. I have taken these genes forward for pathway enrichment analysis.
I have linked the DESeq2 pipeline to the gage package in R and can run pathway enrichment analysis with gage() when the gene sets are built with kegg.gsets(species = "ko", id.type = "kegg", check.new = FALSE). I then visualise the results using pathview.
The problem is that this pipeline only works when the K numbers in the data matrix are unique, and several genes in my dataset share the same K number. Is it possible to run this analysis when K numbers are duplicated? If so, how do I link the IDs in my input data file to the kegg.gsets gene sets for analysis?
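To make the duplicate problem concrete, here is a minimal sketch in base R of one workaround I have considered: collapsing genes that share a K number into a single value before the enrichment step. The table `res`, its column names and all the values are hypothetical, and summarising by the mean (or by the largest absolute change) is only an assumption on my part, not something the gage documentation prescribes as far as I know.

```r
# Hypothetical DESeq2-style results: one row per gene, plus the K number
# assigned by BlastKOALA. All values are made up for illustration.
res <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5"),
  ko     = c("K00001", "K00001", "K00002", "K00003", "K00003"),
  log2FC = c(1.5, 0.5, -2.0, 3.0, -1.0),
  stringsAsFactors = FALSE
)

# Assumption 1: summarise genes sharing a K number by their mean log2 fold change.
ko_mean <- tapply(res$log2FC, res$ko, mean)

# Assumption 2 (alternative): keep the fold change with the largest absolute value.
max_abs <- function(x) x[which.max(abs(x))]
ko_maxabs <- tapply(res$log2FC, res$ko, max_abs)

# One-column matrix with unique K numbers as row names, i.e. the shape
# that does not trip over duplicated IDs.
ko_fc <- matrix(ko_mean, ncol = 1,
                dimnames = list(names(ko_mean), "log2FC"))
```

My understanding is that `ko_fc` could then be passed to gage() together with the `kg.sets` element returned by kegg.gsets(), but I am not sure whether averaging like this is statistically appropriate for enrichment analysis.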
Any advice would be much appreciated.