I have RNA-Seq data from a prokaryotic non-model organism (Microbacterium) and I am doing a gene set enrichment analysis. I first mapped my amino acid sequences to KO annotations and then ran the gene set enrichment analysis with organism = 'ko':
gse_kegg <- gseKEGG(geneList = geneList, organism = 'ko', minGSSize = 120, pvalueCutoff = 0.05, verbose = FALSE)
The output is rather unspecific (e.g. "biosynthesis of secondary metabolites" comes out as upregulated) and thus not very useful.
My second thought is that I could make use of a closely related genome that is KEGG-annotated and thus listed at https://www.genome.jp/kegg/catalog/org_list.html. However, I don't have a mapping between the reference genes and my sequences.
For example, when I try:
gse_kegg <- gseKEGG(geneList = geneList, organism = 'mfol', keyType = 'kegg', minGSSize = 120, pvalueCutoff = 0.05, verbose = FALSE)
As expected, I get an error:
Expected input gene ID: ,DXT68_06835,DXT68_14490
This is because my gene names are not mapped to the KEGG gene IDs of the reference.
Does anybody have an idea how I can use the organism mfol with my input genes? How can I map my genes to reference gene IDs such as DXT68_15070?
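One route I have considered, but not verified, is to go through the KO numbers: if KEGG links each mfol gene to a KO, my KO-keyed list could in principle be re-keyed onto mfol locus tags. Below is a minimal sketch of that idea. The link vector only imitates the shape I understand KEGGREST::keggLink("ko", "mfol") to return; the K numbers, the scores and the helper name ko_to_ref are made up for illustration.

```r
# Sketch only: re-key a KO-named geneList onto reference locus tags.
# `link` imitates what I assume KEGGREST::keggLink("ko", "mfol") returns:
# names are "mfol:<locus_tag>", values are "ko:<K number>".
# The K numbers and scores are placeholders, not real annotations.
link <- c("mfol:DXT68_06835" = "ko:K00001",
          "mfol:DXT68_14490" = "ko:K00002",
          "mfol:DXT68_15070" = "ko:K00002")

# KO-keyed ranked list, as used with organism = 'ko' (toy values)
geneList <- c(K00002 = 2.1, K00001 = -0.7, K09999 = 0.3)

# Hypothetical helper: swap KO names for reference gene IDs
ko_to_ref <- function(geneList, link) {
  ref_gene <- sub("^[^:]+:", "", names(link))   # strip the "mfol:" prefix
  ko       <- sub("^ko:", "", unname(link))     # strip the "ko:" prefix
  keep     <- !duplicated(ko)                   # first reference gene per KO
  lookup   <- setNames(ref_gene[keep], ko[keep])
  mapped   <- geneList[names(geneList) %in% names(lookup)]  # drop unmapped KOs
  names(mapped) <- unname(lookup[names(mapped)])
  sort(mapped, decreasing = TRUE)               # GSEA expects a decreasing ranking
}

refList <- ko_to_ref(geneList, link)
```

With the real link table, refList could then be passed as gseKEGG(geneList = refList, organism = 'mfol', keyType = 'kegg', ...). Keeping only the first reference gene per KO is an arbitrary choice in this sketch, and I am unsure whether that is a sound way to handle paralogs.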
Any help would be greatly appreciated.