GSEA using clusterProfiler with non-model organism.
makrez • 0

I have RNA-Seq data from a prokaryotic non-model organism (Microbacterium) and I am doing a gene set enrichment analysis. I mapped my amino acid sequences to KO annotations first. I then managed to do the gene set enrichment analysis by using organism = 'ko'.

gse_kegg <- gseKEGG(
  geneList     = geneList,
  organism     = 'ko',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)
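
For context, the geneList passed above has to be a numeric vector named by KO identifiers and sorted in decreasing order. A minimal base-R sketch of how such a vector could be built is shown here. The table `de`, its column names, and all values are hypothetical placeholders, and averaging genes that share a KO is only one possible choice, not necessarily what was done for the real data.

```r
# Hypothetical differential-expression table: one row per gene with its KO
# annotation and log2 fold change (all values invented for illustration).
de <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5"),
  ko     = c("K00001", "K00002", "K00001", NA, "K00003"),
  log2fc = c(2.1, -1.3, 0.7, 0.4, -0.2),
  stringsAsFactors = FALSE
)

# Drop genes that received no KO assignment.
de <- de[!is.na(de$ko), ]

# Several genes can share one KO; here the mean log2FC per KO is kept
# (an assumption; other summaries are possible).
per_ko <- tapply(de$log2fc, de$ko, mean)

# Named numeric vector, sorted in decreasing order, as gseKEGG() expects.
geneList <- sort(setNames(as.numeric(per_ko), names(per_ko)), decreasing = TRUE)
```

With organism = 'ko' the names of this vector are matched against KO identifiers, which is why no organism-specific gene IDs are needed for this first call.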


The output is not very specific (e.g. the upregulated term is "biosynthesis of secondary metabolites") and thus not very useful.

My second thought is that I could use a closely related genome that is KEGG-annotated and thus listed at https://www.genome.jp/kegg/catalog/org_list.html. However, I don't have a gene mapping between that reference and my sequences.

For example, when I try:

gse_kegg <- gseKEGG(
  geneList     = geneList,
  organism     = 'mfol',
  keyType      = 'kegg',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)


I obviously get an error: Expected input gene ID: ,DXT68_06835,DXT68_14490, because my gene names are not mapped to the KEGG names of the reference.
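
To make the mismatch concrete: with organism = 'mfol' the names of geneList would have to be reference locus tags such as DXT68_06835. The sketch below shows what renaming could look like if a lookup table existed. The table `id_map`, the gene names, the values, and the pairings between genes and locus tags are all hypothetical; producing such a table (for example from a sequence similarity search against the reference proteome) is exactly the missing piece.

```r
# Hypothetical expression vector named by my own gene IDs (values invented).
geneList <- c(geneA = 1.8, geneB = 0.5, geneC = -0.9, geneD = -2.0)

# Hypothetical lookup table: own gene ID -> reference locus tag.
# The pairings are invented; geneC has no counterpart in the reference.
id_map <- data.frame(
  own = c("geneA", "geneB", "geneD"),
  ref = c("DXT68_06835", "DXT68_14490", "DXT68_15070"),
  stringsAsFactors = FALSE
)

# Translate the names; genes without a match become NA and are dropped.
new_names <- id_map$ref[match(names(geneList), id_map$own)]
mapped <- geneList[!is.na(new_names)]
names(mapped) <- new_names[!is.na(new_names)]

# If two genes map to the same locus tag, keep the first occurrence,
# then restore the decreasing order that gseKEGG() expects.
mapped <- mapped[!duplicated(names(mapped))]
mapped <- sort(mapped, decreasing = TRUE)
```

A vector like `mapped` could then be passed as geneList together with organism = 'mfol'. Genes without a counterpart in the reference are lost in this step.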

Does anybody have an idea how I can use the organism mfol with my input genes? How can I map my genes to the reference gene IDs, e.g. DXT68_15070?

Any help would be greatly appreciated.

clusterProfiler KEGG GeneSetEnrichment