GSEA using clusterProfiler with a non-model organism
makrez ▴ 10

I have RNA-Seq data from a prokaryotic non-model organism (Microbacterium) and I am doing a gene set enrichment analysis. I mapped my amino acid sequences to KO annotations first. I then managed to do the gene set enrichment analysis by using organism = 'ko'.

gse_kegg <- gseKEGG(
  geneList = geneList,
  organism = 'ko',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)

The output is rather unspecific (e.g. "biosynthesis of secondary metabolites" is reported as upregulated) and thus not very useful.
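One thing I have not ruled out is the gene set size filter. As far as I understand, minGSSize = 120 drops every gene set that has fewer than 120 genes in common with geneList (the default is 10), which would leave only broad pathways. Below is a minimal base-R sketch of that filter. The pathway names, KO IDs and statistics are invented for illustration, and filter_by_size is a hypothetical helper, not a clusterProfiler function.

```r
# Toy KO-level gene sets (hypothetical pathway names and KO members).
ko_ids <- paste0("K", sprintf("%05d", 1:200))
gene_sets <- list(
  broad_pathway  = ko_ids[1:150],
  narrow_pathway = ko_ids[1:25]
)

# Hypothetical ranked statistic, named by KO ID and sorted in decreasing order.
set.seed(1)
gene_list <- sort(setNames(rnorm(200), ko_ids), decreasing = TRUE)

# Hypothetical helper: keep gene sets whose overlap with the ranked list
# lies within [min_size, max_size].
filter_by_size <- function(sets, ranked, min_size, max_size = 500) {
  sizes <- vapply(
    sets,
    function(s) length(intersect(s, names(ranked))),
    integer(1)
  )
  names(sizes)[sizes >= min_size & sizes <= max_size]
}

kept_120 <- filter_by_size(gene_sets, gene_list, min_size = 120)
kept_10  <- filter_by_size(gene_sets, gene_list, min_size = 10)

print(kept_120)  # only the broad set survives
print(kept_10)   # both sets survive
```

If this is indeed the cause, rerunning gseKEGG with a smaller minGSSize might return more specific pathways, but I have not verified this on my data.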

My second thought is to use a closely related genome that is already annotated in KEGG and therefore listed at https://www.genome.jp/kegg/catalog/org_list.html. However, I don't have a gene mapping between that reference and my sequences.

For example, when I try:

gse_kegg <- gseKEGG(
  geneList = geneList,
  organism = 'mfol',
  keyType      = 'kegg',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)

As expected, I get the error "Expected input gene ID: ,DXT68_06835,DXT68_14490", because my gene IDs do not match the KEGG gene IDs of the reference.

Does anybody have an idea how I can use the organism mfol with my input genes? How can I map my genes to KEGG gene IDs such as DXT68_15070?
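One route I am considering is to go through the KO annotations I already have: join my gene-to-KO table to a KO-to-reference-gene table and rename the ranked list accordingly. The sketch below uses base R and toy data only. The gene names, KO assignments and statistics are invented (only the two DXT68 locus tags come from the error message), and I am assuming that such a KO-to-gene table could be obtained from something like KEGGREST::keggLink("mfol", "ko"), which I have not tested.

```r
# Hypothetical ranked statistic, named by my own gene IDs.
my_stats <- c(geneA = 2.1, geneB = -1.4, geneC = 0.7, geneD = -3.0)

# Hypothetical gene -> KO annotation for my genome.
my_gene2ko <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  ko   = c("K00001", "K00002", "K00001", "K00003"),
  stringsAsFactors = FALSE
)

# KO -> reference gene table. Assumption: in practice this might come from
# KEGGREST::keggLink("mfol", "ko") (not run here); these pairs are invented.
ko2ref <- data.frame(
  ko  = c("K00001", "K00002"),
  ref = c("DXT68_06835", "DXT68_14490"),
  stringsAsFactors = FALSE
)

# Join on KO; genes whose KO has no reference gene (geneD) are dropped.
mapped <- merge(my_gene2ko, ko2ref, by = "ko")
mapped$stat <- unname(my_stats[mapped$gene])

# Several of my genes can hit the same reference gene. As one possible
# choice, keep the statistic with the largest absolute value.
pick_strongest <- function(x) x[which.max(abs(x))]
collapsed <- tapply(mapped$stat, mapped$ref, pick_strongest)

# Named, decreasingly sorted vector in the shape gseKEGG expects.
ref_gene_list <- sort(
  setNames(as.numeric(collapsed), names(collapsed)),
  decreasing = TRUE
)
print(ref_gene_list)
```

The resulting vector could then presumably be passed as geneList to gseKEGG with organism = 'mfol'. Sharing a KO is only an approximation of orthology, and the collapsing rule is an arbitrary choice, so I would treat the results with caution.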

Any help would be greatly appreciated.

clusterProfiler KEGG GeneSetEnrichment