KEGG enrichment analysis for a non-model organism
@2f24e23b

I am working with RNA-seq data from a non-model organism, Tenualosa ilisha. De novo assembly was performed using Trinity. For KEGG enrichment analysis, I first ran blastx against the KEGG database; the BLAST output file includes a column containing KO IDs. I also performed differential expression analysis of the transcripts. So I have two files: one containing transcript IDs and KO IDs, and the other containing the differential expression results for the transcripts. How can I go forward with KEGG enrichment analysis from these? Can anyone please provide a complete pipeline, with code and a step-by-step process?


Some comments: It looks like you have a challenge similar to the one recently posted here: Result from GSEA for non-model organism not as expected. Please read that thread completely.

All gene set analyses, whether the overrepresentation analysis (ORA) or the gene set enrichment analysis (GSEA) variant, assume that the input is a (ranked) list of genes, not of transcripts.

It is also expected that all input IDs are unique. Again, see the thread referred to above.
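
To illustrate the two points above, here is a minimal sketch (in Python, only to show the logic) of collapsing per-transcript results to one unique entry per gene. It assumes Trinity-style transcript IDs, where the gene ID is the transcript ID without the trailing _i<N> isoform suffix. The IDs, fold changes, and p-values are made up, and keeping the isoform with the smallest adjusted p-value is just one possible heuristic, not a recommendation from this thread; re-running the differential expression analysis on gene-level counts is generally the cleaner route.

```python
import re

# Hypothetical per-transcript DE results:
# (transcript_id, log2 fold change, adjusted p-value). All values are made up.
transcript_results = [
    ("TRINITY_DN100_c0_g1_i1", 2.1, 0.001),
    ("TRINITY_DN100_c0_g1_i2", 1.8, 0.020),
    ("TRINITY_DN200_c1_g2_i1", -0.3, 0.700),
    ("TRINITY_DN300_c0_g1_i4", -2.5, 0.004),
]


def trinity_gene_id(transcript_id):
    """Strip the trailing isoform suffix (_i<N>) from a Trinity transcript ID."""
    return re.sub(r"_i\d+$", "", transcript_id)


def collapse_to_genes(results):
    """Keep one row per gene: the isoform with the smallest adjusted p-value.

    This is an assumed heuristic for illustration only.
    """
    best = {}
    for transcript_id, lfc, padj in results:
        gene = trinity_gene_id(transcript_id)
        if gene not in best or padj < best[gene][1]:
            best[gene] = (lfc, padj)
    return best


gene_results = collapse_to_genes(transcript_results)

# Unique gene IDs passing an (assumed) adjusted p-value cutoff of 0.05.
de_genes = sorted(g for g, (lfc, padj) in gene_results.items() if padj < 0.05)
print(de_genes)
```

The keys of gene_results are unique by construction, so the same table can serve both as the background (universe) and, after filtering, as the list of differentially expressed genes.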

Hope this helps!


As I understand it, clusterProfiler requires an OrgDb, but since I am working with a non-model organism, no such database is available. What should I do in this case?


No, then you misunderstood: clusterProfiler does NOT necessarily need an OrgDb!

It basically needs 2 inputs: a data.frame for the argument TERM2GENE, and a data.frame for the argument TERM2NAME. These are then used by the generic ORA function enricher, or the generic GSEA function GSEA. This is what is highlighted in the recent thread I referred to above, in which (also?) KO/ko ids are used.

The functions enrichKEGG and gseKEGG are rather convenience functions that allow you to easily perform a KEGG-based ORA or GSEA analysis, respectively, for organisms for which an OrgDb is available.

You may also want to check the section on 'Universal enrichment analysis' here, or my post here: what the test method for enrichGO in clusterProfiler?.
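
As a rough illustration of what the two tables look like: TERM2GENE has the term (pathway) ID in the first column and the gene ID in the second, and TERM2NAME has the term ID in the first column and a readable name in the second. The sketch below (in Python, only to show the layout) builds both from a gene-to-KO table and a KO-to-pathway table. The gene IDs, KO IDs, and KO-to-pathway links here are placeholders, not real KEGG links; in practice the links would have to come from KEGG itself. Whatever ID type ends up in the gene column must be the same ID type used in the DE list.

```python
# Hypothetical gene -> KO annotation (e.g. derived from the blastx step).
gene2ko = {
    "gene_A": "KO_x",
    "gene_B": "KO_y",
    "gene_C": "KO_x",
}

# Placeholder KO -> pathway links (NOT real KEGG links, for illustration only).
ko2pathway = {
    "KO_x": ["ko00010", "ko00020"],
    "KO_y": ["ko00010"],
}

# Pathway ID -> readable name.
pathway_names = {
    "ko00010": "Glycolysis / Gluconeogenesis",
    "ko00020": "Citrate cycle (TCA cycle)",
}


def build_term2gene(gene2ko, ko2pathway):
    """Return unique (term, gene) rows, term first, as TERM2GENE expects."""
    rows = set()
    for gene, ko in gene2ko.items():
        for pathway in ko2pathway.get(ko, []):
            rows.add((pathway, gene))
    return sorted(rows)


def to_tsv(header, rows):
    """Format rows as tab-separated text with a header line."""
    lines = ["\t".join(header)] + ["\t".join(row) for row in rows]
    return "\n".join(lines)


term2gene = build_term2gene(gene2ko, ko2pathway)
term2name = sorted(pathway_names.items())

print(to_tsv(("term", "gene"), term2gene))
print(to_tsv(("term", "name"), term2name))
```

Written to files, the two tab-separated tables could be read into R with read.delim and passed as TERM2GENE and TERM2NAME.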


Thanks a lot for your reply! Can you please tell me about the format of these two data.frames? I want to know what these two files are. I am a beginner in this field, so I am facing many problems with this.


I have the following two files.

File 1 (from the blastx result against the KEGG database):

swissprot_id    blasx_hit
Q86UKO          K10093
P4I17O          K06453

File 2 contains the DEG results. What is the next step I should take to perform KEGG enrichment analysis?

@james-w-macdonald-5106

If you look at the KEGG species list, you won't find your species listed there, which will make it difficult to do anything. You need some sort of mapping from the IDs you have in hand to KEGG or GO identifiers. If you can come up with that sort of mapping, you can always use the kegga function in limma, which allows you to provide a Gene/ID mapping data.frame. Unfortunately, working with non-model organisms can be difficult.
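
For intuition about what an overrepresentation test does once such a mapping exists, here is a minimal sketch (in Python, standard library only) of a one-sided hypergeometric test per pathway. It is not limma's or clusterProfiler's implementation; the gene IDs and pathway names are made up, the function names are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def ora(de_genes, universe, pathway2genes):
    """Return (pathway, overlap, pathway size, p-value) rows sorted by p-value."""
    universe = set(universe)
    de = set(de_genes) & universe
    results = []
    for pathway, members in pathway2genes.items():
        members = set(members) & universe
        overlap = de & members
        p = hypergeom_upper_tail(len(overlap), len(members), len(de), len(universe))
        results.append((pathway, len(overlap), len(members), p))
    return sorted(results, key=lambda row: row[3])


# Made-up example: 20 background genes, two pathways, four DE genes.
universe = [f"g{i}" for i in range(1, 21)]
pathway2genes = {
    "pathway_A": [f"g{i}" for i in range(1, 6)],
    "pathway_B": [f"g{i}" for i in range(6, 11)],
}
de_genes = ["g1", "g2", "g3", "g6"]

results = ora(de_genes, universe, pathway2genes)
for pathway, overlap, size, p in results:
    print(f"{pathway}\toverlap={overlap}\tsize={size}\tp={p:.4f}")
```

The choice of universe matters: it should contain only the genes that could have been detected as differentially expressed, not every gene that happens to carry an annotation.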