Function kegga not working for species other than the conventional ones - Pathways do not overlap with universe
Raito92 ▴ 60
@raito92-20399
Last seen 22 months ago
Italy

Hi! I'm running the function 'kegga' to perform a pathway analysis on my olive tree data (a species for which a default package doesn't exist), but I get this error...

[Image: screenshot of the error message]

I think the problem may be my Gene names, which aren't normal NCBI Gene IDs...

https://i.ibb.co/yBWCC2q/Gene-Names.png

Any suggestions? Thanks in advance!

If needed, here is my sessionInfo() -> https://i.ibb.co/tLkSqmH/session-Info.png

kegga limma pathway analysis kegg • 1.6k views
@james-w-macdonald-5106
Last seen 17 hours ago
United States

You need to map those to NCBI Gene IDs, which is what KEGG uses. A quick Google search of a few of your gene names didn't bring anything up, so presumably you know what those IDs are and can figure out how to map them. If not, you need to find someone who does, which would probably be somewhere other than here.


Hi, thanks for your answer! Yes, I realised later that the answer could be to use the NCBI codes of the genome that mine was annotated against (the wild olive tree, whereas mine is the domestic olive tree). So it looks as if all I need to do is merge some data to obtain the NCBI codes first, and then pass those codes to kegga so that it processes them as if they came from wild olive!
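The merge step could look roughly like the sketch below. It is written in Python only to illustrate the logic; the same join is easy to do in R before calling kegga. The function name `translate_ids`, the mapping table and all gene IDs are hypothetical. Stripping the `oeu:` prefix is an assumption: as far as I can tell kegga matches on bare NCBI Gene IDs, but it is worth checking against the output of `getGeneKEGGLinks("oeu")`. The same translation has to be applied to both the DE list and the universe, otherwise the two will not overlap with the pathways.

```python
def translate_ids(gene_ids, mapping, prefix="oeu:"):
    """Map project-specific gene IDs to bare NCBI Gene IDs.

    Genes without a mapping are dropped, and duplicates are removed,
    since several domestic-olive genes may map to the same wild-olive gene.
    """
    translated = []
    seen = set()
    for gene in gene_ids:
        target = mapping.get(gene)
        if target is None:
            continue  # no wild-olive counterpart known for this gene
        if target.startswith(prefix):
            target = target[len(prefix):]  # assumption: kegga wants bare IDs
        if target not in seen:
            seen.add(target)
            translated.append(target)
    return translated


# Hypothetical mapping table and gene lists, for illustration only
mapping = {
    "OE6A000001": "oeu:111000002",
    "OE6A000002": "oeu:111000003",
    "OE6A000003": "oeu:111000002",  # maps to the same gene as OE6A000001
}
universe = ["OE6A000001", "OE6A000002", "OE6A000003", "OE6A000004"]
de_genes = ["OE6A000003", "OE6A000004"]

universe_ncbi = translate_ids(universe, mapping)
de_ncbi = translate_ids(de_genes, mapping)
print(universe_ncbi)
print(de_ncbi)
```

The two translated vectors are what would then go into kegga as `de` and `universe`.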

@gordon-smyth
Last seen 56 minutes ago
WEHI, Melbourne, Australia

You don't explain which genome build or what annotation you have used, so I am going to make some guesses.

I am guessing that you are using the Oe6 de novo olive genome build http://denovo.cnag.cat/olive published by the Spanish CNAG team just a couple of years ago.

I am also guessing that you have used the Oe6 GFF3 gene annotation file from the same site, which provides gene Ids using their own system.

Since CNAG's genome build is so recent, and since they use their own gene Ids unique to the Oe6 project, there is no way that you can expect Bioconductor or KEGG to know what the gene Ids mean. You need to obtain gene annotation information from CNAG's own website. If they don't cross reference their own Gene Ids to NCBI or Ensembl or GO then no one else will be able to do so.

In my opinion, you should make sure that you know what the Gene Ids themselves mean before you try to do any functional analysis such as GO or KEGG. CNAG provides a directory of functional annotation files, and these files are presumably what you need to use. You have to explore the files for yourself and contact the CNAG people if you need help.


Hello, first of all, thanks for your kind and detailed answer! Yes, you guessed it right, I used the domestic olive tree genomic annotation from the Spanish team you mentioned!

I realised later that there is, indeed, a way to relate these genes to the wild olive tree ones, and it is contained in the annotation files made available by the team. Most of the genes were annotated on the basis of a BLAST analysis against wild olive tree, so it's relatively easy to retrieve the codes of the wild olive tree genes they were annotated against.

This way, I can probably change the kegga input slightly so that the function uses the oeu: codes rather than the OE6A ones, which were adopted only by the CNAG team, and then run kegga normally with species.KEGG="oeu". The matches may look as follows (just an example with a couple of genes).

[Image: example matches between OE6A genes and oeu: genes]

There are many matches for each OE6A gene, but I guess I'll stick to the one showing the highest BLAST score...
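A minimal sketch of that best-hit selection, in Python for illustration, assuming the matches are available as (OE6A gene, oeu gene, BLAST score) rows. The function name `best_hits` and the IDs and scores below are made up. When two hits tie on score, this version keeps whichever comes first in the input.

```python
def best_hits(rows):
    """For each query gene, keep the target gene with the highest score."""
    best = {}
    for query, target, score in rows:
        # Strict ">" means the first-seen hit wins when scores are tied
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


# Hypothetical BLAST matches, for illustration only
rows = [
    ("OE6A000001", "oeu:111000001", 250.0),
    ("OE6A000001", "oeu:111000002", 310.5),
    ("OE6A000002", "oeu:111000003", 98.2),
]

mapping = best_hits(rows)
print(mapping)
```

Taking only the top hit is a simplification: it does not check whether the best score is actually a convincing match, so a minimum score or e-value cutoff may be worth adding.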

I hope it will work, thank you a lot!
