Hi! I'm running the function 'kegga' to perform a pathway analysis on my olive tree data (a species for which a default package doesn't exist), but I get this error...
I think the problem may be my Gene names, which aren't normal NCBI Gene IDs...
You need to map those to NCBI Gene IDs, which is what KEGG uses. A quick Google search of a few of your gene names didn't bring anything up, so presumably you know what those IDs are, and can figure out how to map them. If not, you need to find someone who does, which would probably be somewhere other than here.
Hi, thanks for your answer!
Yes, I realised later that the answer could be to use the NCBI IDs of the genome mine was annotated against (the wild olive tree, whereas mine is the domestic olive tree) ... so it looks as if all I need to do is merge some data to obtain the NCBI IDs first, and then pass those new IDs to kegga so it can process them as if they were from wild olive!
You don't explain which genome build or what annotation you have used, so I am going to make some guesses.
I am guessing that you are using the Oe6 de novo olive genome build http://denovo.cnag.cat/olive published by the Spanish CNAG team just a couple of years ago.
I am also guessing that you have used the Oe6 GFF3 gene annotation file from the same site, which provides gene Ids using their own system.
Since CNAG's genome build is so recent, and since they use their own gene Ids unique to the Oe6 project, there is no way that you can expect Bioconductor or KEGG to know what the gene Ids mean. You need to obtain gene annotation information from CNAG's own website. If they don't cross reference their own Gene Ids to NCBI or Ensembl or GO then no one else will be able to do so.
In my opinion, you should make sure that you know what the Gene Ids themselves mean before you try to do any functional analysis such as GO or KEGG. CNAG provides a directory of functional annotation files, and these files are presumably what you need to use. You have to explore the files for yourself and contact the CNAG people if you need help.
Hello, first of all, thanks for your kind and detailed answer!
Yes, you guessed it right, I used the domestic olive tree genomic annotation from the Spanish team you mentioned!
I realised later that there is, indeed, a way to relate these genes to the wild olive tree ones, and it is contained in the annotation files made available by the team.
Most of the genes were annotated on the basis of a BLAST analysis against wild olive tree, so it's relatively easy to retrieve the code of the corresponding wild olive tree genes they were annotated on.
This way, I can probably change the kegga input a little so that the function uses the oeu: codes rather than the OE6A ones, which are used only by the team, and then run kegga normally with species.KEGG="oeu".
The matches may look as follows (just an example with a couple of genes).
There are many matches for each OE6A gene, but I guess I'll stick to the one showing the highest BLAST score...
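A rough sketch of the "keep the highest-scoring hit per gene" step in base R, in case it helps. Everything here is made up for illustration: the column names (`oe6_id`, `oeu_id`, `bitscore`), the gene IDs and the scores are hypothetical placeholders, not real CNAG or KEGG entries. It also assumes that the part after `oeu:` is the NCBI Gene ID and that kegga wants the IDs without that prefix, which is worth checking against what `getGeneKEGGLinks("oeu")` returns.

```r
# Hypothetical BLAST hit table: one row per (OE6A gene, wild olive hit).
# IDs and scores are invented placeholders, not real annotations.
hits <- data.frame(
  oe6_id   = c("OE6A000001", "OE6A000001", "OE6A000002", "OE6A000002", "OE6A000003"),
  oeu_id   = c("oeu:100000001", "oeu:100000002", "oeu:100000003", "oeu:100000001", "oeu:100000004"),
  bitscore = c(250, 480, 310, 120, 95),
  stringsAsFactors = FALSE
)

# Sort by gene, then by descending score, and keep the first row per gene,
# i.e. the best-scoring wild olive hit for each OE6A gene.
hits <- hits[order(hits$oe6_id, -hits$bitscore), ]
best <- hits[!duplicated(hits$oe6_id), ]

# Assumption: the ID after the "oeu:" prefix is the NCBI Gene ID.
best$entrez <- sub("^oeu:", "", best$oeu_id)

# Named lookup vector: OE6A gene ID -> NCBI Gene ID of its best hit.
oe6_to_entrez <- setNames(best$entrez, best$oe6_id)

# Hypothetical list of differentially expressed OE6A genes.
de_oe6 <- c("OE6A000001", "OE6A000003")

# Translate the DE genes and the universe, dropping genes with no hit.
de_entrez <- unname(oe6_to_entrez[de_oe6])
de_entrez <- unique(de_entrez[!is.na(de_entrez)])
universe_entrez <- unique(best$entrez)

# The enrichment call itself needs limma and internet access to KEGG,
# so it is left commented out here:
# library(limma)
# k <- kegga(de_entrez, universe = universe_entrez, species.KEGG = "oeu")
# topKEGG(k)
```

One thing to keep in mind: several OE6A genes can end up with the same wild olive gene as their best hit, so after `unique()` the translated lists may be shorter than the original ones. Passing the translated universe explicitly keeps the test restricted to genes that could actually be mapped.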