I'm trying to run pathway analysis (KEGG, GSEA, GO) on RNA-seq data. My organism is Bacteroides thetaiotaomicron, and I'm having trouble finding an annotation database. AnnotationHub has one record (an Inparanoid8Db object) that, from what I can find, is old and not usable with pathway analysis packages:
> library(AnnotationHub)
> hub <- AnnotationHub()
snapshotDate(): 2020-09-03
> query(hub, "Bacteroides thetaiotaomicron")
AnnotationHub with 1 record
# snapshotDate(): 2020-09-03
# names(): AH10465
# $dataprovider: Inparanoid8
# $species: Bacteroides thetaiotaomicron
# $rdataclass: Inparanoid8Db
# $rdatadateadded: 2014-03-31
# $title: hom.Bacteroides_thetaiotaomicron.inp8.sqlite
# $description: Inparanoid 8 annotations about Bacteroides thetaiotaomicron
# $taxonomyid: 226186
# $genome: inparanoid8 genomes
# $sourcetype: Inparanoid
# $sourceurl: http://inparanoid.sbc.su.se/download/current/Orthologs/B.thetaiotaomicron
# $sourcesize: NA
# $tags: c("Inparanoid", "Gene", "Homology", "Annotation")
# retrieve record with 'object[["AH10465"]]'
I also tried creating an annotation package with AnnotationForge, which downloaded files that seemingly cover all organisms and then produced an error message:
> makeOrgPackageFromNCBI(version = "0.1",
+                        author = "Some One <firstname.lastname@example.org>",
+                        maintainer = "Some One <email@example.com>",
+                        outputDir = getwd(),
+                        tax_id = "226186",
+                        genus = "Bacteroides",
+                        species = "thetaiotaomicron")
Error in prepareDataFromNCBI(tax_id, NCBIFilesDir, outputDir, rebuildCache, :
  no information found for species with tax id 226186
Another problem is that the NCBI entry for Bacteroides thetaiotaomicron has been "re-annotated". My experimental data identifies genes by their new locus tags, but all the pathway analysis packages require KEGG, ncbi-geneid, ncbi-proteinid or UniProt IDs as input. I don't know how to convert the IDs, because the conversion tools do not recognize the new locus tags.
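To make the ID mismatch concrete, here is a minimal base-R sketch of one possible mapping route. It assumes that the re-annotated RefSeq GFF3 file records the previous identifier in an `old_locus_tag` attribute on gene features, and that the target database still keys on those old tags; both points would need checking for this genome. The GFF lines, locus tags, and the helper `get_attr` are hypothetical and only illustrate the shape of the data.

```r
# Hypothetical GFF3 gene lines; sequence ID, coordinates and locus tags are made up.
gff_lines <- c(
  "NC_000000.1\tRefSeq\tgene\t1\t900\t.\t+\t.\tID=gene-BT_RS00005;locus_tag=BT_RS00005;old_locus_tag=BT_0001",
  "NC_000000.1\tRefSeq\tgene\t1000\t1800\t.\t-\t.\tID=gene-BT_RS00010;locus_tag=BT_RS00010;old_locus_tag=BT_0002",
  "NC_000000.1\tRefSeq\tgene\t2000\t2600\t.\t+\t.\tID=gene-BT_RS00015;locus_tag=BT_RS00015"
)

# Parse the nine tab-separated GFF3 columns (V1..V9); with a real file,
# pass the file path instead of `text = gff_lines`.
gff <- read.delim(text = gff_lines, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)
genes <- gff[gff$V3 == "gene", ]

# Hypothetical helper: extract one key=value attribute from GFF3 column 9.
# Returns NA where the key is absent. The key must start the string or
# follow a ';', so "locus_tag" does not match inside "old_locus_tag".
get_attr <- function(attrs, key) {
  pattern <- paste0("^(?:.*;)?", key, "=([^;]*).*$")
  hit <- grepl(pattern, attrs, perl = TRUE)
  out <- rep(NA_character_, length(attrs))
  out[hit] <- sub(pattern, "\\1", attrs[hit], perl = TRUE)
  out
}

# Two-column lookup table: new locus tag -> old locus tag.
id_map <- data.frame(
  new_locus_tag = get_attr(genes$V9, "locus_tag"),
  old_locus_tag = get_attr(genes$V9, "old_locus_tag"),
  stringsAsFactors = FALSE
)

# Translate new-style tags from an experiment; genes without an old tag give NA.
my_genes <- c("BT_RS00010", "BT_RS00015")
converted <- id_map$old_locus_tag[match(my_genes, id_map$new_locus_tag)]
print(data.frame(new = my_genes, old = converted))
```

Genes that only exist in the new annotation have no old tag and come back as `NA`, so they would drop out of any analysis keyed on the old IDs. A real file may also list several old tags for one gene in a single attribute value, which this sketch does not split.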
How can I get around these issues to download or create an annotation database? And what can I do about the mismatch between IDs? Thanks!