I am trying to use pcaExplorer's pca2go() function for functional enrichment analysis on genes with the highest principal component loadings. The function fails with the following error:
Ranking genes by the loadings ...
Ranking genes by the loadings ... done!
Extracting functional categories enriched in the gene subsets ...
Building most specific GOs .....
Error in result_create(conn@ptr, statement) : no such table: go_bp
The cause is that I am using a custom OrgDb that lacks the go_bp table. The custom OrgDb, which I made for a non-model organism using AnnotationForge::makeOrgPackageFromNCBI(), has these tables:
tbls: accessions, alias, entrez_genes, gene_info, genes, go, go_all,
map_counts, map_metadata, metadata, pubmed, refseq
In particular, it does not include go_bp, which explains the error in pca2go. In contrast, the standard org.Hs.eg.db includes these tables (among many others):
go, go_all, go_bp, go_bp_all, go_cc, go_cc_all, go_mf, go_mf_all
My questions:
1) What is the difference between the go and go_bp tables? My understanding is that we are encouraged to use select(), columns() and keytypes(), but working with a non-model organism, and getting an error from pca2go about a particular missing table, is leading me down the rabbit hole of wanting to understand the tables (I am a Bioconductor novice).
2) And more practically, how can I create a custom OrgDb package which contains go_bp?

Hi James and @psutton, here I am.
In the specific case of pca2go, I am calling an underlying routine that is based on the good old topGO package. That package, in turn, expects to be given one of the three main ontologies to compute the enriched functions.
I might be wrong, but there is currently no workaround for that (James, please do correct me if that is not the case and I can indeed change something in my implementation of pca2go).
An alternative, although it might deliver more generic terms, would be to use limmaquickpca2go, which uses limma::goana. That might avoid the issue you have?
Federico

You could talk to Adrian Alexa about patching topGO to use the go table when the go_bp table doesn't exist.

Will do, thanks.
Federico
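
The fallback idea (use the go table when go_bp doesn't exist) can be sketched as follows. This is only an illustration in Python's standard sqlite3 module against a toy in-memory database, not topGO's actual code. The column names (_id, GO, EVIDENCE, ONTOLOGY in go; _id, go_id, evidence in go_bp) and the function name fetch_bp_annotations are assumptions and may not match a given OrgDb.

```python
import sqlite3


def fetch_bp_annotations(conn):
    """Return sorted (gene, GO term, evidence) rows for Biological Process.

    Prefers a dedicated go_bp table; if it is absent, falls back to
    filtering the combined go table on its ontology column.
    """
    has_go_bp = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name = 'go_bp'"
    ).fetchone() is not None

    if has_go_bp:
        # Assumed layout of the per-ontology table.
        query = "SELECT _id, go_id, evidence FROM go_bp"
    else:
        # Assumed layout of the combined table in a custom OrgDb.
        query = "SELECT _id, GO, EVIDENCE FROM go WHERE ONTOLOGY = 'BP'"
    return sorted(conn.execute(query).fetchall())


# Toy database that, like the custom OrgDb, has go but no go_bp.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE go (_id INTEGER, GO TEXT, EVIDENCE TEXT, ONTOLOGY TEXT)"
)
conn.executemany(
    "INSERT INTO go VALUES (?, ?, ?, ?)",
    [
        (1, "GO:0006915", "IEA", "BP"),
        (1, "GO:0005634", "IDA", "CC"),
        (2, "GO:0008150", "ND", "BP"),
    ],
)

rows = fetch_bp_annotations(conn)
print(rows)  # only the two BP rows are returned
```

The same check could in principle be done from R with DBI on the OrgDb connection, but the exact field names would need to be verified against the real database first.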

I was able to modify AnnotationForge so that it would create the go_bp, go_cc, go_mf, go_bp_all, go_cc_all, and go_mf_all tables for me. However, it is a bit hacky, because there were a number of issues with the exact spelling and case of field names in the SQLite tables, which I partly describe here: https://support.bioconductor.org/p/118859/ .
Anyhow, I now have pca2go/topGO working for my custom OrgDb, but it is not a very general solution, so it would still be useful to other users to have a patch to use the go table when go_bp doesn't exist.

Hi Federico,
I tried limmaquickpca2go, but it failed with an error message, and looking at the goana.default code where it fails, it is looking for an org.<species>.egGO2ALLEGS object.
Human has org.Hs.egGO2ALLEGS, but the non-model organism OrgDb that I made using AnnotationForge::makeOrgPackageFromNCBI() does not. I don't know how to make such a mapping or how to include it in a custom OrgDb package.

Sorry for the confusion in pointing towards limmaquickpca2go.
I checked now and noticed that goana has built-in support for a few species, but custom annotation packages can still be provided. The problem then is that you would need the egGO2ALLEGS mapping, and it may be that an OrgDb built with that method does not include it.
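
For orientation, a GO2ALLEGS-style mapping is conceptually a lookup from each GO ID to the genes annotated to it. The toy Python sketch below shows that inversion from gene-to-GO pairs. It is not how AnnotationDbi builds the object, and it does not produce something goana can consume. The gene IDs are made up, the helper name invert_gene_to_go is hypothetical, and it assumes the input pairs already include ancestor terms (as a go_all-style table presumably would).

```python
from collections import defaultdict


def invert_gene_to_go(pairs):
    """Turn (gene_id, go_id) pairs into {go_id: sorted list of gene_ids}."""
    go_to_genes = defaultdict(set)
    for gene_id, go_id in pairs:
        go_to_genes[go_id].add(gene_id)
    return {go_id: sorted(genes) for go_id, genes in go_to_genes.items()}


# Made-up gene IDs; the pairs are assumed to include ancestor terms already.
pairs = [
    ("101", "GO:0006915"),
    ("102", "GO:0006915"),
    ("101", "GO:0008150"),
    ("102", "GO:0008150"),
    ("103", "GO:0008150"),
]

go_to_genes = invert_gene_to_go(pairs)
print(go_to_genes["GO:0008150"])
```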