I am trying to use pcaExplorer's pca2go() function for functional enrichment analysis on genes with the highest principal component loadings. The function fails with the following error:
Ranking genes by the loadings ...
Ranking genes by the loadings ... done!
Extracting functional categories enriched in the gene subsets ...
Building most specific GOs .....
Error in result_create(conn@ptr, statement) : no such table: go_bp
The problem is that I am using a custom OrgDb which doesn't have the go_bp table. The custom OrgDb, which I made for a non-model organism using AnnotationForge::makeOrgPackageFromNCBI(), has these tables:
tbls: accessions, alias, entrez_genes, gene_info, genes, go, go_all,
map_counts, map_metadata, metadata, pubmed, refseq
In particular, it does not include go_bp, which explains the error in pca2go. In contrast, the standard org.Hs.eg.db includes these tables (among many others):
go, go_all, go_bp, go_bp_all, go_cc, go_cc_all, go_mf, go_mf_all
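For anyone wanting to reproduce this kind of listing, here is a minimal sketch of how the GO-related tables of an OrgDb can be inspected. It uses org.Hs.eg.db so that it runs as-is; the idea is to swap in the custom OrgDb object. The variable names (con, tbls) are just illustrative.

```r
library(AnnotationDbi)
library(DBI)
library(org.Hs.eg.db)

# Connection to the SQLite database behind the OrgDb object.
# (Do not disconnect it; the package manages this connection.)
con <- dbconn(org.Hs.eg.db)

# All tables and views, then only the GO-related ones
tbls <- dbListTables(con)
grep("^go", tbls, value = TRUE)

# Is the table that pca2go/topGO complains about present?
"go_bp" %in% tbls

# Field names of that table; their exact spelling and case can matter
dbListFields(con, "go_bp")
```

Running the same calls on a custom OrgDb should show which of these tables are missing.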
My questions:
1) What is the difference between the go and go_bp tables? My understanding is that we are encouraged to use select(), columns(), and keytypes(), but working with a non-model organism, and getting an error message from pca2go about a particular missing table, is leading me down the rabbit hole of wanting to understand the tables (I am a Bioconductor novice).
2) And more practically, how can I create a custom OrgDb package which contains go_bp?
Hi James and @psutton, here I am.
In the specific case of pca2go, I am calling an underlying routine that is based on the good old topGO package. That package, in turn, expects to be given one of the three main ontologies to compute the enriched functions.
I might be wrong, but there is currently no workaround for that (James, please do correct me if that is not the case and there is something I can change in my implementation of pca2go).
An alternative, although it might deliver more generic terms, would be to use limmaquickpca2go, which uses limma::goana. That might avoid the issue you have?
Federico
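Outside of pca2go, one possible way around the missing go_bp table is to call topGO directly with a gene-to-GO list built through select(), so that no specific SQLite table is needed. This is only a sketch under assumptions, not what pca2go does internally: org.Hs.eg.db is used as a stand-in so the code runs, the custom OrgDb is assumed to expose ENTREZID, GO and ONTOLOGY (check with keytypes() and columns()), and genes_of_interest is a hypothetical placeholder for the top-loading genes.

```r
library(AnnotationDbi)
library(topGO)
library(org.Hs.eg.db)  # stand-in; replace with the custom OrgDb package

orgdb <- org.Hs.eg.db

# Gene-to-GO annotations via the select() interface instead of raw tables
go_map <- AnnotationDbi::select(
  orgdb,
  keys = keys(orgdb, keytype = "ENTREZID"),
  columns = c("GO", "ONTOLOGY"),
  keytype = "ENTREZID"
)

# Keep only Biological Process annotations and build a gene -> GO IDs list
go_bp <- go_map[!is.na(go_map$ONTOLOGY) & go_map$ONTOLOGY == "BP", ]
gene2GO <- lapply(split(go_bp$GO, go_bp$ENTREZID), unique)

# Hypothetical gene set; in practice, the genes with the highest loadings
genes_of_interest <- head(names(gene2GO), 200)

# topGO expects a named factor: 1 = gene of interest, 0 = background
all_genes <- factor(as.integer(names(gene2GO) %in% genes_of_interest))
names(all_genes) <- names(gene2GO)

go_data <- new("topGOdata",
               ontology = "BP",
               allGenes = all_genes,
               annot = annFUN.gene2GO,
               gene2GO = gene2GO)

res <- runTest(go_data, algorithm = "elim", statistic = "fisher")
GenTable(go_data, elimFisher = res, topNodes = 10)
```

The GO graph structure itself comes from GO.db, so only the gene annotations have to come from the custom OrgDb.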
You could talk to Adrian Alexa about patching topGO to use the go table when the go_bp table doesn't exist.
Will do, thanks.
Federico
I was able to modify AnnotationForge so that it would create go_bp, go_cc, go_mf, go_bp_all, go_cc_all, and go_mf_all tables for me. However, it is a bit hacky, because there were a number of issues with the exact spelling and case of field names in the SQLite tables, which I partly describe here: https://support.bioconductor.org/p/118859/ .
Anyhow, I now have pca2go/topGO working for my custom OrgDb, but it is not a very general solution, so it would still be useful to other users to have a patch to use the go table when go_bp doesn't exist.
Hi Federico,
I tried limmaquickpca2go, but it failed with an error message:
and looking at the goana.default code where it fails, it is looking for an org.<species>.egGO2ALLEGS object:
Human has org.Hs.egGO2ALLEGS, but the non-model organism OrgDb that I made using AnnotationForge::makeOrgPackageFromNCBI() does not. I don't know how to make such a mapping and how to include it in a custom OrgDb package.
Sorry for the confusion in pointing towards limmaquickpca2go.
I checked now and noticed that goana has built-in support for a few species, but custom annotation packages can still be provided. The problem is that it then needs the egGO2ALLEGS mapping, and it may be that an OrgDb built with that method does not include that specific table.
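If the egGO2ALLEGS mapping is the only missing piece, a similar gene-to-term table can probably be assembled with select() and passed to limma::kegga(), which accepts user-supplied annotation through its gene.pathway and pathway.names arguments. Note that this swaps goana for kegga and is not what limmaquickpca2go does. The sketch below uses org.Hs.eg.db as a stand-in so it runs, assumes the custom OrgDb exposes ENTREZID, GOALL and ONTOLOGYALL (check with columns()), and uses a hypothetical genes_of_interest vector.

```r
library(AnnotationDbi)
library(limma)
library(GO.db)
library(org.Hs.eg.db)  # stand-in; replace with the custom OrgDb package

orgdb <- org.Hs.eg.db

# Gene -> GO term links including ancestor terms (the "ALL" mapping)
go_all <- AnnotationDbi::select(
  orgdb,
  keys = keys(orgdb, keytype = "ENTREZID"),
  columns = c("GOALL", "ONTOLOGYALL"),
  keytype = "ENTREZID"
)
keep <- !is.na(go_all$GOALL) & go_all$ONTOLOGYALL == "BP"
gene_go <- unique(go_all[keep, c("ENTREZID", "GOALL")])

# Human-readable term names from GO.db
go_names <- AnnotationDbi::select(
  GO.db,
  keys = unique(gene_go$GOALL),
  columns = "TERM",
  keytype = "GOID"
)

# Hypothetical gene set; in practice, the genes with the highest loadings
genes_of_interest <- head(unique(gene_go$ENTREZID), 200)

# Over-representation test with user-supplied annotation
res <- kegga(genes_of_interest,
             universe = unique(gene_go$ENTREZID),
             gene.pathway = gene_go,
             pathway.names = go_names)
topKEGG(res, number = 10)
```

This avoids having to add anything to the custom OrgDb package, at the cost of doing the enrichment outside pcaExplorer.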