Dear Guangchuang,
I would really like to use your GOSemSim package to calculate similarities between GO terms for several custom genome annotations I work with (parasitic nematodes, butterflies, etc.). It sounded straightforward: use AnnotationForge to build an org.Xxx.eg.db package, and GOSemSim should then work with it. However, GOSemSim seems to expect ENTREZID to be the central ID for all annotation, whereas AnnotationForge, to keep things generic, uses a key called GID. Even adding ENTREZID as an extra field does not get GOSemSim to work.
I have described the problem in another post, thinking perhaps the problem was with AnnotationForge: [Question: AnnotationForge not working for building custom org packages][1]
A quick recap with a worked example that reproduces the problem: build the example package from the makeOrgPackage documentation, install it, then call godata() on the new package. The call fails. Code follows:
library(AnnotationForge)
example(makeOrgPackage)
install.packages("./org.Tguttata.eg.db", type = "source", repos = NULL)
library(org.Tguttata.eg.db)
library(GOSemSim)
tgGO <- godata('org.Tguttata.eg.db', ont="BP")
Error in testForValidKeytype(x, keytype) : Invalid keytype: ENTREZID. Please use the keytypes method to see a listing of valid arguments.
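As the error message suggests, the key types that the generated package actually exposes can be listed with keytypes() from AnnotationDbi. The snippet below is only a minimal diagnostic sketch: it assumes the example package org.Tguttata.eg.db built above is installed, and the variable name kt is just for illustration. I have not pasted its output here.

```r
# Diagnostic sketch: assumes org.Tguttata.eg.db (built above) is installed.
library(AnnotationDbi)
library(org.Tguttata.eg.db)

# List the key types supported by the custom OrgDb object.
kt <- keytypes(org.Tguttata.eg.db)
print(kt)

# Check whether the key type godata() asked for is available,
# and whether the generic AnnotationForge key (GID) is.
print("ENTREZID" %in% kt)
print("GID" %in% kt)
```

If ENTREZID is missing from that list while GID is present, that would be consistent with the error above.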
Since I seem to have reached a dead end, I'm opening a new question addressed to you. Would it be possible for godata() to accept a user-specified key type instead of ENTREZID?
Many thanks,
Cei
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GOSemSim_2.12.1 org.Tguttata.eg.db_0.1 AnnotationForge_1.28.0 AnnotationDbi_1.48.0 IRanges_2.20.2
[6] S4Vectors_0.24.4 Biobase_2.46.0 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4 GO.db_3.10.0 XML_3.99-0.3 digest_0.6.25 bitops_1.0-6 DBI_1.1.0 RSQLite_2.2.0
[8] rlang_0.4.5 blob_1.2.1 vctrs_0.2.4 tools_3.6.2 bit64_0.9-7 RCurl_1.98-1.2 bit_1.1-15.2
[15] compiler_3.6.2 pkgconfig_2.0.3 memoise_1.1.0
[1]: https://support.bioconductor.org/p/130160/