Question: clusterProfiler: Which GO version is used behind the scenes?
12 months ago, ZheFrench wrote:

I'm using clusterProfiler v3.2.14 for GO molecular function in human. 

The background size is 16309.

The PANTHER web tool gives a size of 21002. Annotation Version and Release Date: GO Ontology database, released 2017-08-14.

I was wondering which version of the GO database is used in clusterProfiler. Why are these numbers different?

I'm using something like this:

library(biomaRt)
library(clusterProfiler)

# Connect to the Ensembl human gene dataset
edb <- useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "")

# Map the input Ensembl gene IDs to Entrez IDs and basic gene annotation
gene_infos <- getBM(attributes = c('ensembl_gene_id', 'hgnc_symbol', 'gene_biotype', 'chromosome_name',
                                   'start_position', 'end_position', 'strand', 'entrezgene'),
                    values = data[, opt$column], filters = 'ensembl_gene_id', mart = edb)

entrez_id  <- gene_infos$entrezgene
ensembl_id <- gene_infos$ensembl_gene_id

# GO molecular function enrichment on the Entrez IDs
go_mf <- enrichGO(gene = entrez_id, OrgDb =, ont = "MF", pvalueCutoff = 0.01,
                  pAdjustMethod = "BH", qvalueCutoff = 0.05, readable = TRUE)




12 months ago, Guido Hooiveld (Wageningen University, Wageningen, the Netherlands) wrote:

AFAIK, clusterProfiler uses the GO information available in the GO.db package under the hood. This GO annotation database is updated twice a year, before each new Bioconductor release. Assuming you are using the latest Bioconductor release (i.e. 3.5), the GO data was collected on 29 March 2017 (GO.db version 3.4.1).

> library(clusterProfiler)
> library(GO.db)

> GO.db
GODb object:
| GOSOURCENAME: Gene Ontology
| GOSOURCEDATE: 2017-Mar29
| Db type: GODb
| package: AnnotationDbi

Please see: help('select') for usage information


> sessionInfo()
R version 3.4.1 Patched (2017-08-27 r73149)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] GO.db_3.4.1           AnnotationDbi_1.38.2  IRanges_2.10.3       
[4] S4Vectors_0.14.3      Biobase_2.36.2        BiocGenerics_0.22.0  
[7] clusterProfiler_3.4.4 DOSE_3.2.0           

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12        compiler_3.4.1      plyr_1.8.4         
 [4] tools_3.4.1         digest_0.6.12       bit_1.1-12         
 [7] RSQLite_2.0         memoise_1.1.0       tibble_1.3.4       
[10] gtable_0.2.0        pkgconfig_2.0.1     rlang_0.1.2        
[13] fastmatch_1.1-0     igraph_1.1.2        DBI_0.7            
[16] rvcheck_0.0.9       fgsea_1.2.1         gridExtra_2.2.1    
[19] stringr_1.2.0       bit64_0.9-7         grid_3.4.1         
[22] glue_1.1.1          qvalue_2.8.0        data.table_1.10.4  
[25] BiocParallel_1.10.1 GOSemSim_2.2.0      purrr_0.2.3        
[28] tidyr_0.7.1         ggplot2_2.2.1       DO.db_2.9          
[31] reshape2_1.4.2      blob_1.1.0          magrittr_1.5       
[34] splines_3.4.1       scales_0.5.0        colorspace_1.3-2   
[37] stringi_1.1.5       lazyeval_0.2.0      munsell_0.4.3      
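
The source date can also be read programmatically rather than from the printed summary. The following is only a minimal sketch: it assumes the name/value data frame that metadata() returns for AnnotationDbi objects, meta_value() is a hypothetical helper (not part of any package), and org.Hs.eg.db is assumed to be the human OrgDb in use.

```r
# Hypothetical helper: look up one field in the name/value data frame
# that metadata() returns for annotation objects such as GO.db.
meta_value <- function(meta, field) {
  hit <- meta$value[meta$name == field]
  if (length(hit) == 0) NA_character_ else as.character(hit[1])
}

# Only query the real databases if they are installed.
if (requireNamespace("GO.db", quietly = TRUE)) {
  library(GO.db)
  cat("GO.db source date:", meta_value(metadata(GO.db), "GOSOURCEDATE"), "\n")
}

# Assumption: org.Hs.eg.db is the OrgDb passed to enrichGO for human data.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(org.Hs.eg.db)
  cat("org.Hs.eg.db GO source date:",
      meta_value(metadata(org.Hs.eg.db), "GOSOURCEDATE"), "\n")
}
```

Comparing the two dates against the release date reported by the web tool is a quick first check when background sizes disagree.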




By the way, same question for Reactome and KEGG? I think enrichKEGG uses the latest remote version when use_internal_data=FALSE.

Is there a way to upgrade these annotations without upgrading R/Bioconductor? I'm stuck on R 3.3.1 and can't upgrade Bioconductor without reinstalling a more recent version of R.
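
For reference, the annotation releases that are actually installed can be listed from R, which helps when judging how old a setup is. This is a minimal sketch: installed_version() is a hypothetical helper, and the package list (reactome.db as the annotation package used by ReactomePA, org.Hs.eg.db for human) is an assumption about what the analysis relies on.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not installed (packageVersion() errors in that case).
installed_version <- function(pkg) {
  tryCatch(as.character(utils::packageVersion(pkg)),
           error = function(e) NA_character_)
}

# Assumed set of packages relevant to GO and Reactome enrichment.
pkgs <- c("clusterProfiler", "GO.db", "org.Hs.eg.db", "reactome.db")
versions <- vapply(pkgs, installed_version, character(1))
print(versions)
```

Packages reported as NA are simply not installed in the current library.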


written 12 months ago by ZheFrench

You can always install whatever version of a package you want - that's the beauty of R and Open Source software in general.

However, do note that we don't support anything but the release version of R/BioC, which means if you are running some non-standard configuration and you have problems, it's on you to fix. If you post a question here with a sessionInfo output that indicates you are mixing and matching, the first response will be to tell you to install the latest version of R/BioC.
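
As an illustration of installing one specific version, here is a rough sketch for the GO.db 3.4.1 release mentioned above. The helper annotation_tarball_url() is hypothetical, and the URL layout is an assumption about how the Bioconductor annotation repository is organised; whether that version installs cleanly on an older R is not guaranteed, and the result is exactly the kind of mixed setup described above.

```r
# Hypothetical helper: build the source-tarball URL of an annotation package
# for a given Bioconductor release. The URL pattern is an assumption.
annotation_tarball_url <- function(pkg, version, bioc_release) {
  sprintf(
    "https://bioconductor.org/packages/%s/data/annotation/src/contrib/%s_%s.tar.gz",
    bioc_release, pkg, version
  )
}

url <- annotation_tarball_url("GO.db", "3.4.1", "3.5")
print(url)

# The install itself is left commented out, since it changes the library
# and produces an unsupported mix of package versions.
# install.packages(url, repos = NULL, type = "source")
```

After such an install, sessionInfo() shows the version that is actually loaded.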


written 12 months ago by James W. MacDonald