clusterProfiler: Which GO version is used behind the scenes?
ZheFrench ▴ 60
@zhefrench-11689
France

I'm using clusterProfiler v3.2.14 for GO molecular function in human. 

The background size is 16309.

The PANTHER web tool gives a size of 21002 (Annotation Version and Release Date: GO Ontology database Released 2017-08-14).

I was wondering which version of the GO database is used in clusterProfiler. Why are these numbers different?

I'm using something like this:

library(biomaRt)
library(clusterProfiler)
library(org.Hs.eg.db)

edb <- useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl", host = "jul2016.archive.ensembl.org")

gene_infos <- getBM(attributes = c('ensembl_gene_id', 'hgnc_symbol', 'gene_biotype', 'chromosome_name', 'start_position', 'end_position', 'strand', 'entrezgene'), values = data[, opt$column], filters = 'ensembl_gene_id', mart = edb)

# SET INPUT LIST
entrez_id  <- gene_infos$entrezgene
ensembl_id <- gene_infos$ensembl_gene_id

go_mf <- enrichGO(gene = entrez_id, OrgDb = org.Hs.eg.db, ont = "MF", pvalueCutoff = 0.01, pAdjustMethod = "BH", qvalueCutoff = 0.05, readable = TRUE)

 

 


 

 
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …

As far as I know, clusterProfiler uses, under the hood, the GO information available in the GO.db package. This GO annotation database is updated twice a year, before each new Bioconductor release. Assuming you are using the latest Bioconductor release (i.e. 3.5), the GO data was collected on 29 March 2017 (GO.db version 3.4.1).

> library(clusterProfiler)
> library(GO.db)

> GO.db
GODb object:
| GOSOURCENAME: Gene Ontology
| GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
| GOSOURCEDATE: 2017-Mar29
| Db type: GODb
| package: AnnotationDbi
| DBSCHEMA: GO_DB
| GOEGSOURCEDATE: 2017-Mar29
| GOEGSOURCENAME: Entrez Gene
| GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| DBSCHEMAVERSION: 2.1

Please see: help('select') for usage information
>
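The same check can be scripted, and it is worth running on the OrgDb package as well: as far as I understand, the gene-to-GO mappings that enrichGO tests come from the OrgDb object passed to it (org.Hs.eg.db in the question), so its source dates matter too. Below is a minimal sketch; source_dates() is a hypothetical helper written for this example, not part of any package, and it assumes metadata() returns a data.frame with name and value columns.

```r
# Hypothetical helper (not from any package): keep only the metadata rows
# whose name mentions a source date.
source_dates <- function(md) {
  md[grepl("SOURCEDATE", md$name, fixed = TRUE), c("name", "value"), drop = FALSE]
}

# Only run the lookup if both annotation packages are installed.
if (requireNamespace("GO.db", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(GO.db)
  library(org.Hs.eg.db)
  print(source_dates(metadata(GO.db)))         # GO ontology snapshot
  print(source_dates(metadata(org.Hs.eg.db)))  # gene-to-GO mapping snapshot
  print(packageVersion("GO.db"))
  print(packageVersion("org.Hs.eg.db"))
}
```

The dates printed depend on the package versions installed, so they will only match the output above on the same Bioconductor release.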

 

> sessionInfo()
R version 3.4.1 Patched (2017-08-27 r73149)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] GO.db_3.4.1           AnnotationDbi_1.38.2  IRanges_2.10.3       
[4] S4Vectors_0.14.3      Biobase_2.36.2        BiocGenerics_0.22.0  
[7] clusterProfiler_3.4.4 DOSE_3.2.0           

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12        compiler_3.4.1      plyr_1.8.4         
 [4] tools_3.4.1         digest_0.6.12       bit_1.1-12         
 [7] RSQLite_2.0         memoise_1.1.0       tibble_1.3.4       
[10] gtable_0.2.0        pkgconfig_2.0.1     rlang_0.1.2        
[13] fastmatch_1.1-0     igraph_1.1.2        DBI_0.7            
[16] rvcheck_0.0.9       fgsea_1.2.1         gridExtra_2.2.1    
[19] stringr_1.2.0       bit64_0.9-7         grid_3.4.1         
[22] glue_1.1.1          qvalue_2.8.0        data.table_1.10.4  
[25] BiocParallel_1.10.1 GOSemSim_2.2.0      purrr_0.2.3        
[28] tidyr_0.7.1         ggplot2_2.2.1       DO.db_2.9          
[31] reshape2_1.4.2      blob_1.1.0          magrittr_1.5       
[34] splines_3.4.1       scales_0.5.0        colorspace_1.3-2   
[37] stringi_1.1.5       lazyeval_0.2.0      munsell_0.4.3      
>
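On the difference in background size: one plausible contributor (an assumption on my part, not something verified in this thread) is that, when no universe is supplied, enrichGO takes as background the genes that carry at least one annotation in the chosen ontology in the OrgDb, while PANTHER counts against its own reference list and a newer GO release. The sketch below counts Entrez genes with at least one MF annotation in org.Hs.eg.db; count_annotated_genes() is a hypothetical helper written for this example.

```r
# Hypothetical helper: number of distinct genes with at least one annotation
# in the given ontology ("MF", "BP" or "CC").
count_annotated_genes <- function(go_map, ontology = "MF") {
  keep <- !is.na(go_map$ONTOLOGYALL) & go_map$ONTOLOGYALL == ontology
  length(unique(go_map$ENTREZID[keep]))
}

# Only query the annotation package if it is installed.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(org.Hs.eg.db)
  # GOALL includes annotations inherited from child terms.
  go_map <- AnnotationDbi::select(org.Hs.eg.db,
                                  keys = keys(org.Hs.eg.db, keytype = "ENTREZID"),
                                  columns = c("GOALL", "ONTOLOGYALL"),
                                  keytype = "ENTREZID")
  print(count_annotated_genes(go_map, "MF"))
}
```

The count will vary with the org.Hs.eg.db version, so it need not equal 16309 exactly. Comparing it with the background reported in the BgRatio column of the enrichGO result is a quick way to see whether this explanation holds for a given installation.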

 

 


By the way, the same question applies to Reactome and KEGG. I think enrichKEGG uses the latest remote version when use_internal_data=FALSE.

Is there a way to upgrade these annotations without upgrading R/Bioconductor? I'm stuck on R 3.3.1 and can't upgrade Bioconductor without reinstalling a more recent version of R.

 


You can always install whatever version of a package you want - that's the beauty of R and Open Source software in general.

However, do note that we don't support anything but the release version of R/BioC, which means if you are running some non-standard configuration and you have problems, it's on you to fix. If you post a question here with a sessionInfo output that indicates you are mixing and matching, the first response will be to tell you to install the latest version of R/BioC.

 
