Function simplify fails when argument ont is set to "ALL" in enrichGO function in package clusterProfiler
Last seen 5.1 years ago

Dear Bioconductor users and package maintainer Guangchuang,

I was running a GO enrichment analysis on a set of DEGs using the enrichGO and simplify functions from the R/Bioconductor package clusterProfiler, and got an error message from simplify when ont = "ALL" was used in enrichGO. The simplify function works fine when an individual GO subcategory ("BP", "MF", or "CC") is used. I reproduced the error with a built-in data set (gcSample); please see below. I was wondering whether anyone has had the same experience, or whether I did something wrong. Thanks for the help!




> library(clusterProfiler)
Loading required package: DOSE

DOSE v3.4.0  For help:

If you use DOSE in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis. Bioinformatics 2015, 31(4):608-609

clusterProfiler v3.6.0  For help:

If you use clusterProfiler in published research, please cite:
Guangchuang Yu., Li-Gen Wang, Yanyan Han, Qing-Yu He. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology. 2012, 16(5):284-287.
> library(
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append,, cbind, colMeans, colnames,
    colSums,, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
    mget, order, paste, pmax,, pmin,, Position, rank,
    rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:


> data(gcSample)
> # use only BP
> x <- enrichGO(gcSample[[2]], ont = 'BP', OrgDb = '')
> y <- simplify(x, measure = 'Wang', semData = NULL)
> # use ALL
> x <- enrichGO(gcSample[[2]], ont = 'ALL', OrgDb = '')
> y <- simplify(x, measure = 'Wang', semData = NULL)
Error in .local(x, ...) :
  simplify only applied to output from enrichGO...

> # session info
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1]    AnnotationDbi_1.40.0  IRanges_2.12.0
[4] S4Vectors_0.16.0      Biobase_2.38.0        BiocGenerics_0.24.0
[7] clusterProfiler_3.6.0 DOSE_3.4.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14        plyr_1.8.4          compiler_3.4.3
 [4] tools_3.4.3         digest_0.6.13       bit_1.1-12
 [7] lubridate_1.7.1     RSQLite_2.0         memoise_1.1.0
[10] tibble_1.3.4        gtable_0.2.0        pkgconfig_2.0.1
[13] rlang_0.1.4         igraph_1.1.2        fastmatch_1.1-0
[16] DBI_0.7             rvcheck_0.0.9       gridExtra_2.3
[19] fgsea_1.4.0         stringr_1.2.0       tidyselect_0.2.3
[22] autoinst_0.0.0.9000 bit64_0.9-7         grid_3.4.3
[25] glue_1.2.0          qvalue_2.10.0       data.table_1.10.4-3
[28] BiocParallel_1.12.0 GOSemSim_2.4.0      rematch2_2.0.1
[31] purrr_0.2.4         tidyr_0.7.2         reshape2_1.4.3
[34] GO.db_3.5.0         DO.db_2.9           ggplot2_2.2.1
[37] blob_1.1.0          magrittr_1.5        splines_3.4.3
[40] scales_0.5.0        colorspace_1.3-2    stringi_1.1.6
[43] lazyeval_0.2.1      munsell_0.4.3

clusterProfiler GO GO enrichment genesetenrichment • 2.5k views
Guangchuang Yu ★ 1.2k
Last seen 3 months ago
China/Guangzhou/Southern Medical Univer…

simplify checks whether res@ont %in% c("MF", "BP", "CC"), so it does not currently support 'ALL'.

Support may be added in a future release.
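
Until then, one possible workaround is to run enrichGO and simplify once per ontology and combine the tables afterwards. The following is only a sketch, not an official recommendation: it assumes org.Hs.eg.db as the OrgDb (the value is missing from the post above), and the names onts, simplified and combined are made up for illustration. Note that p-values are adjusted within each ontology here, so they may differ from what ont = "ALL" reports.

```r
library(clusterProfiler)
library(org.Hs.eg.db)  # assumption: the OrgDb is not shown in the original post

data(gcSample)
genes <- gcSample[[2]]

# simplify() only accepts results for a single ontology,
# so run the enrichment separately for each one.
onts <- c("BP", "MF", "CC")
simplified <- lapply(onts, function(ont) {
  res <- enrichGO(genes, OrgDb = org.Hs.eg.db, ont = ont)
  simplify(res, measure = "Wang", semData = NULL)
})
names(simplified) <- onts

# Combine the per-ontology tables and record which ontology each row came from.
combined <- do.call(rbind, lapply(onts, function(ont) {
  df <- as.data.frame(simplified[[ont]])
  df$ONTOLOGY <- rep(ont, nrow(df))
  df
}))
```

The individual enrichResult objects stay available in simplified (for example simplified$BP) for plotting, while combined is a plain data.frame intended only for inspection or export.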


FWIW this problem similarly "stumped" me for a little while.... I do hope it is resolved in the future... tx 


Sorry to post my question here; it may not be related to the topic. I am running enrichment analyses with clusterProfiler and WebGestaltR. With clusterProfiler, which is based on GO.db:

x <- unique(unlist(as.list(org.Bt.egGO2ALLEGS)))

x has 5586 entries. With WebGestaltR:

enrichD_BP <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Biological_Process_noRedundant")
geneSet_BP <- enrichD_BP$geneSet
length(unique(geneSet_BP$gene)) # 9011

enrichD_CC <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Cellular_Component_noRedundant")
geneSet_CC <- enrichD_CC$geneSet
length(unique(geneSet_CC$gene)) # 6224

enrichD_MF <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Molecular_Function_noRedundant")
geneSet_MF <- enrichD_MF$geneSet
length(unique(geneSet_MF$gene)) # 7960

So WebGestaltR covers more genes than clusterProfiler. Which of the two is more up to date and consistent with the online GO database? And how can I get the gene sets from the online GO database directly? This question has puzzled me for days, because I think both packages are powerful and should give the same results.

