Hi everyone,
I am new to R and want to use clusterProfiler for GO term enrichment analysis. I am working with bacteria, so I need to build my own GO mapping, and I tried to do that with the buildGOmap() function. I get my gene-to-GO mapping from PseudoCAP: I downloaded it and selected only the GO accession number and the KEGG ID (the Locus.Tag column).
When I feed this to the buildGOmap function, I get the error:
Error in names(object) <- nm : 'names' attribute [15883] must be the same length as the vector [0]
I have been unable to fix this error, even after a couple of days of reading and trying. Has anyone had a similar problem? What am I doing wrong?
Thanks!
> Pa_GO <- read.csv("~/Experiments/20230724-Biofilm-Assay082-Proteomics/AnalysisR/gene_ontology_csv.csv")
> Pa_GOterms <- Pa_GO[c(5,1)]
> head(Pa_GOterms)
   Accession Locus.Tag
1 GO:0005524    PA0001
2 GO:0006270    PA0001
3 GO:0006275    PA0001
4 GO:0016887    PA0001
5 GO:0016887    PA0001
6 GO:0006260    PA0001
> Pa_GOMap <- buildGOmap(Pa_GOterms)
Error in names(object) <- nm :
'names' attribute [15883] must be the same length as the vector [0]
> sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_Germany.utf8 LC_CTYPE=English_Germany.utf8 LC_MONETARY=English_Germany.utf8
[4] LC_NUMERIC=C LC_TIME=English_Germany.utf8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] enrichplot_1.22.0 readxl_1.4.3 clusterProfiler_4.10.0 scales_1.3.0 ggforce_0.4.1
[6] ggplot2_3.4.4
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.15.0 jsonlite_1.8.8 magrittr_2.0.3
[5] farver_2.1.1 fs_1.6.3 zlibbioc_1.48.0 vctrs_0.6.5
[9] memoise_2.0.1 RCurl_1.98-1.14 ggtree_3.10.0 htmltools_0.5.7
[13] AnnotationHub_3.10.0 curl_5.2.0 cellranger_1.1.0 gridGraphics_0.5-1
[17] plyr_1.8.9 cachem_1.0.8 igraph_2.0.1.1 mime_0.12
[21] lifecycle_1.0.4 pkgconfig_2.0.3 Matrix_1.6-1.1 R6_2.5.1
[25] fastmap_1.1.1 gson_0.1.0 GenomeInfoDbData_1.2.11 shiny_1.8.0
[29] digest_0.6.34 aplot_0.2.2 colorspace_2.1-0 patchwork_1.2.0
[33] AnnotationDbi_1.64.1 S4Vectors_0.40.2 RSQLite_2.3.5 filelock_1.0.3
[37] fansi_1.0.6 httr_1.4.7 polyclip_1.10-6 compiler_4.3.2
[41] remotes_2.4.2.1 bit64_4.0.5 withr_3.0.0 BiocParallel_1.36.0
[45] viridis_0.6.5 DBI_1.2.1 MASS_7.3-60 rappdirs_0.3.3
[49] HDO.db_0.99.1 tools_4.3.2 ape_5.7-1 scatterpie_0.2.1
[53] interactiveDisplayBase_1.40.0 httpuv_1.6.14 glue_1.7.0 promises_1.2.1
[57] nlme_3.1-163 GOSemSim_2.28.1 gridtext_0.1.5 grid_4.3.2
[61] shadowtext_0.1.3 reshape2_1.4.4 fgsea_1.28.0 generics_0.1.3
[65] gtable_0.3.4 tidyr_1.3.1 data.table_1.15.0 tidygraph_1.3.1
[69] xml2_1.3.6 utf8_1.2.4 XVector_0.42.0 BiocGenerics_0.48.1
[73] ggrepel_0.9.5 BiocVersion_3.18.1 pillar_1.9.0 stringr_1.5.1
[77] yulab.utils_0.1.4 later_1.3.2 splines_4.3.2 dplyr_1.1.4
[81] ggtext_0.1.2 tweenr_2.0.2 BiocFileCache_2.10.1 treeio_1.26.0
[85] lattice_0.21-9 bit_4.0.5 tidyselect_1.2.0 GO.db_3.18.0
[89] Biostrings_2.70.2 gridExtra_2.3 IRanges_2.36.0 stats4_4.3.2
[93] graphlayouts_1.1.0 Biobase_2.62.0 stringi_1.8.3 lazyeval_0.2.2
[97] ggfun_0.1.4 yaml_2.3.8 codetools_0.2-19 ggraph_2.1.0
[101] tibble_3.2.1 qvalue_2.34.0 BiocManager_1.30.22 ggplotify_0.1.2
[105] cli_3.6.2 xtable_1.8-4 munsell_0.5.0 Rcpp_1.0.12
[109] GenomeInfoDb_1.38.5 dbplyr_2.4.0 png_0.1-8 parallel_4.3.2
[113] ellipsis_0.3.2 blob_1.2.4 DOSE_3.28.2 bitops_1.0-7
[117] viridisLite_0.4.2 tidytree_0.4.6 purrr_1.0.2 crayon_1.5.2
[121] rlang_1.1.3 cowplot_1.1.3 fastmatch_1.1-4 KEGGREST_1.42.0
Thank you so much for your help and for contacting Guangchuang Yu! It solved the issue with the buildGOmap function.
Curiously, when I fed the result to the enrich function, it wasn't able to map the genes to the IDs. It only worked after I swapped the columns of Pa_GOMap, making it a true "TERM2GENE" ("GO" in column 1 and "Gene ID" in column 2).
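In case it helps someone else, here is a minimal sketch of the column swap in base R. The toy data frame gene2go and its column names Gene and GO are made up for illustration (the values are taken from the head() output above); they stand in for a mapping that has the gene in column 1 and the GO term in column 2. I am assuming the enrich function here is enricher(), whose TERM2GENE argument expects the term in column 1 and the gene in column 2.

```r
# Toy gene-to-GO table: gene in column 1, GO term in column 2.
# The name "gene2go" and its column names are hypothetical.
gene2go <- data.frame(
  Gene = c("PA0001", "PA0001", "PA0001", "PA0001"),
  GO   = c("GO:0005524", "GO:0006270", "GO:0016887", "GO:0016887"),
  stringsAsFactors = FALSE
)

# Swap the columns so the term comes first, and drop duplicated pairs.
term2gene <- unique(gene2go[, c("GO", "Gene")])

print(term2gene)

# With clusterProfiler loaded, the swapped table would then be passed on,
# e.g. (my_genes is a hypothetical character vector of locus tags):
# res <- enricher(gene = my_genes, TERM2GENE = term2gene)
```

Selecting the columns by name rather than by position keeps the swap from silently reversing again if the input order changes.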
Thanks again!