Hi,
I am quite new to R but have already used clusterProfiler for KEGG enrichment analysis.
Now I would like to use it for GO enrichment analysis. Since the organism I am working with is not supported, I need to build my own GO mapping and tried to do that with the buildGOmap() function. This is where my problem lies, and I have been unable to fix it even after a couple of days of reading and trying.
Since I am working with a diatom, I get my gene-to-GO mapping from EnsemblProtist. I downloaded it, filtered it so that only Entrez gene IDs with a GO accession number remain, and saved it as a tab-delimited text file. Then I load it into R and proceed with the code as follows:
library(clusterProfiler)
gomap<-read.delim("c:/........../gomapping.txt", colClasses="character")
head(gomap)
go_accession entrezgene
1 GO:0016020 7194683
2 GO:0016021 7194683
3 GO:0003735 7194684
4 GO:0005622 7194684
5 GO:0005840 7194684
6 GO:0006412 7194684
buildGOmap(gomap)
No error is produced. After buildGOmap() is executed, it runs for a long time and then prints a long list to the console:
GO Gene
1 GO:0016020 7194683
2 GO:0016021 7194683
3 GO:0005575 7194683
.............
But it does nothing else. It does not produce the EG2ALLGO.rda etc. files in the working directory.
I tried many variations beyond the code shown above, but could not make it work.
I think my problem is in how I pass the data to the buildGOmap() function, but I just don't see the error.
It would be very nice if someone could help me and point me in the right direction.
Thanks
Marcus
I use RStudio
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] clusterProfiler_3.2.6 DOSE_3.0.9
loaded via a namespace (and not attached):
[1] igraph_1.0.1 Rcpp_0.12.8 AnnotationDbi_1.36.0 magrittr_1.5 splines_3.3.2
[6] BiocGenerics_0.20.0 IRanges_2.8.1 munsell_0.4.3 BiocParallel_1.8.1 colorspace_1.3-1
[11] fastmatch_1.0-4 stringr_1.1.0 plyr_1.8.4 tools_3.3.2 parallel_3.3.2
[16] grid_3.3.2 Biobase_2.34.0 data.table_1.9.6 gtable_0.2.0 DBI_0.5-1
[21] assertthat_0.1 lazyeval_0.2.0 tibble_1.2 GOSemSim_2.0.2 gridExtra_2.2.1
[26] tidyr_0.6.0 reshape2_1.4.2 DO.db_2.9 ggplot2_2.2.0 S4Vectors_0.12.0
[31] qvalue_2.6.0 RSQLite_1.0.0 stringi_1.1.2 fgsea_1.0.2 GO.db_3.4.0
[36] scales_0.4.1 stats4_3.3.2 chron_2.3-47
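The buildGOmap -> enrichGO workflow has been changed to buildGOmap (optional, and only needed for GO) -> enricher. buildGOmap() now returns the mapping as a data frame instead of writing .rda files, which is why you only see it printed to the console.
Unfortunately, the previous workflow is no longer supported.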
The vignette you mentioned is pretty old. The last version that supported this workflow was in BioC 2.13, https://bioconductor.org/packages/2.13/bioc/html/clusterProfiler.html and it was removed in BioC 2.14.
The go2ont and go2term helper functions may help you to separate the GO sub-ontologies and prepare the TERM2NAME data.frame, see https://guangchuangyu.github.io/clusterProfiler/#useful-utilities
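A minimal sketch of the buildGOmap -> enricher workflow described above could look like the following. It rebuilds the six rows shown in the question inline so that it runs without the input file. The names my_genes, term2gene and term2name are illustrative and not part of clusterProfiler, and minGSSize = 1 is set only because the toy table is tiny. It assumes clusterProfiler and GO.db are installed and that buildGOmap() takes the GO ID in the first column and the gene ID in the second, as in the question; check ?buildGOmap for your version.

```r
library(clusterProfiler)

# The six rows printed by head(gomap) in the question, rebuilt inline.
# Columns are named GO/Gene here to match the buildGOmap() output shown above.
gomap <- data.frame(
  GO   = c("GO:0016020", "GO:0016021", "GO:0003735",
           "GO:0005622", "GO:0005840", "GO:0006412"),
  Gene = c("7194683", "7194683", "7194684",
           "7194684", "7194684", "7194684"),
  stringsAsFactors = FALSE
)

# Keep the returned data frame instead of letting it print to the console.
# It holds the direct annotations plus the ancestor GO terms.
term2gene <- buildGOmap(gomap)

# Look up readable term names for the GO IDs (first column of term2gene).
term2name <- go2term(unique(as.character(term2gene[, 1])))

# Illustrative gene list; in a real analysis this is your set of
# genes of interest (for example, differentially expressed genes).
my_genes <- c("7194683", "7194684")

# Generic enrichment test against the custom mapping.
# minGSSize = 1 is an assumption made for this toy-sized example only.
res <- enricher(my_genes,
                TERM2GENE = term2gene,
                TERM2NAME = term2name,
                minGSSize = 1)
```

To analyse one sub-ontology at a time, go2ont() can be called on the same GO IDs and term2gene subset to the matching rows before passing it to enricher().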