Is there any way to reduce the redundancy of a set of GO terms when working with a non-reference organism? I mean something similar to REVIGO. The problem with REVIGO is that it limits the number of GO terms that can be submitted.
I have read the documentation of GOSemSim, and it seems that it can work only with supported organisms (though I expected that it might have a function like buildGOmap from clusterProfiler). I have tried goProfiles too, but I found no way to make it work without an orgPackage option (even though the examples in the vignettes neither use nor mention it).
Edit: I was finally able to make goProfiles work by calling basicProfile() with the parameter idType = "GOTermsFrame". This way, it ignores the orgPackage and anotPackage parameters, since you provide the GeneID <--> GOID mappings that goProfiles needs.
Any advice?
Thank you in advance
P.S.: I have already done an enrichment analysis (with topGO); that is not what I am asking about in this post.
GOSemSim does support non-reference organisms. The organism parameter is only used by the IC-based methods, which depend on the information content (IC) data shipped with the GOSemSim package. These IC data are species-specific and pre-calculated. I may add support for calculating IC on the fly, and with it support for non-reference organisms, in the future.
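To illustrate why IC values are species-specific, here is a minimal Python sketch (not GOSemSim's own code, which is in R). It assumes one simplified definition of IC, the negative log of the fraction of annotated genes that carry a term or any of its descendants. The term names, gene names, and the information_content helper are all hypothetical.

```python
import math
from collections import Counter

def information_content(annotations, ancestors):
    """Return IC per term, assuming IC(t) = -log(fraction of genes annotated to t).

    annotations: gene -> set of directly annotated terms (hypothetical toy data)
    ancestors:   term -> set of all its ancestor terms
    """
    counts = Counter()
    for terms in annotations.values():
        # A gene annotated to a term is implicitly annotated to every ancestor.
        expanded = set()
        for t in terms:
            expanded.add(t)
            expanded |= ancestors[t]
        counts.update(expanded)
    total = len(annotations)
    return {t: -math.log(n / total) for t, n in counts.items()}

# Toy ontology: root <- a <- b (made-up names, not real GO IDs).
ancestors = {"root": set(), "a": {"root"}, "b": {"a", "root"}}

# Two hypothetical species with different annotation frequencies.
species1 = {"g1": {"b"}, "g2": {"a"}, "g3": {"a"}, "g4": {"root"}}
species2 = {"g1": {"b"}, "g2": {"b"}, "g3": {"b"}, "g4": {"a"}}

ic1 = information_content(species1, ancestors)
ic2 = information_content(species2, ancestors)
print(ic1["b"], ic2["b"])  # same term, different IC in each species
```

Because the counts come from each species' annotations, the same term gets a different IC in each species, which is why these methods need organism-specific data.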
Wang's measure uses only the structure of the GO DAG, so the organism parameter is ignored by the goSim and mgoSim functions. For gene-based calculations, the organism parameter is used to map genes to GO terms.
Since you want to remove redundancy from an enrichment result, there is no organism restriction if you use Wang's measure as implemented in GOSemSim.
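As a rough illustration of why a graph-based measure needs no organism data, here is a minimal Python sketch of Wang's measure on a toy DAG. It is not the GOSemSim implementation; the term names are made up, and the edge weights (0.8 for is_a, 0.6 for part_of) are the values commonly quoted for this method, assumed here.

```python
# Toy DAG: term -> list of (parent, relation). Hypothetical terms, not real GO IDs.
PARENTS = {
    "root": [],
    "a": [("root", "is_a")],
    "b": [("a", "is_a")],
    "c": [("a", "is_a")],
    "d": [("b", "part_of")],
}
# Assumed semantic contribution factors per relation type.
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}

def s_values(term, parents=PARENTS, weights=WEIGHTS):
    """S-values of `term` and all its ancestors.

    The term itself scores 1; each ancestor gets the best (maximum)
    product of edge weights along any path from the term up to it.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        t = stack.pop()
        for parent, rel in parents[t]:
            candidate = s[t] * weights[rel]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)  # re-propagate the improved value upward
    return s

def wang_sim(a, b):
    """Similarity = summed S-values over shared ancestors / total S-values."""
    sa, sb = s_values(a), s_values(b)
    shared = set(sa) & set(sb)
    numerator = sum(sa[t] + sb[t] for t in shared)
    return numerator / (sum(sa.values()) + sum(sb.values()))

print(wang_sim("b", "c"))  # siblings under "a"
print(wang_sim("b", "d"))  # parent and child: scores higher than the siblings
```

Only the DAG enters the calculation; no gene annotations are consulted at any point.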
Best,
Guangchuang
For your information, I will implement a function, simplify, in the clusterProfiler package to remove redundant GO terms from enrichment results.
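One plausible way to remove redundant terms, given any pairwise term similarity, is a greedy filter: walk through the terms from most to least significant and keep a term only if it is not too similar to one already kept. The Python sketch below shows that idea only. It is an assumption for illustration, not necessarily how the planned function will work, and the term IDs, p-values, similarity values, and the 0.7 cutoff are all made up.

```python
def remove_redundant(pvalues, sim, cutoff=0.7):
    """Greedy redundancy filter (illustrative sketch).

    pvalues: term -> p-value from an enrichment analysis
    sim:     function (term, term) -> similarity in [0, 1]
    cutoff:  terms at least this similar to a kept term are dropped
    """
    kept = []
    # Most significant terms first, so each group keeps its best representative.
    for term in sorted(pvalues, key=pvalues.get):
        if all(sim(term, k) < cutoff for k in kept):
            kept.append(term)
    return kept

# Hypothetical enrichment result and pairwise similarities.
pvalues = {"GO:A": 0.001, "GO:B": 0.01, "GO:C": 0.02}
SIM = {
    frozenset(("GO:A", "GO:B")): 0.9,
    frozenset(("GO:A", "GO:C")): 0.2,
    frozenset(("GO:B", "GO:C")): 0.3,
}

def toy_sim(x, y):
    """Look up the similarity of two terms in the toy table above."""
    return 1.0 if x == y else SIM[frozenset((x, y))]

print(remove_redundant(pvalues, toy_sim))  # ['GO:A', 'GO:C']
```

Here GO:B is dropped because it is very similar to the more significant GO:A, while GO:C is kept as a distinct term.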