Question: Using the plotting functions of clusterProfiler / enrichplot with enrichment results from other programs
5 months ago by
charles.foster40 wrote:

Hi all,

I'm a big fan of the many plots produced by the clusterProfiler and enrichplot packages. Functions such as emapplot() require the enrichment results to have been generated with clusterProfiler (or, I think, a few related packages). However, I've carried out my GO/KEGG enrichment analyses with GOSeq so that I can account for potential sequence length bias.

Is there a way to use the various plotting functions with data from other programs like GOSeq? Can we convert the results into the format required for emapplot() etc.? The GOSeq results, at least, are just stored in a data frame with the usual columns (GO term, adjusted p-value, number of genes in each category, etc.).

Alternatively, suggestions for other useful programs for plotting enrichment results would be appreciated.

Thanks!

Answer: Using the plotting functions of clusterProfiler / enrichplot with enrichment res
5 months ago by
charles.foster40 wrote:

In case anyone is wondering in the future, I managed to solve the problem. The solution is a bit fiddly, but at least it works.

First, load the clusterProfiler package, then read your GOSeq enrichment results into a data frame (here called results_df). The data frame needs to be formatted so that its columns match those of an enricher() output. I won't go into how to do that, as I'm sure others can figure it out, but the columns need to be:

colnames(results_df) <- c("ID","Description","GeneRatio","BgRatio","pvalue","p.adjust", "qvalue", "geneID","Count")
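For anyone who does want a starting point for that reformatting step, here is a minimal base-R sketch. It assumes the input has the columns goseq() normally returns (category, over_represented_pvalue, numDEInCat, numInCat, term, ontology). The mock values, the tiny gene vectors and the choice to reuse the BH-adjusted p-value as a placeholder qvalue are all assumptions made for illustration, not something clusterProfiler prescribes.

```r
# Mock goseq() output; the column names follow goseq, the values are made up.
goseq_res <- data.frame(
  category = c("GO:0000001", "GO:0000002"),
  over_represented_pvalue = c(0.001, 0.02),
  under_represented_pvalue = c(0.999, 0.98),
  numDEInCat = c(3L, 2L),
  numInCat = c(10L, 8L),
  term = c("term A", "term B"),
  ontology = c("BP", "BP"),
  stringsAsFactors = FALSE
)

# Hypothetical inputs: DE genes, enrichment universe, and DE genes per GO term.
DE_genes_vector <- c("g1", "g2", "g3", "g4")
universe_vector <- paste0("g", 1:20)
geneSets <- list("GO:0000001" = c("g1", "g2", "g3"),
                 "GO:0000002" = c("g1", "g4"))

# Adjust the over-representation p-values (BH is an assumption; use your own method).
padj <- p.adjust(goseq_res$over_represented_pvalue, method = "BH")

results_df <- data.frame(
  ID          = goseq_res$category,
  Description = goseq_res$term,
  # Ratios are stored as "k/n" strings in enricher() output.
  GeneRatio   = paste0(goseq_res$numDEInCat, "/", length(DE_genes_vector)),
  BgRatio     = paste0(goseq_res$numInCat, "/", length(universe_vector)),
  pvalue      = goseq_res$over_represented_pvalue,
  p.adjust    = padj,
  qvalue      = padj,  # placeholder: goseq does not report q-values
  # geneID holds the DE genes of each term, separated by "/".
  geneID      = unname(vapply(geneSets[goseq_res$category], paste,
                              FUN.VALUE = character(1), collapse = "/")),
  Count       = goseq_res$numDEInCat,
  stringsAsFactors = FALSE
)
rownames(results_df) <- results_df$ID
```

The denominators used here for GeneRatio and BgRatio are simply the lengths of the two gene vectors; if you want them to match what enricher() would report, it may be better to count only genes that actually carry an annotation.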

Next, make a new "enrichResult" object:

my_object <- new("enrichResult",
                 result = results_df,
                 pvalueCutoff = 0.05,
                 qvalueCutoff = 0.2,
                 organism = "UNKNOWN",
                 ontology = "UNKNOWN",
                 gene = DE_genes_vector,
                 keytype = "UNKNOWN",
                 universe = universe_vector,
                 gene2Symbol = character(0),
                 geneSets = geneSets)


where DE_genes_vector is a vector of the names of your DE genes, and universe_vector is a vector of the names of all genes that are annotated with GO terms (your enrichment universe). geneSets is a named list in which the names are the enriched GO terms and each element is the vector of DE genes annotated with that GO term.
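If you only have a gene-to-GO mapping on hand, universe_vector and geneSets can be derived from it roughly as follows. This is a sketch in base R: gene2cat (a named list of GO IDs per gene, similar in shape to what goseq::getgo() returns), enriched_terms and the gene names are hypothetical stand-ins for your own data.

```r
# Hypothetical gene -> GO term annotation (one list element per gene).
gene2cat <- list(
  g1 = c("GO:0000001", "GO:0000002"),
  g2 = "GO:0000001",
  g3 = "GO:0000001",
  g4 = "GO:0000002",
  g5 = "GO:0000003"
)
DE_genes_vector <- c("g1", "g2", "g3", "g4")
enriched_terms  <- c("GO:0000001", "GO:0000002")  # e.g. the ID column of results_df

# Universe: every gene that has at least one GO annotation.
universe_vector <- names(gene2cat)[lengths(gene2cat) > 0]

# Keep only the annotated DE genes, then flatten to one row per gene/term pair.
de_annot <- gene2cat[intersect(DE_genes_vector, names(gene2cat))]
long <- data.frame(
  gene = rep(names(de_annot), lengths(de_annot)),
  term = unlist(de_annot, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Invert the mapping: GO term -> DE genes, restricted to the enriched terms.
geneSets <- split(long$gene, long$term)[enriched_terms]
```

Joining each element of geneSets with "/" gives the strings expected in the geneID column of results_df, so the two stay consistent if both are built from the same mapping.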

Finally, you should be able to use my_object with all of the nice plotting functions, such as cnetplot(). To make sure, check the class of my_object:

class(my_object)
[1] "enrichResult"
attr(,"package")
[1] "DOSE"

Hope this helps someone else!