Using the plotting functions of clusterProfiler / enrichplot with enrichment results from other programs
@charlesfoster-17652
Last seen 4 months ago
Australia

Hi all,

I'm a big fan of the many plots produced by the clusterProfiler and enrichplot packages. Functions such as emapplot() require the enrichment results to have been generated by clusterProfiler (or a few other related packages, I think). However, I've carried out my GO/KEGG enrichment analyses using GOSeq so that I can account for any potential sequence length bias.

Is there a way to use the various plotting functions with data from other programs like GOSeq? Can we convert the results into the format required for emapplot() etc.? The results for GOSeq, at least, are just stored in a data frame with the usual columns (GO term, adjusted p-value, number of genes in each category, etc.).

Alternatively, suggestions for any other useful programs for plotting enrichment results would be appreciated.

Thanks!

clusterProfiler enrichplot • 958 views
@charlesfoster-17652

In case anyone is wondering in the future, I managed to solve the problem. The solution is a bit fiddly, but at least it works.

First, load the clusterProfiler package, then read your GOSeq enrichment results into a new data frame (in this case it's called results_df). You need to format the data frame so that its columns match those of an enricher() output. I won't go into how you do that as I'm sure others can figure it out, but the columns need to be:

colnames(results_df) <- c("ID","Description","GeneRatio","BgRatio","pvalue","p.adjust", "qvalue", "geneID","Count")
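
For anyone unsure about the reshaping step, here is a minimal base-R sketch of one way it might be done. It is an illustration under assumptions, not a tested recipe: goseq_res is a toy stand-in carrying a subset of the columns that goseq() returns, and de_genes, go2genes and n_universe are hypothetical inputs you would build from your own data. The qvalue column is simply filled with the BH-adjusted p-values as a placeholder.

```r
# Toy stand-in for goseq() output (subset of its columns)
goseq_res <- data.frame(
  category = c("GO:0000001", "GO:0000002"),
  over_represented_pvalue = c(0.001, 0.02),
  numDEInCat = c(3, 2),
  numInCat = c(5, 3),
  term = c("term A", "term B"),
  stringsAsFactors = FALSE
)

# Hypothetical inputs: DE genes, GO-to-gene mapping, size of the universe
de_genes <- c("g1", "g2", "g3", "g4")
go2genes <- list(
  "GO:0000001" = c("g1", "g2", "g3", "g5", "g6"),
  "GO:0000002" = c("g2", "g4", "g7")
)
n_universe <- 10

# DE genes annotated with each tested term, in the same order as goseq_res
de_in_term <- lapply(go2genes[goseq_res$category],
                     function(g) intersect(g, de_genes))

# BH-adjusted p-values; reused below as a placeholder for qvalue (assumption)
padj <- p.adjust(goseq_res$over_represented_pvalue, method = "BH")

results_df <- data.frame(
  ID = goseq_res$category,
  Description = goseq_res$term,
  GeneRatio = paste0(goseq_res$numDEInCat, "/", length(de_genes)),
  BgRatio = paste0(goseq_res$numInCat, "/", n_universe),
  pvalue = goseq_res$over_represented_pvalue,
  p.adjust = padj,
  qvalue = padj,
  geneID = unname(vapply(de_in_term, paste, character(1), collapse = "/")),
  Count = goseq_res$numDEInCat,
  row.names = goseq_res$category,  # term IDs as row names
  stringsAsFactors = FALSE
)
```

The ratios are stored as "k/n" strings and the genes for each term are joined with "/", which is the layout the column names above imply. You would normally also filter results_df down to the significant terms before plotting.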

Next, make a new "enrichResult" object:

my_object <- new("enrichResult",
    readable = FALSE,
    result = results_df,
    pvalueCutoff = 0.05,
    pAdjustMethod = "BH",
    qvalueCutoff = 0.2,
    organism = "UNKNOWN",
    ontology = "UNKNOWN",
    gene = DE_genes_vector,
    keytype = "UNKNOWN",
    universe = universe_vector,
    gene2Symbol = character(0),
    geneSets = geneSets)

where DE_genes_vector is a vector of the names of your DE genes, and universe_vector is a vector of the names of all genes that are annotated with GO terms (your enrichment universe). geneSets is a named list in which each name is an enriched GO term and each element is a vector of the DE genes annotated with that term.
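
As a rough illustration of how those three inputs could be assembled, here is a small base-R sketch. The gene2go table (one row per gene/GO term pair) and the enriched_terms vector are hypothetical; substitute whatever annotation mapping and list of significant terms you used for your own enrichment analysis.

```r
# Hypothetical long-format annotation: one row per (gene, GO term) pair
gene2go <- data.frame(
  gene = c("g1", "g2", "g3", "g5", "g2", "g4", "g7"),
  go   = c(rep("GO:0000001", 4), rep("GO:0000002", 3)),
  stringsAsFactors = FALSE
)

# Hypothetical DE genes and enriched terms
DE_genes_vector <- c("g1", "g2", "g3", "g4")
enriched_terms <- c("GO:0000001", "GO:0000002")

# Universe: every gene that has at least one GO annotation
universe_vector <- unique(gene2go$gene)

# Keep only rows for DE genes in enriched terms, then group genes by term
keep <- gene2go$gene %in% DE_genes_vector & gene2go$go %in% enriched_terms
geneSets <- split(gene2go$gene[keep], gene2go$go[keep])
```

split() returns a list named by GO term, which is the shape described above for geneSets.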

Finally, you should be able to use my_object with all of the nice plotting functions, such as cnetplot(). To make sure, check the class of my_object:

class(my_object)
[1] "enrichResult"
attr(,"package")
[1] "DOSE"

Hope this helps someone else!


Dear Charles,

Thanks for your question, and I appreciate your solution.

I have a gene list from DNA methylation results and I'm using the DAVID web-based tools to find the associated GO terms and KEGG pathways.

Could you please advise on a workable command for the case where I only have a gene list?

Really appreciate that. :)

Best wishes, WF


Hi WF,

First comment: DAVID is very out of date now, so you probably shouldn't use it. However, if you do still choose to use DAVID, run your DAVID analyses directly through the clusterProfiler package: https://guangchuangyu.github.io/2015/03/david-functional-analysis-with-clusterprofiler/. That way you don't have to do any data reformatting etc. to do the plots.

Charles
