I am running GSEA on the differential expression results of a deep-sequencing RNA dataset, using the clusterProfiler package in R.
library(clusterProfiler)
library(tibble)
myGSEA.res <- GSEA(mydata.gsea, TERM2GENE=hs_gsea_c2, verbose=FALSE)
myGSEA.df <- as_tibble(myGSEA.res@result)
I was able to obtain the table. However, I need to re-create a heatmap in R, similar to the one that I can obtain from the GSEA software, as shown at this link:
Would the heatmap be for the leading edge of a particular gene set? You can use the ComplexHeatmap package to make publication-quality heatmaps. Read the ComplexHeatmap Complete Reference to learn more. The general steps are as follows:
1) Assuming you have a matrix, x, of log2 expression data with samples as columns and genes as rows, subset the matrix to only those genes in a particular leading edge. Call this smaller matrix x2, for example.
2) Use Heatmap(x2, cluster_columns = FALSE) to create the basic heatmap. You will need to read through a few of the chapters in the reference book linked above to understand how things work.
3) To save the heatmap to a file, open the desired graphics device with a function such as png(), draw the heatmap, and then close the graphics device with dev.off(). See this blog post for how to save a heatmap with specific cell and file dimensions.
In the GSEA software (Java software), there is a heatmap called "Heat Map of the Top 50 Features," which is the one I linked in the original post. Do you know how to reproduce that figure?
Also, from the GSEA output, how would you write the code to create the heatmap?
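For the leading-edge version described in the steps above, here is a minimal sketch rather than a definitive recipe. It assumes the clusterProfiler result table has ID and core_enrichment columns (the latter holding the leading-edge genes separated by "/"), and that the row names of the expression matrix use the same gene identifiers as the ranked list passed to GSEA(). The objects x and gsea_df and the names SET_A and SET_B are toy stand-ins so the code runs on its own; in practice you would use your own log2 expression matrix and myGSEA.res@result. The output file name and dimensions are arbitrary.

```r
# Toy stand-ins so the sketch runs on its own; replace with your real objects.
set.seed(1)
genes <- paste0("gene", 1:20)
x <- matrix(rnorm(20 * 6), nrow = 20,
            dimnames = list(genes, paste0("sample", 1:6)))  # log2 expression

# Hypothetical table mimicking the ID and core_enrichment columns of
# myGSEA.res@result
gsea_df <- data.frame(
  ID = c("SET_A", "SET_B"),
  core_enrichment = c("gene1/gene2/gene5/gene9", "gene3/gene4"),
  stringsAsFactors = FALSE
)

# Leading-edge genes for one gene set: core_enrichment is "/"-separated
set_id <- "SET_A"
le_genes <- strsplit(gsea_df$core_enrichment[gsea_df$ID == set_id], "/")[[1]]

# Step 1: keep only the leading-edge genes that are present in the matrix
x2 <- x[intersect(le_genes, rownames(x)), , drop = FALSE]

# Optional: row-wise z-scores so colours show relative expression per gene
x2_scaled <- t(scale(t(x2)))

# Steps 2 and 3: build the heatmap and save it (only if ComplexHeatmap is installed)
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht <- ComplexHeatmap::Heatmap(x2_scaled, name = "z-score",
                                cluster_columns = FALSE)
  png("leading_edge_heatmap.png", width = 1200, height = 1000, res = 150)
  ComplexHeatmap::draw(ht)
  dev.off()
}
```

If the identifiers in core_enrichment differ from the matrix row names (for example Entrez IDs versus gene symbols), they need to be mapped to a common type before subsetting, otherwise x2 will come out empty.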
Based on the GSEA User Guide and the original GSEA paper, the "top 50 features" are the genes that have the highest correlation with the phenotype of interest. As James MacDonald said, this has nothing to do with GSEA itself. Why not just use the GSEA software if you want to recreate the same output?
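If you do want something similar in R, a rough approximation is to rank genes by their association with the phenotype and plot the extremes. The sketch below assumes a two-class phenotype and uses a plain signal-to-noise score, (mean_A - mean_B) / (sd_A + sd_B), as the ranking metric. That is a simplification and is not guaranteed to match what the GSEA desktop software computes, so the selected genes may differ. The matrix x, the pheno factor, and the class labels A and B are simulated placeholders, not anything from the original analysis.

```r
# Simulated stand-in data: 500 genes, 4 samples per phenotype class
set.seed(42)
n_genes <- 500
pheno <- factor(rep(c("A", "B"), each = 4))
x <- matrix(rnorm(n_genes * 8), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)),
                            paste0("s", 1:8)))

# Simplified signal-to-noise score per gene (assumption: not GSEA's exact metric)
is_a <- pheno == "A"
mu_a <- rowMeans(x[, is_a])
mu_b <- rowMeans(x[, !is_a])
sd_a <- apply(x[, is_a], 1, sd)
sd_b <- apply(x[, !is_a], 1, sd)
s2n <- (mu_a - mu_b) / (sd_a + sd_b)

# 50 genes most associated with each phenotype class
ranked <- names(sort(s2n, decreasing = TRUE))
top_a <- head(ranked, 50)
top_b <- rev(tail(ranked, 50))  # most strongly B-associated first
x_top <- x[c(top_a, top_b), ]

# Row-wise z-scores for display
x_top_scaled <- t(scale(t(x_top)))

# Keep the ranked row order and group the columns by phenotype
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht <- ComplexHeatmap::Heatmap(
    x_top_scaled, name = "z-score",
    cluster_rows = FALSE, cluster_columns = FALSE,
    column_split = pheno,
    row_split = rep(c("Up in A", "Up in B"), each = 50),
    show_row_names = FALSE
  )
  ComplexHeatmap::draw(ht)
}
```

Clustering is switched off for both rows and columns so that the rows stay in ranked order and the samples stay grouped by class.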