Question

Obtain the first gene expression heatmaps from GOexpress package gene ontology enrichment analysis results

0

Entering edit mode

Antonio Campos • 0

@antonio-campos-22114

Last seen 4.5 years ago

Brazil

Hello,

I'm using the GOexpress package for gene ontology enrichment analysis based on the results of a few RNA-Seq assays. I was wondering if there is a simple way of obtain gene expression heatmaps for several GO terms recursively.

Here is a code using the example data mentioned in the GOexpress manual:

library(GOexpress)

data(AlvMac)

Performing gene ontology enrichment analysis with the command below:

AlvMac_results <- GO_analyse(eSet = AlvMac, f = "Treatment")

Obtaining p-values for the enrichment analysis by permutation (warning - relatively slow function):

AlvMac_results.pVal <- pValue_GO(result = AlvMac_results, N = 100)

Filtering results, for obtaining only GO terms of biological process hierarchy with five or more enriched genes:

BP.5 <- subset_scores( result = AlvMac_results.pVal, namespace = "biological_process", total = 5, # requires 5 or more associated genes p.val=0.05)

Finally, below is the function that generates the heatmap for this specific GO term (GO:0034142) given as an example in the manual. How do I plot heatmaps for the first few other GO terms (say 10 or 15 results) automatically?

heatmap_GO(go_id = "GO:0034142", result = BP.5, eSet = AlvMac, cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)

Thank you for your attention.

GOexpress heatmap gene ontology plot • 1.2k views

ADD COMMENT • link updated 4.5 years ago by kevin.rue ▴ 350 • written 4.5 years ago by Antonio Campos • 0

score 1 · Accepted Answer · 2019-10-11

Hi Antonio,

Thanks for your question and the minimum reproducible example using the documentation.

To plot heatmaps automatically, you can use a for loop that iterates over a vector of GO identifiers. For instance, in the block below, I identify the top 5 most significant GO categories (i.e., sorted by increasing p-value).

library(dplyr)
# Exclude GO categories with less than 1 gene in the data set,
# as the heat map clustering doesn't work for a single gene
plot_go_ids <- head(subset(arrange(BP.5$GO, p.val), data_count > 1), 5)[["go_id"]]

# Plot the selected GO categories
for (go_id in plot_go_ids) {
    heatmap_GO(
        go_id = go_id, result = BP.5, eSet = AlvMac,
        cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
}

# Alternatively, save each plot to a file
for (go_id in plot_go_ids) {
    plot_filename <- sprintf("%s.png", gsub(":", "_", go_id))
    png(plot_filename, width = 6, height = 6, units = "in", res = 300)
    heatmap_GO(
        go_id = go_id, result = BP.5, eSet = AlvMac,
        cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
    dev.off()
}

Obviously, the challenge there is that different GO categories will be associated with different numbers of genes, so it is quite likely that cexRow and cexCol need to be optimised for each heat map individually.

Best wishes, Kevin