Question: Obtain the first gene expression heatmaps from GOexpress package gene ontology enrichment analysis results
0
5 weeks ago by
Brazil
Antonio Campos0 wrote:

Hello,

I'm using the GOexpress package for gene ontology enrichment analysis based on the results of a few RNA-Seq assays. I was wondering if there is a simple way of obtain gene expression heatmaps for several GO terms recursively.

Here is a code using the example data mentioned in the GOexpress manual:

library(GOexpress)

data(AlvMac)

Performing gene ontology enrichment analysis with the command below:

AlvMac_results <- GO_analyse(eSet = AlvMac, f = "Treatment")

Obtaining p-values for the enrichment analysis by permutation (warning - relatively slow function):

AlvMac_results.pVal <- pValue_GO(result = AlvMac_results, N = 100)

Filtering results, for obtaining only GO terms of biological process hierarchy with five or more enriched genes:

BP.5 <- subset_scores( result = AlvMac_results.pVal, namespace = "biological_process", total = 5, # requires 5 or more associated genes p.val=0.05)

Finally, below is the function that generates the heatmap for this specific GO term (GO:0034142) given as an example in the manual. How do I plot heatmaps for the first few other GO terms (say 10 or 15 results) automatically?

heatmap_GO(go_id = "GO:0034142", result = BP.5, eSet = AlvMac, cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)

modified 4 weeks ago by kevin.rue230 • written 5 weeks ago by Antonio Campos0
Answer: Obtain the first gene expression heatmaps from GOexpress package gene ontology e
1
4 weeks ago by
kevin.rue230
University of Oxford
kevin.rue230 wrote:

Hi Antonio,

Thanks for your question and the minimum reproducible example using the documentation.

To plot heatmaps automatically, you can use a for loop that iterates over a vector of GO identifiers. For instance, in the block below, I identify the top 5 most significant GO categories (i.e., sorted by increasing p-value).

library(dplyr)
# Exclude GO categories with less than 1 gene in the data set,
# as the heat map clustering doesn't work for a single gene
plot_go_ids <- head(subset(arrange(BP.5\$GO, p.val), data_count > 1), 5)[["go_id"]]

# Plot the selected GO categories
for (go_id in plot_go_ids) {
heatmap_GO(
go_id = go_id, result = BP.5, eSet = AlvMac,
cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
}

# Alternatively, save each plot to a file
for (go_id in plot_go_ids) {
plot_filename <- sprintf("%s.png", gsub(":", "_", go_id))
png(plot_filename, width = 6, height = 6, units = "in", res = 300)
heatmap_GO(
go_id = go_id, result = BP.5, eSet = AlvMac,
cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
dev.off()
}


Obviously, the challenge there is that different GO categories will be associated with different numbers of genes, so it is quite likely that cexRow and cexCol need to be optimised for each heat map individually.

Best wishes, Kevin

Hello Kevin, thank you for you feedback.

Best regards