Obtain the first gene expression heatmaps from GOexpress package gene ontology enrichment analysis results
1
0
Entering edit mode
@antonio-campos-22114
Last seen 5.1 years ago
Brazil

Hello,

I'm using the GOexpress package for gene ontology enrichment analysis based on the results of a few RNA-Seq assays. I was wondering if there is a simple way of obtain gene expression heatmaps for several GO terms recursively.

Here is a code using the example data mentioned in the GOexpress manual:

library(GOexpress)

data(AlvMac)

Performing gene ontology enrichment analysis with the command below:

AlvMac_results <- GO_analyse(eSet = AlvMac, f = "Treatment")

Obtaining p-values for the enrichment analysis by permutation (warning - relatively slow function):

AlvMac_results.pVal <- pValue_GO(result = AlvMac_results, N = 100)

Filtering results, for obtaining only GO terms of biological process hierarchy with five or more enriched genes:

BP.5 <- subset_scores( result = AlvMac_results.pVal, namespace = "biological_process", total = 5, # requires 5 or more associated genes p.val=0.05)

Finally, below is the function that generates the heatmap for this specific GO term (GO:0034142) given as an example in the manual. How do I plot heatmaps for the first few other GO terms (say 10 or 15 results) automatically?

heatmap_GO(go_id = "GO:0034142", result = BP.5, eSet = AlvMac, cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)

Thank you for your attention.

GOexpress heatmap gene ontology plot • 1.5k views
ADD COMMENT
1
Entering edit mode
kevin.rue ▴ 350
@kevinrue-6757
Last seen 6 months ago
University of Oxford

Hi Antonio,

Thanks for your question and the minimum reproducible example using the documentation.

To plot heatmaps automatically, you can use a for loop that iterates over a vector of GO identifiers. For instance, in the block below, I identify the top 5 most significant GO categories (i.e., sorted by increasing p-value).

library(dplyr)
# Exclude GO categories with less than 1 gene in the data set,
# as the heat map clustering doesn't work for a single gene
plot_go_ids <- head(subset(arrange(BP.5$GO, p.val), data_count > 1), 5)[["go_id"]]

# Plot the selected GO categories
for (go_id in plot_go_ids) {
    heatmap_GO(
        go_id = go_id, result = BP.5, eSet = AlvMac,
        cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
}

# Alternatively, save each plot to a file
for (go_id in plot_go_ids) {
    plot_filename <- sprintf("%s.png", gsub(":", "_", go_id))
    png(plot_filename, width = 6, height = 6, units = "in", res = 300)
    heatmap_GO(
        go_id = go_id, result = BP.5, eSet = AlvMac,
        cexRow = 0.4, cexCol = 1, cex.main = 1, main.Lsplit = 30)
    dev.off()
}

Obviously, the challenge there is that different GO categories will be associated with different numbers of genes, so it is quite likely that cexRow and cexCol need to be optimised for each heat map individually.

Best wishes, Kevin

ADD COMMENT
0
Entering edit mode

Hello Kevin, thank you for you feedback.

Best regards

ADD REPLY

Login before adding your answer.

Traffic: 869 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6