Gene symbols on MA-plot
2
0
Entering edit mode
@gabriel-nathan-kaufman-mr-4461
Last seen 6.4 years ago

I would like to include gene symbols on a MA plot in the same manner as is implemented in the volcano plot command. I have been using the plotMA command from the limma package, but the best workaround I have seen involves extracting the point coordinates using the locator function and labelling them one by one. Is there any better solution?

-----------------------

Here are representative plot commands:

volcanoplot(data.fit.eb, coef = "IVIgiTreg_vs_n_iTreg", highlight = 20, names = GeneSymbol, main = "IVIgiTreg vs n_iTreg")

This gives a plot with labels.

plotMA(data.fit.eb, coef = "IVIgiTreg_vs_n_iTreg", xlab = "Average log-expression", ylab = "Log Fold Change", main = "IVIg_iTreg vs n_iTreg")

This gives a plot without labels, and I am not sure how to add them. I was thinking of using the text function, but how to  automatically extract the coordinates of those genes that are most differentially expressed (e.g. the top 20 log fold-changes)?

The plot images can be found here in PDF files:

https://www.dropbox.com/sh/a4ykjkbp8167bry/AACDZ7gpasj7bj9jiu17ycPOa?dl=0

-----------------------

A related question: is it possible to obtain a smoothScatter-like density gradient coloration in these plots? I have seen this implemented in some plots generated from the oligo package but not from limma plots...

-----------------------

My sessionInfo() output is provided here:

R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] mogene20sttranscriptcluster.db_8.3.1 org.Mm.eg.db_3.1.2
[3] pd.mogene.2.0.st_3.14.1              limma_3.24.12
[5] annotate_1.46.0                      XML_3.98-1.2
[7] hgu133plus2.db_3.1.3                 org.Hs.eg.db_3.1.2
[9] AnnotationDbi_1.30.1                 GenomeInfoDb_1.4.1
[11] pd.hg.u133.plus.2_3.12.0             RSQLite_1.0.0
[13] DBI_0.3.1                            oligo_1.32.0
[15] Biostrings_2.36.1                    XVector_0.8.0
[17] IRanges_2.2.5                        S4Vectors_0.6.1
[19] Biobase_2.28.0                       oligoClasses_1.30.0
[21] BiocGenerics_0.14.0

loaded via a namespace (and not attached):
[1] affxparser_1.40.0     GenomicRanges_1.20.5  splines_3.2.1         zlibbioc_1.14.0
[5] bit_1.1-12            xtable_1.7-4          foreach_1.4.2         tools_3.2.1
[9] ff_2.2-13             iterators_1.0.7       preprocessCore_1.30.0 affyio_1.36.0
[13] codetools_0.2-11      BiocInstaller_1.18.3 
limma plot.MA volcanoplot oligo • 4.2k views
7
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

You can add features to the plot after creating it. To add gene symbols for the 20 most significant genes:

o <- order(data.fit.eb$p.value[,"IVIgiTreg_vs_n_iTreg"]) x <- data.fit.eb$Amean
y <- data.fit.eb$coefficients[,"IVIgiTreg_vs_n_iTreg"] G <- data.fit.eb$genes\$GeneSymbol
text(x[o[1:20], y[o[1:20], labels=G[o[1:20]])

We haven't made this a built-in feature of plotMA or plotMD because different people want so many different things, and because we find that publication quality labels usually have to be done using pdf publishing software anyway.

0
Entering edit mode

Hi Dr Smyth,

Thanks for your clarification and command strings. However, this order() command sorts based on the p.value. Is there any way to sort on the LogFC values, which end up being the y-coordinates of the MA-plot? In other words, what slot of the data.fit.eb object should I be calling?

0
Entering edit mode

The MA-plot shows logFC on the y-axis, and that's what y is in my code.

0
Entering edit mode
svlachavas ▴ 780
@svlachavas-7225
Last seen 9 days ago
Germany/Heidelberg/German Cancer Resear…

Dear Gabriel,

i'm not sure if can be used for the MAplot you are mentioning, but you could check this vignette:

http://www.gettinggeneticsdone.com/2014/05/r-volcano-plots-to-visualize-rnaseq-microarray.html

It uses the packag calibrate to label the genes of interest in the volcano plot-although i believe that it suit best to a maximum of top20 genes to be shown clearly in any plot