Question

Gene symbols on MA-plot

0

Entering edit mode

Gabriel Nathan Kaufman, Mr ▴ 60

@gabriel-nathan-kaufman-mr-4461

Last seen 8.8 years ago

Canada

I would like to include gene symbols on a MA plot in the same manner as is implemented in the volcano plot command. I have been using the plotMA command from the limma package, but the best workaround I have seen involves extracting the point coordinates using the locator function and labelling them one by one. Is there any better solution?

-----------------------

Here are representative plot commands:

volcanoplot(data.fit.eb, coef = "IVIgiTreg_vs_n_iTreg", highlight = 20, names = GeneSymbol, main = "IVIgiTreg vs n_iTreg")

This gives a plot with labels.

plotMA(data.fit.eb, coef = "IVIgiTreg_vs_n_iTreg", xlab = "Average log-expression", ylab = "Log Fold Change", main = "IVIg_iTreg vs n_iTreg")

This gives a plot without labels, and I am not sure how to add them. I was thinking of using the text function, but how to automatically extract the coordinates of those genes that are most differentially expressed (e.g. the top 20 log fold-changes)?

The plot images can be found here in PDF files:

https://www.dropbox.com/sh/a4ykjkbp8167bry/AACDZ7gpasj7bj9jiu17ycPOa?dl=0

-----------------------

A related question: is it possible to obtain a smoothScatter-like density gradient coloration in these plots? I have seen this implemented in some plots generated from the oligo package but not from limma plots...

-----------------------

My sessionInfo() output is provided here:

R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mogene20sttranscriptcluster.db_8.3.1 org.Mm.eg.db_3.1.2                  
 [3] pd.mogene.2.0.st_3.14.1              limma_3.24.12                       
 [5] annotate_1.46.0                      XML_3.98-1.2                        
 [7] hgu133plus2.db_3.1.3                 org.Hs.eg.db_3.1.2                  
 [9] AnnotationDbi_1.30.1                 GenomeInfoDb_1.4.1                  
[11] pd.hg.u133.plus.2_3.12.0             RSQLite_1.0.0                       
[13] DBI_0.3.1                            oligo_1.32.0                        
[15] Biostrings_2.36.1                    XVector_0.8.0                       
[17] IRanges_2.2.5                        S4Vectors_0.6.1                     
[19] Biobase_2.28.0                       oligoClasses_1.30.0                 
[21] BiocGenerics_0.14.0                 

loaded via a namespace (and not attached):
 [1] affxparser_1.40.0     GenomicRanges_1.20.5  splines_3.2.1         zlibbioc_1.14.0      
 [5] bit_1.1-12            xtable_1.7-4          foreach_1.4.2         tools_3.2.1          
 [9] ff_2.2-13             iterators_1.0.7       preprocessCore_1.30.0 affyio_1.36.0        
[13] codetools_0.2-11      BiocInstaller_1.18.3

limma plot.MA volcanoplot oligo • 5.7k views

ADD COMMENT • link updated 8.8 years ago by Gordon Smyth 50k • written 8.8 years ago by Gabriel Nathan Kaufman, Mr ▴ 60

score 7 · Answer 1 · 2015-07-05

7

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 3 hours ago

WEHI, Melbourne, Australia

You can add features to the plot after creating it. To add gene symbols for the 20 most significant genes:

o <- order(data.fit.eb$p.value[,"IVIgiTreg_vs_n_iTreg"])
x <- data.fit.eb$Amean
y <- data.fit.eb$coefficients[,"IVIgiTreg_vs_n_iTreg"]
G <- data.fit.eb$genes$GeneSymbol
text(x[o[1:20], y[o[1:20], labels=G[o[1:20]])

We haven't made this a built-in feature of plotMA or plotMD because different people want so many different things, and because we find that publication quality labels usually have to be done using pdf publishing software anyway.

ADD COMMENT • link 8.8 years ago Gordon Smyth 50k

0

Entering edit mode

Hi Dr Smyth,

Thanks for your clarification and command strings. However, this order() command sorts based on the p.value. Is there any way to sort on the LogFC values, which end up being the y-coordinates of the MA-plot? In other words, what slot of the data.fit.eb object should I be calling?

ADD REPLY • link 8.8 years ago Gabriel Nathan Kaufman, Mr ▴ 60

0

Entering edit mode

The MA-plot shows logFC on the y-axis, and that's what y is in my code.

For more information, type help("MArrayLM-class").

ADD REPLY • link 8.8 years ago Gordon Smyth 50k

score 0 · Answer 2 · 2015-07-05

Dear Gabriel,

i'm not sure if can be used for the MAplot you are mentioning, but you could check this vignette:

http://www.gettinggeneticsdone.com/2014/05/r-volcano-plots-to-visualize-rnaseq-microarray.html

It uses the packag calibrate to label the genes of interest in the volcano plot-although i believe that it suit best to a maximum of top20 genes to be shown clearly in any plot