Question

limma::plotMA displays one sample vs. others

0

Lisa Cohen ▴ 50

@lisa-cohen-6190

Last seen 10.3 years ago

United States

Hello! What is the true definition of an MA plot? limma::plotMA displays the expression log-ratio of one sample vs. the others, even with NGS platform data.

My understanding is this package was developed for microarrays with two-color data and MA plots then were useful to see one sample/array vs. others. I could not find an explanation for how to look at log-ratio for the mean expression of all NGS platform data in an experiment in the limma user's manual.

Thank you!

Lisa

limma plotMA • 4.6k views

ADD COMMENT • link updated 10.3 years ago by Gordon Smyth 53k • written 10.3 years ago by Lisa Cohen ▴ 50

score 2 · Answer 1 · 2015-08-27

2

Gordon Smyth 53k

@gordon-smyth

Last seen 8 hours ago

WEHI, Melbourne, Australia

The following article summarizes the history and definition of MA plots:

http://nar.oxfordjournals.org/content/43/7/e47

See the section on "Mean-difference plots".

Briefly, the concept of an MA plot was created 13-14 years ago by the sma package for a single two colour array, by the affy package for a pair of single channel arrays, and by the limma package for a fitted model or for a matrix of log expression values. These packages provided the original definitions of what an MA plot is.

I'm not quite sure of what you are trying to achieve or why you are having difficulties. To run limma on NGS data you simply run the voom transformation or else use edgeR's cpm function to convert to logCPM. Then you run the analysis similarly as for microarray data.
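To make the logCPM step concrete, here is a minimal base-R sketch of a voom-style log2 counts-per-million transformation. The count matrix is made up for illustration, and no normalization factors are applied. In a real analysis you would call voom or edgeR's cpm as described above rather than computing this by hand, and edgeR's cpm handles the offset somewhat differently.

```r
# Hypothetical count matrix: 5 genes (rows) x 3 samples (columns).
counts <- matrix(c(10, 0, 250, 40, 5,
                   12, 1, 300, 35, 9,
                    8, 0, 190, 60, 2),
                 nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("S", 1:3)))

# Library size = total count per sample.
lib_size <- colSums(counts)

# voom-style log2 counts per million; the offsets 0.5 and 1 avoid log(0).
# Transposing makes each sample's counts divide by its own library size.
logCPM <- t(log2((t(counts) + 0.5) / (lib_size + 1) * 1e6))
```

The result has the same shape as the input (genes in rows, samples in columns) and can be treated like a matrix of microarray log-expression values for plotting.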

ADD COMMENT • link 10.3 years ago Gordon Smyth 53k

0

I think maybe the confusion is in the way the MA plots are used within the different packages, and how the 'difference' on the y-axis is defined. With limma the plotMA function -- is that meant for use only on two-colour arrays and not RNA-Seq data? The other packages like DESeq2 use the fold change between case and control groups -- is that the generally accepted value for the 'difference' axis?

ADD REPLY • link 10.3 years ago meeta.mistry ▴ 30

0

limma's plotMA is a generic function. It works on all types of data, including two colour microarrays, single channel microarrays, NGS and fitted model objects, and it implements all the concepts of an MA plot. The interpretation of what the M-value is depends on the data type, not so much on the package. If you read the article link I gave above, I think you might understand this better.
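For a plain matrix of log-expression values, the "one sample vs. others" display from the original question can be sketched in base R as follows. This assumes that the chosen column is compared with the average of all the other columns, which is my understanding of the matrix case; the helper name `ma_one_vs_rest` and the simulated data are hypothetical and not part of limma.

```r
set.seed(1)
# Hypothetical log2-expression matrix: 100 genes x 4 samples.
logexpr <- matrix(rnorm(400, mean = 8, sd = 2), nrow = 100,
                  dimnames = list(NULL, paste0("S", 1:4)))

# Illustrative helper (not a limma function): compare one column
# with the row-wise average of all remaining columns.
ma_one_vs_rest <- function(x, column = 1) {
  rest <- rowMeans(x[, -column, drop = FALSE], na.rm = TRUE)
  data.frame(A = (x[, column] + rest) / 2,  # average log-expression
             M = x[, column] - rest)        # log-ratio, sample vs. rest
}

ma <- ma_one_vs_rest(logexpr, column = 1)
plot(ma$A, ma$M, pch = 16, cex = 0.5,
     xlab = "A (average log-expression)", ylab = "M (log-ratio)")
abline(h = 0, col = "grey")
```

For a fitted model the axes are read differently: M is an estimated log-fold-change for a coefficient and A is the average log-expression of each gene across all samples.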

The different packages are actually pretty consistent in their interpretation of what an MA plot is, as far as I have seen. They have stuck pretty much to the original concepts introduced by sma and limma long ago. The DESeq and DESeq2 packages have simply copied limma's idea of an MA plot for a fitted model for use with their own fitted models. I think it's unfortunate that they created a function name conflict with the original function in the limma package, but that's another issue.

ADD REPLY • link 10.3 years ago Gordon Smyth 53k

score 1 · Answer 2 · 2015-08-27

The definition of an MA plot is a scatter plot with average log intensities (that is, the log of the geometric mean intensity) on the horizontal axis, and log ratios on the vertical axis. It's essentially a Bland-Altman plot comparing two samples. Which two samples you choose depends on what you are trying to show. If you are trying to show differences between two particular samples, then you would plot those two samples against each other.
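As a small worked example of that definition for two samples, assuming made-up raw intensities `x1` and `x2`:

```r
# Hypothetical raw intensities for four genes in two samples.
x1 <- c(100, 400, 50, 1600)
x2 <- c(200, 100, 50,  400)

M <- log2(x1) - log2(x2)        # log-ratio (vertical axis)
A <- (log2(x1) + log2(x2)) / 2  # mean log-intensity (horizontal axis)

# A is the log2 of the geometric mean of the two intensities.
stopifnot(isTRUE(all.equal(A, log2(sqrt(x1 * x2)))))

plot(A, M, pch = 16, xlab = "A", ylab = "M")
abline(h = 0, col = "grey")
```

Genes with equal intensity in both samples fall on the horizontal line M = 0.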

I generally don't use MA plots in that way, instead preferring to use them to look for obvious sample-specific problems, in which case I don't want to look at all pairwise comparisons. Instead, I want to look at each sample versus a single representative sample. Rather than arbitrarily picking one sample, I construct a pseudo-sample based on the median value for each gene over all samples, and then do MA plots against that. Something like:

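library(ggplot2)

maplot <- function(object) {
    ## this could be better
    ## Accept an ExpressionSet (needs Biobase) or anything coercible to a
    ## matrix of log-expression values (genes in rows, samples in columns).
    if (methods::is(object, "ExpressionSet")) {
        mat <- Biobase::exprs(object)
    } else {
        mat <- as.matrix(object)
    }
    ## Faceting needs sample labels, so supply defaults if there are none.
    if (is.null(colnames(mat))) {
        colnames(mat) <- paste0("Sample", seq_len(ncol(mat)))
    }
    ## Pseudo-sample: per-gene median over all samples.
    med <- apply(mat, 1, median, na.rm = TRUE)
    M <- mat - med        # log-ratio of each sample vs. the pseudo-sample
    A <- (mat + med) / 2  # average of each sample and the pseudo-sample
    df <- data.frame(M = as.vector(M), A = as.vector(A),
                     Id = colnames(mat)[col(mat)])
    ## One MA panel per sample.
    ggplot(df, aes(A, M)) + geom_point(size = 0.05) + facet_wrap(~Id)
}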