Question: limma::plotMA displays one sample vs. others
0
4.0 years ago by
Lisa Cohen50
United States
Lisa Cohen50 wrote:

Hello! What is the true definition of an MA plot? limma::plotMA displays the expression log-ratio of one sample vs. the others, even with NGS platform data.

My understanding is this package was developed for microarrays with two-color data and MA plots then were useful to see one sample/array vs. others. I could not find an explanation for how to look at log-ratio for the mean expression of all NGS platform data in an experiment in the limma user's manual.

Thank you!

Lisa

modified 4.0 years ago by Gordon Smyth38k • written 4.0 years ago by Lisa Cohen50
Answer: limma::plotMA displays one sample vs. others
2
4.0 years ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:

The following article summarizes the history and definition of MA plots:

See the section on "Mean-difference plots".

Briefly, the concept of an MA plot was created 13-14 years ago by the sma package for a single two colour array, by the rma package for a pair of single channel arrays, and by the limma package for a fitted model or for a matrix of log expression values. These packages provided the original definitions of what an MA plot is.

I'm not quite sure what you are trying to achieve or why you are having difficulties. To run limma on NGS data you simply run the voom transformation or else use edgeR's cpm function to convert to logCPM. Then you run the analysis just as you would for microarray data.
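As a rough illustration of the two routes mentioned above, here is a minimal sketch on simulated counts. The count matrix, the group labels, the prior.count value and the choice of coef = 2 are all assumptions made up for the example; they are not taken from the question.

```r
library(edgeR)   # edgeR depends on limma, so both packages are attached

set.seed(1)
# Hypothetical data: 1000 genes by 6 samples of simulated read counts
counts <- matrix(rnbinom(6000, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
group  <- factor(rep(c("ctrl", "case"), each = 3), levels = c("ctrl", "case"))
design <- model.matrix(~ group)

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Route 1: convert to logCPM, then plot one sample against the others
logcpm <- cpm(y, log = TRUE, prior.count = 2)
limma::plotMA(logcpm, array = 1)

# Route 2: voom, then fit a linear model as for microarray data
v   <- voom(y, design)
fit <- eBayes(lmFit(v, design))
limma::plotMA(fit, coef = 2)   # log-fold-change (case vs ctrl) against average log-expression
```

In the first plot the M-value is a sample-level log-ratio; in the second it is the fitted log-fold-change for the chosen coefficient.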

I think maybe the confusion is in the way MA plots are used within the different packages, and how the 'difference' on the y-axis is defined. Is limma's plotMA function meant for use only on two-colour arrays and not RNA-Seq data? Other packages like DESeq2 use the fold change between case and control groups. Is that the generally accepted value for the 'difference' axis?

limma's plotMA is a generic function. It works on all types of data, including two colour microarrays, single channel microarrays, NGS and fitted model objects, and it implements all the concepts of an MA plot. The interpretation of what the M-value is depends on the data type, not so much on the package. If you read the article I linked above, I think you might understand this better.

The different packages are actually pretty consistent in their interpretation of what an MA plot is, as far as I have seen. They have stuck pretty much to the original concepts introduced by sma and limma long ago. The DESeq and DESeq2 packages have simply copied limma's idea of an MA plot for a fitted model for use with their own fitted models. I think it's unfortunate that they created a function name conflict with the original function in the limma package, but that's another issue.

Answer: limma::plotMA displays one sample vs. others
1
4.0 years ago by
United States
James W. MacDonald50k wrote:

The definition of an MA plot is a scatter plot with geometric mean intensities on the horizontal axis, and log ratios on the vertical axis. It's essentially a Bland-Altman plot comparing two samples. What two samples you choose to use depends on what you are trying to show. If you are trying to show differences between two samples, then you would want to plot the two samples.
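For the two-sample case, the definition above can be sketched in a few lines of base R. The simulated log2 values and the column names s1 and s2 are hypothetical, chosen only for illustration.

```r
set.seed(1)
# Hypothetical log2 expression values for two samples
x <- matrix(rnorm(200, mean = 8), ncol = 2,
            dimnames = list(NULL, c("s1", "s2")))

M <- x[, "s1"] - x[, "s2"]         # log-ratio (difference of log values)
A <- (x[, "s1"] + x[, "s2"]) / 2   # average log value, i.e. log of the geometric mean

plot(A, M, pch = 16, cex = 0.5, xlab = "A (average log-expression)",
     ylab = "M (log-ratio)")
abline(h = 0, col = "red")
```

Because the inputs are already on the log scale, the plain average on the x-axis corresponds to the geometric mean of the raw intensities.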

I generally don't use MA plots in that way, instead preferring to use them to look for obvious sample-specific problems, in which case I don't want to look at all pairwise comparisons. Instead, I want to look at each sample versus a single representative sample. Rather than arbitrarily picking one sample, I construct a pseudo sample based on the median value for each gene over all samples, and then do MA plots against that. Something like

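    library(ggplot2)

    ## MA plots of every sample against a per-gene median pseudo-sample
    maplot <- function(object) {
        ## this could be better
        if (is(object, "ExpressionSet")) {
            mat <- Biobase::exprs(object)
        } else {
            mat <- as.matrix(object)
        }
        ## fall back to generic labels if the columns are unnamed
        if (is.null(colnames(mat)))
            colnames(mat) <- paste0("sample", seq_len(ncol(mat)))
        ## pseudo-sample: median of each gene (row) over all samples
        med <- apply(mat, 1, median, na.rm = TRUE)
        M <- mat - med          # log-ratio of each sample to the pseudo-sample
        A <- (mat + med) / 2    # average of each sample and the pseudo-sample
        df <- data.frame(M = as.vector(M), A = as.vector(A),
                         Id = colnames(mat)[col(mat)])
        ggplot(df, aes(A, M)) + geom_point(size = 0.05) + facet_wrap(~Id)
    }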



Hi Jim,

You probably already know this, but your maplot() function makes the same plot as limma's plotMA function did way back in 2003 -- except for the use of ggplot() instead of plot(). These days, limma's plotMA function is more careful to make sure that the average sample doesn't share any technical correlation with the target sample. Also I stopped using medians and went back to means.
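One way to read that last point is that the reference for each sample is built from the other samples only, so the target sample does not contribute to its own baseline. The following is a minimal base R sketch of that idea, using means as described; ma_vs_others is a hypothetical helper written for this illustration, not limma's actual code.

```r
# Hypothetical helper: compare column j with the mean of all OTHER columns
ma_vs_others <- function(mat, j) {
  others <- rowMeans(mat[, -j, drop = FALSE], na.rm = TRUE)
  data.frame(A = (mat[, j] + others) / 2,   # average of target and reference
             M = mat[, j] - others)         # log-ratio of target to reference
}

set.seed(1)
mat <- matrix(rnorm(400, mean = 8), ncol = 4)   # simulated log2 values, 4 samples
res <- ma_vs_others(mat, 1)

plot(res$A, res$M, pch = 16, cex = 0.5, xlab = "A", ylab = "M")
abline(h = 0, col = "red")
```

Compared with a median or mean over all samples, leaving the target out avoids shrinking its M-values toward zero, an effect that is most noticeable when there are only a few samples.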