Question: In limma's barcode plot, why are the cutoffs for the pale red and blue bands +/-sqrt(2)?
Paul Harrison (Monash University Bioinformatics Platform, Melbourne, Australia) wrote, 3.3 years ago:

limma's barcodeplot function by default draws a pale red rectangle for statistics greater than sqrt(2) and a pale blue rectangle for statistics less than -sqrt(2). What is the reasoning behind this choice?

Is the expectation that the statistics follow a standard normal distribution for genes that aren't differentially expressed, in which case about 16% of such genes would be highlighted?

Answer: In limma's barcode plot, why are the cutoffs for the pale red and blue bands +/-
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 3.3 years ago:

The default cutoff of sqrt(2) is chosen to agree with the roast() function. When the statistic is chosen to be the z-score equivalent of a moderated t-statistic, then the genes falling in the coloured regions in the barcodeplot will be the same genes that roast() counts when it calculates the proportion of genes contributing to the up and down p-values for the test set.

Why does roast() use sqrt(2)? This is based on an Akaike Information Criterion (AIC) argument. Suppose that you observe a test statistic z for assessing DE for a given gene. Suppose you use AIC to choose between the null model Z ~ N(0,1) and the alternative model Z ~ N(mu, 1), where mu is a parameter to be estimated. Then you will choose the more complex model if and only if abs(z) > sqrt(2).
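
For readers who want the algebra behind that statement, here is a short sketch. It assumes the usual definition AIC = -2 log L + 2k, where k is the number of estimated parameters, and a single observation z; the symbols AIC_0 and AIC_1 are labels introduced here for the null and alternative models.

$$
\begin{aligned}
\text{Null } Z \sim N(0,1),\ k = 0: \quad & \mathrm{AIC}_0 = -2\log L_0 = z^2 + \log(2\pi) \\
\text{Alternative } Z \sim N(\mu,1),\ \hat{\mu} = z,\ k = 1: \quad & \mathrm{AIC}_1 = -2\log L_1 + 2 = \log(2\pi) + 2 \\
\mathrm{AIC}_1 < \mathrm{AIC}_0 \iff z^2 > 2 \iff |z| > \sqrt{2}
\end{aligned}
$$

The maximum likelihood estimate of mu from one observation is z itself, so the fitted alternative has zero residual and pays only the penalty of 2 for its extra parameter.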

Note that, in the gene set testing context, we can get a significant result for the gene set even when the genes in the set are not individually significant. When we count genes contributing to a significant result, we want to include all the genes that seem more likely to be DE than not, hence the AIC argument. From this point of view, a p-value of 0.15 is quite acceptable.
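
As a rough numerical check of the cutoff and the p-value it implies, the following sketch uses only the Python standard library. The function names (aic_null, aic_alt, prefers_alternative) are made up for illustration and are not part of limma; the AIC formulas assume a single observation z and the standard definition AIC = -2 log L + 2k.

```python
import math
from statistics import NormalDist

LOG_2PI = math.log(2 * math.pi)

def aic_null(z):
    # Null model Z ~ N(0, 1): no estimated parameters, so AIC = -2 log L.
    return z ** 2 + LOG_2PI

def aic_alt(z):
    # Alternative Z ~ N(mu, 1): mu is estimated by z, so the residual is 0
    # and the AIC is the constant term plus a penalty of 2 for one parameter.
    return LOG_2PI + 2

def prefers_alternative(z):
    # AIC selects the model with the smaller value.
    return aic_alt(z) < aic_null(z)

cutoff = math.sqrt(2)

# Two-sided tail probability of a standard normal beyond +/- sqrt(2).
two_sided_p = 2 * (1 - NormalDist().cdf(cutoff))

print(prefers_alternative(1.4))  # False: just inside the cutoff
print(prefers_alternative(1.5))  # True: just outside the cutoff
print(round(two_sided_p, 4))     # 0.1573
```

So under a standard normal null, roughly 16% of non-DE genes would land in the coloured bands, which is the figure raised in the question and the approximate p-value of 0.15 mentioned above.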

Having said all that, the colouring used by barcodeplot() is only intended to be a guide. There is no reason that you can't set the colour cutoff differently for your own problem if you find that more helpful.
