Question

[limma] [Rfit] [samr] Gene expression distribution using lmFit and eBayes


Jérôme Lane ▴ 30

@jerome-lane-6256

Last seen 9.6 years ago

Hi, About three quarters of the genes in my microarray data have non-normally distributed expression values, with most Shapiro test p-values below 10^-5. To adjust for covariates I tried rank-based linear regression with rfit from the Rfit package (which makes no normality assumption about the residuals), followed by SAM (non-parametric) from the samr package, but the results were not as biologically relevant as those from lmFit + eBayes. I know that lmFit can analyse gene expression that is not strictly normal, but what is the limit? Is it statistically valid to use lmFit + eBayes on my data? Best regards, Jérôme Lane

Microarray Regression

Updated 10.4 years ago by Gordon Smyth 50k • written 10.4 years ago by Jérôme Lane ▴ 30

Gordon Smyth · Answer 1 · 2013-11-23

Dear Jerome,

The Shapiro test is only applicable to iid samples, so it is difficult to see how it could be used to test normality of expression values in a linear modelling context. If you have applied the test to the normalized expression values for each gene, then I suspect that the test is actually picking up differential expression rather than non-normality.
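To illustrate how differential expression alone can trigger the Shapiro test, here is a minimal simulated sketch in base R. The group sizes, means, standard deviation and seed are arbitrary assumptions, not values from the data in question. One gene is simulated with perfectly normal noise inside each group but a large shift between groups, and the test is applied first to the pooled values and then to the residuals from a simple group model.

```r
set.seed(42)

# Hypothetical design: 20 control and 20 treated samples
group <- factor(rep(c("control", "treated"), each = 20))

# One simulated gene: normal noise within each group, large shift between groups
y <- rnorm(40, mean = ifelse(group == "treated", 12, 8), sd = 0.5)

# Shapiro test on the pooled values: the two-group mixture is bimodal,
# so the test rejects even though the noise itself is normal
p_raw <- shapiro.test(y)$p.value

# Shapiro test on the residuals after removing the group means
p_resid <- shapiro.test(residuals(lm(y ~ group)))$p.value

print(c(raw = p_raw, residuals = p_resid))
```

In a simulation like this, the pooled p-value is typically tiny while the residual p-value is typically unremarkable. Residuals from a fitted model are only approximately iid, so even the second test should be read as a rough diagnostic rather than a formal one.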

The limma code is very robust against non-normality. All the usual microarray platforms and standard preprocessing procedures produce data that is normally distributed to a good enough approximation. Much effort has been devoted to developing good preprocessing and normalization algorithms.
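For reference, a typical lmFit + eBayes analysis with covariate adjustment might look like the sketch below. Everything in it is simulated: the expression matrix, the `age` covariate, the group labels and the size of the spiked-in effect are hypothetical placeholders, so treat it as an outline of the workflow rather than an analysis of the data in question.

```r
library(limma)

set.seed(1)
n_genes <- 200
n_samples <- 12

# Hypothetical sample annotation: a two-level group and one numeric covariate
group <- factor(rep(c("control", "treated"), each = 6))
age <- rnorm(n_samples, mean = 50, sd = 10)

# Simulated log-expression matrix (genes in rows, samples in columns)
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes, ncol = n_samples,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("s", seq_len(n_samples))))

# Spike a treatment effect into the first 10 genes
expr[1:10, group == "treated"] <- expr[1:10, group == "treated"] + 3

# Design matrix: group effect adjusted for the covariate
design <- model.matrix(~ group + age)

# Gene-wise linear models followed by empirical Bayes moderation
fit <- lmFit(expr, design)
fit <- eBayes(fit)

# Top genes for the treated vs control coefficient
top <- topTable(fit, coef = "grouptreated", number = 10)
print(top)
```

The covariates enter through the design matrix, so no separate adjustment step is needed before testing the coefficient of interest.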

The concept of "robustness" in statistical analysis goes back to a 1953 paper by George Box in Biometrika. In that paper, Box wrote of the "remarkable property of robustness to non-normality which [tests for comparing means] possess". The tests done by limma inherit the robustness property that Box was referring to. Box made the point that the robustness of the two-sample t-test was not improved by checking first for equal variances. He said

"To make the preliminary test on variances is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!"

I rather think that, if Box was still alive today, he might say something similar about a preliminary Shapiro test!
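The robustness Box described can be checked with a small simulation. The sketch below is an illustration in base R using an ordinary two-sample t-test, not limma's moderated test, and the sample size, number of simulations and choice of an exponential distribution are arbitrary assumptions. Both groups are drawn from the same strongly skewed distribution, so the null hypothesis is true and the rejection rate estimates the type I error.

```r
set.seed(1)

n_sim <- 5000        # number of simulated experiments
n_per_group <- 10    # samples per group

p_values <- replicate(n_sim, {
  x <- rexp(n_per_group)   # strongly skewed, clearly non-normal
  y <- rexp(n_per_group)   # same distribution, so there is no true difference
  t.test(x, y, var.equal = TRUE)$p.value
})

# Proportion of experiments rejected at the nominal 5% level
type1 <- mean(p_values < 0.05)
print(type1)
```

If the test is robust in Box's sense, the observed rejection rate should stay close to the nominal 0.05 despite the skewed data.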

Best wishes, Gordon