Question: missing value handling in limma
14.0 years ago by
xiaocui zhu70
xiaocui zhu70 wrote:
Hi all,

I recently used the linear model fit in limma to rank differentially expressed genes between treated and control samples in a data set. The data include three log2(Treated/Control) replicate sets, with a dye swap for each replicate, so the design matrix is c(1,-1,1,-1,1,-1).

Among the top-ranked genes, I noticed that some have only one log2Ratio measurement, with the rest being "NA". I set the log2Ratio of a gene to "NA" if its green or red intensity measurement is below background, saturated, low intensity, or non-uniform.

I am wondering how the linear model in limma handles missing values, and why a gene with only one data point is identified as a high-ranking differentially expressed gene.

Thank you for your help in advance!

Xiaocui
modified 14.0 years ago by Gordon Smyth33k • written 14.0 years ago by xiaocui zhu70
14.0 years ago by
Gordon Smyth33k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth33k wrote:

It is perfectly possible, although very unlikely, for a gene with only one non-missing value to be top-ranked (when analyzing two-color microarray data). It would have to have an extraordinarily large fold change for this to happen.

limma handles missing values in the usual way for linear models at the lmFit() step. A gene with only one value will get df.residual=0. At the shrinkage step, the residual standard deviation for such a gene will be reset to the consensus value across all genes, and the corresponding degrees of freedom will be df.prior. This is explained in the article Smyth, SAGMB, 2004, cited in the documentation.
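The behaviour described above can be checked on simulated data. The following is only a minimal sketch, not the original poster's analysis: the matrix `M`, the gene names, the noise model and the log-ratio of 6 are all made up for illustration. It assumes the limma package is installed and uses the dye-swap design from the question.

```r
library(limma)

set.seed(1)
n_genes <- 1000  # hypothetical number of genes

# Dye-swap design from the question: three replicates, each with a dye swap
design <- matrix(c(1, -1, 1, -1, 1, -1), ncol = 1,
                 dimnames = list(NULL, "TreatedVsControl"))

# Simulated log2(Treated/Control) ratios with no true differential expression.
# Gene-wise variances differ, so that a finite df.prior can be estimated.
sigma2 <- 0.09 * 4 / rchisq(n_genes, df = 4)
M <- matrix(rnorm(n_genes * 6, sd = sqrt(sigma2)), nrow = n_genes)
rownames(M) <- paste0("gene", seq_len(n_genes))

# gene1: a single non-missing value with a very large log-ratio
M[1, ] <- c(6, NA, NA, NA, NA, NA)

# lmFit() fits each gene using only its non-missing values
fit <- lmFit(M, design)
fit$df.residual[1]        # zero residual degrees of freedom for gene1

# eBayes() performs the shrinkage step
fit <- eBayes(fit)
fit$s2.post[1]            # posterior variance for gene1 ...
fit$s2.prior              # ... which is the consensus (prior) variance
fit$df.total[1]           # total degrees of freedom used for gene1
fit$df.prior              # prior degrees of freedom

# Inspect where gene1 lands in the ranking
topTable(fit, number = 5)
```

Because gene1 has no residual information of its own, its moderated t-statistic is its single log-ratio divided by the square root of the consensus variance, so only a very large log-ratio should push it towards the top of the table.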


PS. For a single channel technology, the gene would have to have 2 non-missing values before it could have a fold change and a p-value.
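The single-channel case can be sketched in the same way. Again this is an illustration under assumptions: the two-group layout, the expression matrix `E` and the values 8 and 10 are hypothetical, and the design is built with `model.matrix()` so that the second coefficient is the Treated vs. Control log fold change.

```r
library(limma)

set.seed(2)
n_genes <- 1000  # hypothetical number of genes

# Hypothetical single-channel experiment: 3 control and 3 treated arrays
group <- factor(rep(c("Control", "Treated"), each = 3))
design <- model.matrix(~ group)  # columns: (Intercept), groupTreated

# Simulated log2 expression values with gene-wise variances
sigma2 <- 0.09 * 4 / rchisq(n_genes, df = 4)
E <- matrix(rnorm(n_genes * 6, mean = 8, sd = sqrt(sigma2)), nrow = n_genes)
rownames(E) <- paste0("gene", seq_len(n_genes))

E[1, ] <- c(8, NA, NA, NA, NA, NA)   # gene1: one value, control only
E[2, ] <- c(8, NA, NA, 10, NA, NA)   # gene2: one value in each group

# lmFit() may warn about partial NA coefficients for gene1
fit <- eBayes(lmFit(E, design))

fit$coefficients[1:2, "groupTreated"]  # gene1: NA (not estimable); gene2: 10 - 8
fit$p.value[1:2, "groupTreated"]       # gene1: NA; gene2: based on the prior variance
```

With a single value there is nothing to compare against, so gene1 gets no fold change; gene2 has a fold change but zero residual degrees of freedom, so its p-value rests entirely on the consensus variance.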

modified 2.4 years ago • written 14.0 years ago by Gordon Smyth33k