Question: lmFit very slow if there are missing values
Frederik Ziebell (EMBL Heidelberg) wrote, 3 months ago:

Having only a single missing value slows lmFit down by over an order of magnitude:

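library("limma")
library("tictoc")

# simulate 10^6 genes on 6 arrays with gene-specific standard deviations
n_genes <- 10^6

sd <- 0.3*sqrt(4/rchisq(n_genes,df=4))
y <- matrix(rnorm(n_genes*6,sd=sd),n_genes,6)
# make the first two genes differentially expressed in arrays 4 to 6
y[1:2,4:6] <- y[1:2,4:6] + 2
design <- cbind(Grp1=1,Grp2vs1=c(0,0,0,1,1,1))

# same data, but with a single missing value
y_NA <- y
y_NA[1,1] <- NA

# time the fit on the complete data
tic()
fit <- lmFit(y,design)
toc()

# time the fit on the data containing one NA
tic()
fit <- lmFit(y_NA,design)
toc()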

While the first fit takes about 1.1 seconds, the second needs over a minute. Is this a bug?

Tags: limma
Answer: lmFit very slow if there are missing values
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 3 months ago:

The timings you give show that lmFit is actually very fast, especially so when there are no NAs. You are the first person ever to view that as a "bug".

lmFit does an initial scan for NAs or weights and, if they are absent, then it runs a special super-fast algorithm that only works when there are no NAs.

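
For readers wondering how such a shortcut can work in principle, here is a minimal base-R sketch. It is not limma's source code; the toy matrix and the names `qr_design`, `coef_fast` and `coef_loop` are made up for illustration. The idea it shows: when nothing is missing, every gene is regressed on exactly the same design matrix, so the design can be factored once and the factorization reused for all genes in a single call.

```r
# Hypothetical toy data: 5 genes (rows) by 6 samples (columns).
set.seed(1)
y <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6)
design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

# Initial scan: the shared factorization is only valid if nothing is missing.
stopifnot(!anyNA(y))

# Factor the design matrix once ...
qr_design <- qr(design)

# ... and solve for all genes at once. qr.coef() accepts a matrix
# right-hand side, here with one column per gene.
coef_fast <- t(qr.coef(qr_design, t(y)))  # genes x coefficients

# Reference: fit each gene separately with lm.fit().
coef_loop <- t(apply(y, 1, function(row) lm.fit(design, row)$coefficients))

# Both routes should give the same estimates.
all.equal(unname(coef_fast), unname(coef_loop))
```

The single `qr.coef()` call replaces one least-squares fit per gene, which is where the speed-up in this sketch comes from.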

Thank you for the clarification. The actual dataset I have contains many conditions, so lmFit takes about half an hour, which is why I initially viewed it as a bug. Just out of curiosity, what's the super-fast algorithm that only works if there are no NAs?

Reply written 3 months ago by Frederik Ziebell

If there are no weights or NAs then the same QR decomposition can be applied to all genes.

Even with NAs, lmFit should still be about 20 times faster than looping through the rows with lm() and summary().

Reply written 3 months ago by Gordon Smyth
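
To make the contrast with the no-NA case concrete, here is a small base-R sketch of gene-by-gene fitting when values are missing. It is an illustration under stated assumptions, not limma's implementation: the toy data and the helper `fit_gene` are hypothetical. Because each gene may have a different set of observed samples, the rows of the design matrix that apply differ from gene to gene, so each gene needs its own least-squares fit.

```r
# Hypothetical toy data: 5 genes by 6 samples, with one missing value.
set.seed(1)
y_NA <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6)
y_NA[1, 1] <- NA
design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

# Hypothetical helper: fit one gene using only its observed samples.
fit_gene <- function(row) {
  obs <- !is.na(row)
  lm.fit(design[obs, , drop = FALSE], row[obs])$coefficients
}

# One separate fit per gene.
coef_by_gene <- t(apply(y_NA, 1, fit_gene))  # genes x coefficients

# Cross-check gene 1 against lm(), which drops the NA by default.
coef_lm <- coef(lm(y_NA[1, ] ~ 0 + design))
all.equal(unname(coef_by_gene[1, ]), unname(coef_lm))
```

Calling `lm.fit()` directly, as here, skips the formula parsing and model-frame construction that `lm()` and `summary()` perform on every call, which is one plausible reason a row-wise `lm()` loop is slower still.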