Handling NA values in limma's voom design matrix
1
0
Entering edit mode
dr ▴ 10
@dr-9473
Last seen 5 weeks ago
United States

Hi,

I'm trying to use limma's voom for RNA-seq data, where my design matrix contains NA values. Can voom handle these NAs or is the only solution to toss out the samples with these NAs?

Here's my example data:

set.seed(1)
counts.mat <- matrix(as.integer(runif(1000*10, 10,1000)),nrow = 1000, ncol = 10, dimnames = list(paste0("g",1:1000),paste0("s",1:10)))
design.df <- data.frame(id = paste0("s",1:10),
sex = sample(c("female","male"), 10, replace = T),
age = sample(c(5, 7, 10), 10, replace = T))
design.df$sex <- NA design.df$age <- NA

design.mat <- model.matrix(~ id + age + sex, model.frame(~ ., design.df, na.action = na.pass))
dge <- edgeR::DGEList(counts = counts.mat)
dge <- edgeR::calcNormFactors(dge)
voom.obj <- limma::voom(dge, design.mat, plot=TRUE)


The last last gives this error message:

Error in qr.default(x) : NA/NaN/Inf in foreign function call (arg 1)


Thanks!

limma voom design NAs • 140 views
0
Entering edit mode
2
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

No, NAs are not allowed in the design matrix. If we did permit that, then limma would have no choice but to remove any sample with any NA values, which could give unpredictable and unstable behavior. I prefer the analyst to resolve issues with NA values in the predictors before the analysis.

is the only solution to toss out the samples with these NAs?

Other solutions would be to remove sex and age from the linear model, or to impute the missing values, or to code NA-gender as a third level of the sex factor. In the case of sex, it is quite easy to determine the correct gender from the RNA-seq data itself (https://www.biostars.org/p/9520822/). These are data management considerations independent of limma.