Entering edit mode
Maciek Sykulski
▴
20
@maciek-sykulski-6549
Last seen 9.6 years ago
Dear Gordon, dear maintainers of limma package,
While writing some code based on the R package limma, Bioconductor
version:
Release 2.13,
I?ve noticed some strange code in topTable, and in subsetting fit[,i]
of
"MArrayLM" object (I?m not sure if this results in any bug).
Assume a following execution:
> fit <- lmFit(data,design,...)
> topTable(eBayes(fit,robust=TRUE), coef=2:3,sort.by="F")
specifically, in the function topTable, the line fit <- eBayes(fit[,
coef])
raises my concerns (see comment below in the code).
topTable <-function(...) {
[...]
if (length(coef) > 1) {
coef <- unique(coef)
if (length(fit$coef[1, coef]) < ncol(fit))
fit <- eBayes(fit[, coef])
# ^ Above reevaluates eBayes without robust=TRUE
# even if we want a robust estimation
[...]
}
Second confusion is about subsetting of fit object fit[, coef].
fit[i,] subsets fit on genes (rows). Then, what kind of subsetting
does
fit[,coef] perform? Is it by samples, or by coefficients?
After inspection I see that fit[,coef] subsets the fit$design matrix
on
samples, not on coefficients (while it does subset on columns from
fit$coefficients).
Inspection of the code responsible for subsetting "MArrayLM" reveals:
subsetListOfArrays <- function(object,i,j,IJ,IX,I,JX)
{
[...]
for(a in IJ) object[[a]] <- object[[a]][i,j,drop=FALSE]
for(a in IX) object[[a]] <- object[[a]][i, ,drop=FALSE]
for(a in I ) object[[a]] <- object[[a]][i]
for(a in JX) object[[a]] <- object[[a]][j, ,drop=FALSE]
#Shouldn`t this line above look more like this?:
#for(a in JX) object[[a]] <- object[[a]][, j,drop=FALSE]
[...]
}
Here is an example of a design matrix (rows correspond to samples,
columns
to fitted coefficients):
> class(fit)
[1] "MArrayLM"
attr(,"package")
[1] "limma"
> fit$design
cluster.1 cluster.2 cluster.3 cluster.4
1 1 1 0 0
2 1 1 0 1
3 1 1 0 1
[...]
42 1 1 0 0
43 1 1 0 0
44 1 0 0 0
[...]
Now, what I?d expect from an operation which subsets coefficients from
a
MArrayLM model
(as in topTable line fit <- eBayes(fit[, coef])) is this
cluster.2 cluster.3
1 1 0
2 1 0
3 1 0
[...]
moreover fit[,2:3]$coefficients is modified exactly in this way.
However, the resulting fit[,2:3]$design is a subset on samples:
> fit[,2:3]$design
cluster.1 cluster.2 cluster.3 cluster.4
2 1 1 0 1
3 1 1 0 1
Is this really what fit <- eBayes(fit[, coef]) inside topTable intends
to
achieve? This may be relevant, since subsetting fit this way
propagates
further inside eBayes function, where it causes classifyTestsF not to
run
(in case of my data), because of fit[,coef]$design being singular.
eBayes <- function (fit, proportion = 0.01, stdev.coef.lim = c(0.1,
4),
trend = FALSE, robust = FALSE, winsor.tail.p = c(0.05, 0.1))
{
[...]
#!!! Below condition always evaluates to false
# because fit$design was malformed before and
# is.fullrank always returns false
if (!is.null(fit$design) && is.fullrank(fit$design)) {
F.stat <- classifyTestsF(fit, fstat.only = TRUE)
fit$F <- as.vector(F.stat)
[...]
}
I?m not pointing to any specific misbehavior/bug, because it seems
that
topTable(eBayes(fit,robust=TRUE), coef=2:3,sort.by="F",number=N)
still somehow produces the same results as my modified:
topTable.fixed(eBayes.fixed(fit,robust=TRUE),coef=2:3,sort.by="F",robu
st=TRUE)
and different from non-robust:
topTable(eBayes(fit,robust=FALSE), coef=2:3,sort.by="F")
So the final question is:
Does fit[,coef] modifies the fit$design matrix properly, in alignment
with
later check is.fullrank(fit$design)?
Attached is a simple example R session with these issues.
Thanks,
Maciek Sykulski
-------------- next part --------------
A non-text attachment was scrubbed...
Name: limma_debug_session.pdf
Type: application/pdf
Size: 75559 bytes
Desc: not available
URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20140512="" 2762dc29="" attachment.pdf="">