Wolfgang Huber
Remark:
This is the reply to a question that Martin had sent to the Bioconductor
mailing list with CC to me. Because of a large attachment (example data),
that message was not distributed on the list. Below is first the question,
then my answer. - Wolfgang

Martin Kerick wrote:
> Dear Wolfgang,
>
> I have a question concerning your vsn package. I encounter the following
> error when applying vsn on one array:
>
> vsn: 27648 x 2 matrix (1 stratum). Please wait for 10 dots: Error: L-BFGS-B
> needs finite values of fn
>
> I have read the Bioconductor archives and found the following statements:
>
>> does your data matrix contain Inf (infinity) or an excessive number of 0s
>> (e.g. through "flooring" the negative values?).
>> If there are infinities in the data, this will probably also lead to an
>> infinite likelihood, which could explain your error message.
>> If there are other singularities (e.g. if a whole column of the data matrix
>> has the same value), this may also lead to infinite values in the likelihood
>> calculations.
>
> It seems to me, that none of the above applies to my data. I had some values
> occurring multiple (2-6) times in one column and corrected for that, but the
> error remained. Since I don't use any background correction I assume that
> negative or "floored" values are probably not the problem.
> I am using vsn 1.4.11 and arrayMagic 1.3.7 on R 1.9.0
> I attached the data file leading to the crash.
> Any help would be greatly appreciated,
> Kind regards,
> Martin

Hi Martin,

this is indeed one of the rare cases where the iterative algorithm in vsn
does not converge with the default start parameters. With different start
parameters (see the code example below) it does converge, apparently to a
reasonable result.

In your data, the F635 and F532 intensities from the two color channels are
quite unbalanced (both with respect to background and slope). My impression
is that you would do much better if you used background subtraction (see
code example). In that case vsn does indeed work with default settings.
If you're worried about too much variability in the "background"
intensities, some spatial smoothing might help.

Running vsn involves the maximization of a likelihood function that is not
parabolic, but usually concave. In rare cases, the numerical optimizer goes
astray and hits non-finite likelihood values (hence the L-BFGS-B error
above) before it finds the optimum. In these cases, choosing a different
start value may help. For the example data that you provided, one might
argue that (without background subtraction) there is also a quality problem.

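As a toy illustration of why the start value matters: base R's `optim` with `method = "L-BFGS-B"` stops with this kind of error as soon as the objective returns a non-finite value. The sketch below is not vsn's likelihood; the function `negloglik` and both start values are made up for this example. It uses the simplest way to provoke the error, a start value outside the region where the objective is finite, and then shows that a start value inside that region converges.

```r
## Toy objective (hypothetical, NOT the vsn likelihood):
## finite only for p > 0, with its minimum at p = exp(1).
negloglik <- function(p) {
  if (p <= 0) return(Inf)
  (log(p) - 1)^2
}

## Start value in the non-finite region: L-BFGS-B refuses to continue.
## tryCatch turns the error into its message so the script keeps running.
bad <- tryCatch(
  optim(par = -1, fn = negloglik, method = "L-BFGS-B"),
  error = function(e) conditionMessage(e)
)
print(bad)

## Start value in the finite region: the optimizer converges.
good <- optim(par = 1, fn = negloglik, method = "L-BFGS-B")
print(good$par)          # close to exp(1)
print(good$convergence)  # 0 indicates successful convergence
```

In vsn the situation is less clear-cut, because the default start value is itself valid and the optimizer only later steps into a problematic region, but the remedy is the same: supply a different start value.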
Best regards
Wolfgang
-------------------------------------
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/abt0840/whuber
-------------------------------------
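library(vsn)

## read the example data (tab-delimited, with header line)
dat <- read.table("test1", header=TRUE, sep="\t")
print(dim(dat))

par(mfrow=c(2,2))

## M-A plot for a two-column matrix: difference versus mean of the columns
maplot <- function(x, ...) {
  stopifnot(is.matrix(x), ncol(x)==2)
  plot(rowMeans(x), x[,2]-x[,1], pch=".", xlab="A", ylab="M", ...)
  abline(h=0, col="red")
}

## foreground intensities only, no background subtraction
y <- as.matrix(dat[, c("F532", "F635")])
## Try this! (background-subtracted intensities)
## y <- as.matrix(dat[, c("F532", "F635")]-dat[, c("B532", "B635")])

plot(y,pch=".",xlim=c(7,20),ylim=c(7,20))
abline(a=0,b=1,col="red")
maplot(log(y))

## user-supplied start parameters (1 stratum x 2 columns x 2 parameters)
pstart <- array(c(0, 0, 1, 1), dim=c(1,2,2))
## ny <- vsn(y) ## will produce an error
ny <- vsn(y, pstart=pstart)

## black: vsnTrimSelection is TRUE for that spot, red: FALSE
cols <- c("red", "black")[1+preproc(description(ny))$vsnTrimSelection]
maplot(exprs(ny), col=cols)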
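
The data checks quoted in Martin's message (non-finite values, many zeros, constant columns) can be run on the intensity matrix before calling vsn. A minimal sketch in base R follows; `check_matrix` is a hypothetical helper, not part of the vsn package, and the small matrix is made-up data used only to show the output.

```r
## Hypothetical helper (not part of vsn): count the conditions that the
## quoted archive statements name as possible causes of an infinite likelihood.
check_matrix <- function(y) {
  stopifnot(is.matrix(y), is.numeric(y))
  list(
    n_nonfinite  = sum(!is.finite(y)),        # Inf, -Inf, NA, NaN
    n_zero       = sum(y == 0, na.rm = TRUE), # e.g. from flooring negatives
    constant_col = apply(y, 2, function(col) {
      length(unique(col[is.finite(col)])) <= 1
    })
  )
}

## Made-up example data: one Inf, one zero, and a constant second column.
y <- cbind(F532 = c(120, 0, 340, Inf),
           F635 = c(50, 50, 50, 50))
res <- check_matrix(y)
print(res)
```

On real data one would call `check_matrix` on the matrix passed to vsn. If all counts are zero and no column is constant, as Martin reports for his data, the start values are the next thing to look at.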