Could someone help me interpret (develop an action plan to correct ...)
the error message that follows?
Thanks.
> Monkey.sub<-Monkey.expr[!is.na(Monkey.matrix[,1]),1]
> Monkey.sub
Expression Set (exprSet) with
13838 genes
1 samples
phenoData object with 6 variables and 1 cases
varLabels
: Slide
: FileName
: Cy3
: Cy5
: date
: Comments
> Monkey.sub<-exprs(Monkey.expr[!is.na(Monkey.matrix[,1]),1])
> Monkey.vsn<-vsn(Monkey.sub)
vsn is working on a 13838 x 1 matrix, with lts.quantile=0.5; please wait for 11 dots:
.Error in optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B", control = control, :
L-BFGS-B needs finite values of fn
Hi Charles,
vsn is a normalization method that brings the different columns (colors,
arrays) of an expression matrix on the same scale. As input, it takes an
n*d matrix, with d>=2. You passed it a matrix with d=1. Apparently this
results in some of the likelihood calculations becoming singular, hence
the error message you received.
Action plan:
1. For you: read the paper on vsn, then call it with expression matrices
of size d>=2.
2. For me: fix vsn so that it throws an intelligible error message if
called with d<=1.
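Until such a check exists inside vsn, a guard along the following lines could be run before the call. This is only a sketch: check_min_columns is a hypothetical helper, not part of the vsn package, and the wording of the message is an assumption.

```r
# Hypothetical helper (not part of the vsn package): validate a matrix
# before handing it to a routine that needs at least two columns.
check_min_columns <- function(x, min_cols = 2) {
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("input must be a numeric matrix")
  }
  if (ncol(x) < min_cols) {
    stop(sprintf("input has %d column(s); at least %d are required",
                 ncol(x), min_cols))
  }
  invisible(TRUE)
}

one_col <- matrix(rnorm(10), ncol = 1)  # same shape problem as the 13838 x 1 case
two_col <- matrix(rnorm(20), ncol = 2)

ok_two <- check_min_columns(two_col)
msg_one <- tryCatch(check_min_columns(one_col),
                    error = function(e) conditionMessage(e))
print(msg_one)
```

The one-column matrix is rejected with a readable message instead of failing later inside optim.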
Best regards
Wolfgang
-------------------------------------
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/mga/whuber
-------------------------------------
On Thu, 17 Jul 2003, White, Charles E WRAIR-Wash DC wrote:
> [...]
Hi Wolfgang:
I am also getting the same error with a matrix that is 4992 x 376. Here
are the R commands:
> data <- read.table("file.msk", header=T, sep = "\t", row.names=1)
> data <- as.matrix(data)
> vsn.data <- vsn(data)
vsn is working on a 4992 x 376 matrix, with lts.quantile=0.5; please wait for 11 dots:
.
and then dies. Any suggestions?
After I do a traceback I get the following:
> traceback()
2: optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B", control = control,
       lower = plower)
1: vsn(data)
1: vsn(data)
Any help is greatly appreciated.
Isaac
w.huber@dkfz-heidelberg.de wrote:
>[...]
>_______________________________________________
>Bioconductor mailing list
>Bioconductor@stat.math.ethz.ch
>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
>
I spent the weekend getting to know this program better than I wanted
<grin>, but I probably still don't know it well enough. The fool's gold
of my wisdom is as follows:
1) I would seriously consider reducing the amount of data you feed this
program. It took 4.5 hours to process a 15,552 x 38 matrix on a 1.2 GHz
Pentium III. There is a reason why the function vsnh exists. Unless you
have some serious GHz, you probably want to run vsn on a random sample
of genes or on one array at a time.
2) Assuming that you are using data from a two channel microarray, I
strongly suspect that the red and green channels need to be side by side
in your matrix. I think the point is to quantify measurement variation
without contamination from any unnecessary source. I don't see any other
way that pair information is being passed to vsn.
3) I think that your problem and my old problem are likely to be quite
different. I fed the program data in a format it didn't understand and
you probably fed the program more data than it could process in a
reasonable amount of time. (Since the program doesn't use "much" memory,
you wouldn't have heard the hard drive running even if the program was
still running.)
4) I am pleased with the results I'm now getting from vsn. My initial
problems with this program were related to how I understand the
relationships between data elements and Bioconductor objects versus what
appears to be a somewhat different relationship in vsn.
-----Original Message-----
From: Isaac Neuhaus [mailto:isaac.neuhaus@bms.com]
Sent: Monday, July 21, 2003 1:37 PM
To: w.huber@dkfz-heidelberg.de
Cc: White, Charles E WRAIR-Wash DC; 'bioconductor@stat.math.ethz.ch'
Subject: Re: [BioC] vsn in BioConductor 1.2
[...]
Hi Charles,
> 1) I would seriously consider reducing the amount of data you feed this
> program. It took 4.5 hours to process a 15,552 x 38 matrix on a 1.2 GHz
> Pentium III. There is a reason why the function vsnh exists. Unless you
> have some serious GHz, you probably want to run vsn on a random sample
> of genes or on one array at a time.
The program is indeed quite slow. The run time is about
t = c * no. rows * no. columns
and according to your numbers c = about 27 ms on your machine (4.5 hours
divided by 15,552 x 38 matrix elements). There is a lot of number
crunching in vsn. With Dennis Kostka I have an experimental version that
is written in C, but even that is "only" faster by a factor of 2-3.
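As a rough worked example of the run-time formula, the constant can be recomputed from the figures quoted in this thread (4.5 hours for 15,552 x 38) and used to extrapolate to another matrix size. predict_hours is a hypothetical helper, and the extrapolation assumes the same machine and strictly linear scaling, which is only an approximation.

```r
# Figures quoted in the thread: 4.5 hours for a 15,552 x 38 matrix.
elapsed_sec <- 4.5 * 3600
n_rows <- 15552
n_cols <- 38

# Per-element constant c in t = c * rows * columns, in seconds.
per_element_sec <- elapsed_sec / (n_rows * n_cols)
per_element_ms <- per_element_sec * 1000

# Hypothetical helper: predicted run time in hours for another matrix,
# assuming the same machine and linear scaling in rows and columns.
predict_hours <- function(rows, cols, per_element = per_element_sec) {
  per_element * rows * cols / 3600
}

# Extrapolation to the 4992 x 376 matrix mentioned earlier in the thread.
isaac_hours <- predict_hours(4992, 376)

print(round(per_element_ms, 1))
print(round(isaac_hours, 1))
```

With these inputs the constant comes out at roughly 27 ms per matrix element, and the 4992 x 376 matrix at about 14 hours on the same hardware.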
A good strategy is indeed to run the program on a random sample of genes
(rows), and then use vsnh to apply the transformation to the whole data
matrix. See normalize.AffyBatch.vsn for an example. A random subset of a
few thousand spots should usually do. It will not be helpful to split up
the task by arrays (e.g. one array at a time) since the net run time
will be the same.
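The row-sampling step can be sketched in base R as below. The data are simulated stand-ins, the subset size of 2000 is an arbitrary choice for illustration, and the actual vsn and vsnh calls are deliberately left out because their arguments are not shown in this thread.

```r
set.seed(1)  # reproducible sampling for the illustration

# Stand-in data: 5000 spots x 6 columns of positive intensities.
intensities <- matrix(rexp(5000 * 6, rate = 1 / 500), nrow = 5000)

# Draw a random subset of rows (spots) without replacement.
n_sub <- 2000
sub_rows <- sample(nrow(intensities), n_sub)
intensities_sub <- intensities[sub_rows, , drop = FALSE]

# Following the strategy described above, the fit would be done on
# intensities_sub and the resulting transformation then applied to the
# full matrix with vsnh. Those calls are omitted here on purpose.
print(dim(intensities_sub))
```

All columns are kept; only the rows are thinned, which is what reduces the run time.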
> 2) Assuming that you are using data from a two channel microarray, I
> strongly suspect that the red and green channels need to be side by
> side in your matrix. I think the point is to quantify measurement
> variation without contamination from any unnecessary source. I don't
> see any other way that pair information is being passed to vsn.
If you pass a data matrix with 2*k columns from k red/green slides, vsn
does not care about the ordering of the columns - so it does not make a
difference whether the columns are ordered R1, G1, R2, G2, ..., Rk, Gk
or R1, ..., Rk, G1, ..., Gk. If someone is not comfortable with this,
they can also call vsn in turn for each array separately. Empirically,
I've found that this makes hardly a difference. (The parameter
estimation is not affected by the different correlations within and
between arrays.)
However, there should not be pronounced batch effects (e.g. arrays 1..50
looking technically very different from arrays 51..100).
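For readers who want to see the two column layouts side by side, here is a small base R illustration. The column names R1, G1, ... and the toy matrix are assumptions made for the example; nothing here calls vsn.

```r
k <- 3  # number of two-colour slides in this toy example

# Interleaved layout: R1, G1, R2, G2, ..., Rk, Gk
interleaved <- as.vector(rbind(paste0("R", 1:k), paste0("G", 1:k)))

# Blocked layout: R1, ..., Rk, G1, ..., Gk
blocked <- c(paste0("R", 1:k), paste0("G", 1:k))

# Toy matrix with named columns in the interleaved layout.
x <- matrix(seq_len(4 * 2 * k), nrow = 4,
            dimnames = list(NULL, interleaved))

# Reordering by column name converts one layout into the other
# without changing any of the values.
x_blocked <- x[, blocked]
print(colnames(x_blocked))
```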
> 3) I think that your problem and my old problem are likely to be quite
> different. I fed the program data in a format it didn't understand and
> you probably fed the program more data than it could process in a
> reasonable amount of time. (Since the program doesn't use "much"
> memory, you wouldn't have heard the hard drive running even if the
> program was still running.)
Yes. The error message about infinite likelihood has nothing to do with
the program's long, but finite, CPU time consumption.
> 4) I am pleased with the results I'm now getting from vsn. ...
That's always nice to hear :)
Best regards
Wolfgang
Hi Isaac,
does your data matrix contain Inf (infinity) or an excessive number of
0s (e.g. through "flooring" the negative values)? If there are
infinities in the data, this will probably also lead to an infinite
likelihood, which could explain your error message.
If there are other singularities (e.g. if a whole column of the data
matrix has the same value), this may also lead to infinite values in the
likelihood calculations.
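The three conditions mentioned above (non-finite entries, many 0s, constant columns) can be checked with a few lines of base R before calling vsn. diagnose_matrix is a hypothetical helper written for this illustration, and the toy matrix is made up.

```r
# Hypothetical helper (not part of vsn): summarise features of a numeric
# matrix that could make a likelihood calculation non-finite.
diagnose_matrix <- function(x) {
  list(
    # Count of Inf, -Inf, NA and NaN entries.
    n_nonfinite = sum(!is.finite(x)),
    # Share of exact zeros in each column.
    zero_fraction = colMeans(x == 0, na.rm = TRUE),
    # Columns whose finite values are all identical.
    constant_cols = which(apply(x, 2, function(col) {
      length(unique(col[is.finite(col)])) <= 1
    }))
  )
}

x <- cbind(a = c(1, 2, 3, 4),
           b = c(0, 0, 0, 5),     # mostly zeros
           c = c(7, 7, 7, 7),     # constant column
           d = c(1, Inf, 2, 3))   # contains an infinity
report <- diagnose_matrix(x)
str(report)
```

Any non-zero n_nonfinite, a high zero_fraction, or a non-empty constant_cols would be worth resolving before the normalization is attempted.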
If these suggestions do not lead to the solution of your problem, you
could send me your data matrix (anonymized) and I could try to figure
out where things go wrong. The calculations in vsn are not that
complicated. This may be useful in making it more robust or at least in
making it produce more intelligible error messages.
Note that I'll be away from my email from now till Thursday.
Best regards
Wolfgang
On Mon, 21 Jul 2003, Isaac Neuhaus wrote:
> Wolfgang:
>
> I chopped some of the output; here is everything.
>
> Isaac
>
> > data <- read.table("file.msk", header=T, sep = "\t", row.names=1)
> > data <- as.matrix(data)
> > vsn.data <- vsn(data)
> vsn is working on a 4992 x 376 matrix, with lts.quantile=0.5; please wait for 11 dots:
> .Error in optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B", control = control, :
> non-finite value supplied by optim
>
w.huber@dkfz-heidelberg.de wrote:
>Hi Isaac,
>
>does your data matrix contain Inf (infinity) or an excessive number of
>0s (e.g. through "flooring" the negative values?). If there are
>infinities in the data, this will probably also lead to an infinite
>likelihood, which could explain your error message.
>
Yes, in some cases it contains up to 75% 0s. I will exclude these
samples and try to run vsn again.
Thanks for your help.
Isaac
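Excluding samples with many 0s could look roughly like this in base R. The 0.5 cut-off and the toy data are assumptions for illustration only; the thread does not say which threshold is appropriate.

```r
# Hypothetical threshold: drop columns (samples) in which more than half
# of the values are 0. The cut-off is illustrative, not a recommendation.
max_zero_fraction <- 0.5

x <- cbind(s1 = c(10, 20, 30, 40),
           s2 = c(0, 0, 0, 15),    # 75% zeros
           s3 = c(5, 0, 25, 35))   # 25% zeros

# Share of exact zeros in each column.
zero_fraction <- colMeans(x == 0)

# Keep only the columns at or below the threshold; drop = FALSE keeps
# the result a matrix even if a single column survives.
x_kept <- x[, zero_fraction <= max_zero_fraction, drop = FALSE]
print(colnames(x_kept))
```

If only one column were to survive the filter, the result would again be a one-column matrix, which is the case that caused the first error in this thread.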