Question: when do linear models work?

Arne.Muller@aventis.com • **620** wrote:

Hello All,
I have two fundamental problems with linear models (lm); maybe you can help me to clarify these issues:
1. Irrespective of how many factors you use in your experiment, the relationship is always assumed to be linear. If you have a response vector Y and a vector X of independent variables, then Y ~ X basically assumes a straight line (with some kind of slope). If you fit, say, Y ~ X + Z, then one can think of the lm as a *flat* surface. The same is true for higher dimensions (Y ~ dose + time + batch + gender + ... ).
This assumption is really dangerous, I think, since many treatment/response relationships are not linear. For example, think about an experiment in which cell cultures are treated with 5 doses of a drug: 0.0mM, 0.10mM, 0.25mM, 0.5mM and 1.0mM. The 0.1mM dose causes hardly any change in gene expression, whereas there is a big difference in gene expression at 0.25mM. Then at 0.5mM and 1.0mM the response is not much stronger than at 0.25mM. If one just looks at a single gene, expression of this gene goes up quite strongly from 0.1mM to 0.25mM and then flattens out for the higher doses: the response reaches saturation. Other responses are more like a logistic curve. This is a typical scenario.
The problem is that many genes within one experiment behave as described above, while others change linearly, others exponentially ...
Could I still use lm for this kind of experiment? Would I have to decide on a gene-by-gene basis?
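
To make the first question concrete, here is a minimal sketch in Python (standard library only). The expression values are invented to mimic the saturating pattern described above; they are not measured data, and the helper names `fit_line`, `fit_factor` and `rss` are hypothetical. It contrasts dose entered as a number (one straight line) with dose entered as a factor (one mean per dose level). The second model is still a linear model, because it is linear in its coefficients, but it does not force the response onto a straight line in dose.

```python
from statistics import mean

# Hypothetical expression values (two replicates per dose), invented purely
# for illustration: flat up to 0.10 mM, a jump at 0.25 mM, then saturation.
doses = [0.0, 0.0, 0.10, 0.10, 0.25, 0.25, 0.5, 0.5, 1.0, 1.0]
expr = [1.0, 1.2, 1.1, 1.3, 3.0, 3.2, 3.3, 3.5, 3.4, 3.6]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_factor(levels, y):
    """Least squares with the predictor as a factor: one mean per level."""
    groups = {}
    for level, yi in zip(levels, y):
        groups.setdefault(level, []).append(yi)
    return {level: mean(values) for level, values in groups.items()}


def rss(y, fitted):
    """Residual sum of squares between observed and fitted values."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Dose as a number: a single straight line through all doses.
intercept, slope = fit_line(doses, expr)
rss_line = rss(expr, [intercept + slope * d for d in doses])

# Dose as a factor: each dose level gets its own fitted mean.
dose_means = fit_factor(doses, expr)
rss_factor = rss(expr, [dose_means[d] for d in doses])

print(f"dose as a number: RSS = {rss_line:.3f}")
print(f"dose as a factor: RSS = {rss_factor:.3f}")
```

On these made-up numbers the factor coding leaves a much smaller residual sum of squares than the straight line, which is the mismatch the question is worried about. Whether a factor coding, a transformation of dose, or a nonlinear model suits real data would still need checking gene by gene.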
2. Some of the factors in an experiment, such as treatment (T), can only take, say, 2 distinct values: treated (t) and untreated (ut). Does a model such as Y ~ T make any sense in this case?
Doesn't this assume a linear relationship between just 2 "clouds" of data (assume there are many samples for each factor level)? Even if one can clearly distinguish between t and ut, assuming a straight line may be wrong. This is like drawing a straight line between two points. Just like in my example above with the different doses, you may have already reached some kind of saturation. Using such a model for prediction would then give wrong results.
However, if one just wants to distinguish between t and ut, would the lm be a valid method?
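
For the two-level case, a small sketch may help to show what the straight line amounts to. It assumes the common 0/1 coding (ut = 0, t = 1) and uses invented response values; `fit_line` is a hypothetical helper, not a library function. With that coding, the least-squares intercept is the mean of the untreated group and the slope is the difference between the two group means, so the line is only ever evaluated at the two points where data exist.

```python
from statistics import mean

# Hypothetical responses, invented for illustration only.
untreated = [2.0, 2.2, 1.8, 2.1]
treated = [3.1, 2.9, 3.3, 3.0]

# Assumed 0/1 coding of the treatment factor: ut = 0, t = 1.
x = [0] * len(untreated) + [1] * len(treated)
y = untreated + treated


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(x, y)
mean_ut, mean_t = mean(untreated), mean(treated)

print(f"intercept = {intercept:.3f}, mean(ut) = {mean_ut:.3f}")
print(f"slope = {slope:.3f}, mean(t) - mean(ut) = {mean_t - mean_ut:.3f}")
```

Under this coding the fit says nothing about values of T between or beyond the two levels, so the saturation concern applies to extrapolating along a dose axis rather than to comparing t with ut.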
I'm reading some "beginners" literature about lm's, and I'm just trying to understand what's going on.
Maybe you could comment on this. I'd be very interested in any explanation or clarification.
kind regards,
Arne
--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com

modified 15.6 years ago • written 15.6 years ago by Arne.Muller@aventis.com • **620**