YUK FAI LEUNG
Hi there,
I am going to do an affy experiment for the first time. I have a few
questions about the linear model design for my experiment.
I have a 3x2 factorial experiment: three biological samples (wild-type
whole animal (WA), wild-type tissue (WT), mutant tissue (MT)) and two
time points (t1 and t2). The effects of interest are the mutant- (M) and
tissue- (T) specific expression and their changes over time (Ti).
I suppose the model should have the following equations (error term
omitted), and I would have 3 biological replicates (one affy array
each) for each condition.
WA.t1 = mu
WT.t1 = mu + T
MT.t1 = mu + T + M + T*M
WA.t2 = mu + Ti
WT.t2 = mu + T + Ti + T*Ti
MT.t2 = mu + T + M + Ti + T*M + M*Ti + T*Ti + T*M*Ti
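To make the coefficient counting concrete, the equations above can be written out as a design matrix. The following is only a minimal sketch in Python with NumPy (not limma code); the variable names are made up for illustration, and it assumes 3 replicate arrays per condition as described above. It reports the rank of the design matrix instead of assuming that every listed coefficient can be estimated.

```python
import numpy as np

# Column order of the coefficients, taken from the equations above.
columns = ["mu", "T", "M", "Ti", "T*M", "M*Ti", "T*Ti", "T*M*Ti"]

# One row per condition: 1 if the coefficient appears in that equation.
conditions = {
    "WA.t1": [1, 0, 0, 0, 0, 0, 0, 0],
    "WT.t1": [1, 1, 0, 0, 0, 0, 0, 0],
    "MT.t1": [1, 1, 1, 0, 1, 0, 0, 0],
    "WA.t2": [1, 0, 0, 1, 0, 0, 0, 0],
    "WT.t2": [1, 1, 0, 1, 0, 0, 1, 0],
    "MT.t2": [1, 1, 1, 1, 1, 1, 1, 1],
}

n_reps = 3  # assumed number of replicate arrays per condition

# Stack each condition row n_reps times to get one row per array.
X = np.repeat(np.array(list(conditions.values()), dtype=float), n_reps, axis=0)

n_arrays, n_columns = X.shape
rank = np.linalg.matrix_rank(X)   # number of independently estimable coefficients
residual_df = n_arrays - rank     # residual degrees of freedom

print("arrays:", n_arrays)
print("columns in the model:", n_columns)
print("rank of design matrix:", rank)
print("residual degrees of freedom:", residual_df)
```

A rank below the number of columns means that some coefficients are aliased (their columns are linear combinations of other columns) and cannot be estimated separately. In that case the residual degrees of freedom are the number of arrays minus the rank, not minus the column count.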
My questions are:
1. How many degrees of freedom do I have in the model? How do I
calculate the degrees of freedom of a linear model in general? For my
case, is it 3 arrays * 6 conditions - 8 coefficients to be estimated =
10 degrees of freedom?
2. If I want to increase my degrees of freedom, is it true that I can
do it by increasing the number of replicates? If so, is there a
difference between repeating a sample with more coefficients (e.g.
MT.t2) and a sample with fewer coefficients (e.g. WA.t1)? It seems to
me that repeating a sample with more coefficients is better, but I
don't know how to state this statistically.
3. What is the formal way to determine whether an interaction term is
meaningful/significant in the model? Is it by the p-value? Should I
remove the term and refit the model if it is not significant and is
deemed unimportant on biological grounds? Or should I just fit the full
model once and go ahead to interpret the contrasts of interest? Is
there a formal way (e.g. the diagnostics people use to assess ANOVA
models) of evaluating the quality of the whole fitted model? Or need I
not worry about this at all?
5. I am confused about the multiple hypothesis testing adjustment when
there are many contrasts. (I know it is better to use the
p-values/B/moderated t only for ranking genes, but I am curious to
know.) For example, in limma one would extract the contrast of interest
and list the candidate genes with topTable using the FDR adjustment
option, but isn't it true that this adjusts only for that one contrast?
When I evaluate all possible contrasts, how can I adjust for multiple
hypothesis testing across the genes in all the contrasts that I have
made?
6. A minor question. What do M and A in the topTable output for a
coefficient/contrast mean for affy data? If A stands for the log2
intensity estimate for that coefficient/contrast, is M the log2 ratio
of (mu + (coefficient or contrast estimate))/mu?
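To illustrate what question 5 is asking, here is a minimal sketch in Python with NumPy (a generic Benjamini-Hochberg adjustment, not limma's own code). The function name bh_adjust and the p-values are made up for illustration. It contrasts adjusting each contrast separately with adjusting all genes and contrasts pooled together.

```python
import numpy as np

def bh_adjust(p):
    """Benjamini-Hochberg adjusted p-values for a 1-D array of p-values."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)                        # indices that sort p ascending
    scaled = p[order] * n / np.arange(1, n + 1)  # p_(i) * n / i
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)  # back to original order
    return adjusted

# Hypothetical p-values: rows are genes, columns are contrasts.
pvals = np.array([
    [0.001, 0.200],
    [0.010, 0.030],
    [0.040, 0.500],
    [0.300, 0.004],
])

# Adjustment within each contrast separately (one column at a time).
per_contrast = np.column_stack(
    [bh_adjust(pvals[:, j]) for j in range(pvals.shape[1])]
)

# Adjustment over all genes and all contrasts pooled into one family.
pooled = bh_adjust(pvals.ravel()).reshape(pvals.shape)

print("per contrast:\n", per_contrast)
print("pooled:\n", pooled)
```

The two results differ because the number of tests in the family differs: 4 per column in the first case, 8 in the second.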
Thanks a lot for answering my questions. Any other advice on my design
is also welcome.
Best regards,
Fai
--
Yuk Fai Leung
Department of Molecular and Cellular Biology
Harvard University
BL 2079, 16 Divinity Avenue
Cambridge, MA 02138
Tel: 617-495-2599
Fax: 617-496-3321
email: yfleung@mcb.harvard.edu; yfleung@genomicshome.com
URL: http://genomicshome.com