Michael Rooney
Hi,
I am trying to set up the correct formula for an analysis using the package limma. I have read the user guide, but I am still not sure if I am approaching this the right way. Thanks in advance for your help.
I have microarray data for six patients. Four responded to treatment and two did not. For each patient I have a sample from before treatment and a sample from after treatment. Thus, twelve arrays total.
I have two objectives:
1) Identify genes differentially expressed between responders and nonresponders before treatment
2) Identify genes that change (from before treatment to after treatment) differentially between responders and nonresponders
For objective one, my initial strategy was to:
1. ignore the post-treatment data
2. use the formula "~ isresponder"
3. and run toptable for isresponder
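As a minimal sketch of this first strategy, the code below builds a hypothetical sample table for the twelve arrays, keeps only the pre-treatment samples, and fits "~ isresponder". The column names (`patient_id`, `isresponder`, `ispost`), the patient labels, and the simulated expression matrix are assumptions made for illustration, not the real data. The limma calls run only if the package is installed.

```r
# Hypothetical sample table: 6 patients x 2 time points = 12 arrays.
# Patients P1-P4 are treated as responders, P5-P6 as nonresponders (assumed).
targets <- data.frame(
  patient_id  = factor(rep(paste0("P", 1:6), each = 2)),
  isresponder = rep(c(1, 0), times = c(8, 4)),
  ispost      = rep(c(0, 1), times = 6)
)

# Strategy 1: drop the post-treatment arrays and model response status only.
pre        <- targets[targets$ispost == 0, ]
design_pre <- model.matrix(~ isresponder, data = pre)

# Residual degrees of freedom left for estimating per-gene variance.
resid_df <- nrow(design_pre) - qr(design_pre)$rank

# Simulated expression values (100 genes x 12 arrays), for illustration only.
set.seed(1)
expr     <- matrix(rnorm(100 * 12), nrow = 100)
expr_pre <- expr[, targets$ispost == 0]

if (requireNamespace("limma", quietly = TRUE)) {
  fit     <- limma::eBayes(limma::lmFit(expr_pre, design_pre))
  top_pre <- limma::topTable(fit, coef = "isresponder", number = 5)
}
```

With six arrays and two coefficients, this design leaves four residual degrees of freedom per gene, which is the variance information the next paragraph worries about.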
But then I wondered whether I was throwing information away about variance by ignoring the post-treatment data. Therefore I tried a new strategy:
1. keep all data
2. create a factor variable identifying my four types of patients
3. use the formula "~ 0 + responderpre + responderpost + nonresponderpre + nonresponderpost"
4. and run toptable for the contrast responderpost-responderpre
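The four-group strategy above can be sketched as follows, again with a hypothetical sample table and simulated expression values. The factor name `group` and the contrast names are assumptions. The sketch sets up two contrasts: the one written in step 4, which compares post with pre within responders, and `responderpre - nonresponderpre`, which appears to be the comparison that objective one describes. Neither is confirmed as the right choice here.

```r
# Hypothetical sample table: P1-P4 responders, P5-P6 nonresponders (assumed).
targets <- data.frame(
  patient  = factor(rep(paste0("P", 1:6), each = 2)),
  response = rep(c("responder", "nonresponder"), times = c(8, 4)),
  time     = rep(c("pre", "post"), times = 6)
)

# One factor with four levels, one per response/time combination.
targets$group <- factor(
  paste0(targets$response, targets$time),
  levels = c("responderpre", "responderpost",
             "nonresponderpre", "nonresponderpost")
)

# Group-means parameterisation: one column per group, no intercept.
design <- model.matrix(~ 0 + group, data = targets)
colnames(design) <- levels(targets$group)

# Simulated expression values (100 genes x 12 arrays), for illustration only.
set.seed(1)
expr <- matrix(rnorm(100 * 12), nrow = 100)

if (requireNamespace("limma", quietly = TRUE)) {
  fit  <- limma::lmFit(expr, design)
  cont <- limma::makeContrasts(
    respChange = responderpost - responderpre,     # contrast from step 4
    preDiff    = responderpre - nonresponderpre,   # baseline comparison
    levels = design
  )
  fit2     <- limma::eBayes(limma::contrasts.fit(fit, cont))
  top_diff <- limma::topTable(fit2, coef = "preDiff", number = 5)
}
```

This design uses all twelve arrays for the variance estimate but treats them as independent, so it does not account for the pairing of samples within a patient.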
But then I wondered whether it was a problem that I was not accounting for the fact that there were pairs of samples in the analysis. I tried a more complicated modeling strategy, but on running eBayes I got the error message "No residual degrees of freedom in linear model fits." Does anyone have suggestions?
For objective two, I am using the approach:
1. use all data
2. use the formula "~ isresponder*ispost"
3. run toptable on isresponder:ispost
Again, this ignores the fact that observations are paired. I tried augmenting the formula to "~isresponder*ispost + patient_id", but this returns "Coefficients not estimable." What is the appropriate modeling strategy here?
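The "Coefficients not estimable" message can be reproduced with a small base-R check, sketched here on a hypothetical sample table (the patient labels and the 0/1 coding are assumptions). Each patient is either a responder or a nonresponder, so the `isresponder` column is a linear combination of the intercept and the `patient_id` columns, and the design matrix has more columns than its rank. The second design is one possible reparameterisation that drops the redundant main effect. It is shown only as an assumption to illustrate the rank issue, not as a confirmed recommendation.

```r
# Hypothetical sample table: P1-P4 responders, P5-P6 nonresponders (assumed).
targets <- data.frame(
  patient_id  = factor(rep(paste0("P", 1:6), each = 2)),
  isresponder = rep(c(1, 0), times = c(8, 4)),
  ispost      = rep(c(0, 1), times = 6)
)

# The formula from the post: response status is constant within patient,
# so its main effect is confounded with the patient effects.
design_full <- model.matrix(~ isresponder * ispost + patient_id,
                            data = targets)
n_cols_full <- ncol(design_full)
rank_full   <- qr(design_full)$rank   # smaller than the number of columns

# Possible alternative (an assumption): let patient_id absorb the
# between-patient differences and keep only the within-patient terms.
design_alt <- model.matrix(~ patient_id + ispost + isresponder:ispost,
                           data = targets)
n_cols_alt <- ncol(design_alt)
rank_alt   <- qr(design_alt)$rank     # equal to the number of columns

# Residual degrees of freedom under the alternative design.
resid_df_alt <- nrow(design_alt) - rank_alt
```

In the alternative design the interaction column would carry the objective-two question (whether the pre-to-post change differs between responders and nonresponders), while the baseline comparison of objective one is a between-patient question that this design cannot estimate.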
Thanks,
Mike
