Hi BioC List,
I'm working with an Affymetrix data set where the batches are
completely confounded with the factor of interest for one contrast,
treatment time. From my understanding I can use something like fRMA
to partially mitigate the effect, but otherwise there is not much I can do.
However, we do have the original tissue samples and the option to
re-extract/reprocess some samples in a new batch. Due to the study size,
rerunning all samples with a proper randomized design is out of the
question. Are there any studies describing how we might rerun a small
subset of samples to recover the contrast of interest? Or does anyone
have any advice? For example, can I run 10 samples from each time
point in a new batch - could that be sufficient? Clearly it could
depend on the size of the batch effect, so I understand that there is
probably no definitive answer...
Here's the sample breakdown with the relevant treatments/covariates.
You can see that the t1 vs. t2 contrast is completely confounded with
batch (t1 appears only in batches b1 and b2, t2 only in b3 to b5),
while the t2 vs. t3 contrast is not. I also included another
covariate that we want to account for, Ethnicity, and the number of
samples in each group:
Batch  Timepoint  Ethnicity  Num_Samples
b1     t1         A          21
b1     t1         B          54
b2     t1         A          20
b2     t1         B          56
b3     t2         A          10
b3     t2         B          35
b3     t3         A           9
b3     t3         B          38
b4     t2         B          49
b4     t3         A           1
b4     t3         B          43
b5     t2         A          28
b5     t2         B           4
b5     t3         A          25
b5     t3         B          21
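
The confounding can be checked directly from this table. The following is a minimal sketch in base R, not a definitive analysis. The `counts` data frame simply retypes the table above, and batch `b6` is a hypothetical new batch containing both t1 and t2 samples; its counts of 5 are made up for illustration. PatientID is left out, so this only looks at estimability with respect to Timepoint, Ethnicity and Batch.

```r
# Sample counts retyped from the table above.
counts <- data.frame(
  Batch       = c("b1","b1","b2","b2","b3","b3","b3","b3",
                  "b4","b4","b4","b5","b5","b5","b5"),
  Timepoint   = c("t1","t1","t1","t1","t2","t2","t3","t3",
                  "t2","t3","t3","t2","t2","t3","t3"),
  Ethnicity   = c("A","B","A","B","A","B","A","B",
                  "B","A","B","A","B","A","B"),
  Num_Samples = c(21,54,20,56,10,35,9,38,49,1,43,28,4,25,21)
)

# Batch-by-timepoint totals: t1 shares no batch with t2 or t3.
print(xtabs(Num_Samples ~ Batch + Timepoint, data = counts))

# One row per observed cell is enough to check the rank of the design.
X <- model.matrix(~ 0 + Timepoint + Ethnicity + Batch, data = counts)
rank_before <- qr(X)$rank
cat("current design: rank", rank_before, "of", ncol(X), "columns\n")

# Hypothetical bridging batch b6 with t1 and t2 samples (counts are made up).
bridge <- data.frame(
  Batch       = "b6",
  Timepoint   = c("t1","t1","t2","t2"),
  Ethnicity   = c("A","B","A","B"),
  Num_Samples = 5
)
counts_new <- rbind(counts, bridge)
X_new <- model.matrix(~ 0 + Timepoint + Ethnicity + Batch, data = counts_new)
rank_after <- qr(X_new)$rank
cat("with bridging batch: rank", rank_after, "of", ncol(X_new), "columns\n")
```

A rank lower than the number of columns means at least one coefficient cannot be estimated separately from the others. A full-rank design only shows that the t2 vs. t1 contrast becomes estimable in principle. It says nothing about whether the number of re-run samples gives enough power, which depends on the size of the batch effect. limma's `nonEstimable()` can be applied to a design matrix to list the affected coefficients.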
Here's the setup that I would like (note that I'm also including
patientID in the design matrix since there are many patients that have
a sample at multiple time points):
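library(limma)
# No intercept, so each Timepoint level gets its own coefficient
design <- model.matrix(~ -1 + Timepoint + Ethnicity + Batch + PatientID)
fit <- lmFit(eset, design)
# Contrast of interest: t2 vs. t1
contrast.matrix <- makeContrasts(Timepointt2 - Timepointt1, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
# Moderate the contrast fit (fit2), not the original fit
fit2 <- eBayes(fit2)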
If we do reprocess samples, do I need to ensure even representation of
the two ethnicities in the new batch?
Thanks in advance for the guidance!
Ty