Guest User
@guest-user-4897
Dear all,
I am working on expression classifiers for leukemic subtypes using
Affymetrix Plus2 arrays. The training data consists of several
batches. The developed classifier will be used to predict the subtype
of new sets of samples as well as single samples. So far, I have
co-normalized new arrays with the training set, but this is not ideal.
I have read the frma paper by McCall et al., and it seems to be the
perfect solution. Before I start, I have a few conceptual questions:
1. The training data consists of several batches of different sizes,
some of them biased towards a single subtype. Does normalization per
batch using summarize="random_effect" remove biological signal in this
case? ComBat clearly did, so I ended up not correcting for batch
effects, which worked fine for the classifiers I am using. Any
suggestions as to which summarization would be best in this case?
2. Is there a minimum number of arrays required for
summarize="random_effect"?
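To make the two options concrete, here is a rough sketch of what I have in mind, not a tested pipeline. The file names, batch labels and the helper names frma_batch / frma_single are hypothetical placeholders; the only package calls are ReadAffy() from affy and frma() from frma, and I am assuming hgu133plus2frmavecs is installed for Plus2 arrays.

```r
# Hypothetical sample sheet: one row per CEL file with its batch label.
# File names and batch labels are placeholders, not real data.
samples <- data.frame(
  file  = c("a1.CEL", "a2.CEL", "b1.CEL", "b2.CEL"),
  batch = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Group the CEL files by batch (base R only).
files_by_batch <- split(samples$file, samples$batch)

# Option 1: batch-wise preprocessing. All arrays of one batch are
# summarized together with summarize = "random_effect".
frma_batch <- function(files) {
  frma::frma(affy::ReadAffy(filenames = files),
             summarize = "random_effect")
}

# Option 2: single-array preprocessing with the default summarization,
# which would also apply to a lone new sample.
frma_single <- function(file) {
  frma::frma(affy::ReadAffy(filenames = file),
             summarize = "robust_weighted_average")
}

# Not run here, since the placeholder CEL files do not exist:
# esets_batch  <- lapply(files_by_batch, frma_batch)
# esets_single <- lapply(samples$file, frma_single)
```

My concern in question 1 is about the first option: with a batch that is dominated by one subtype, I do not know how much of the subtype signal would be absorbed into the batch-level estimate.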
Any suggestions on how to best implement frma in this project are very
welcome!
Cheers, Judith
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
--
Sent via the guest posting facility at bioconductor.org.