Max Kauer
Dear List,
I want to compare a tumor cell population to different normal cell
populations on Affymetrix chips from different batches (studies).
The problem is that the tumor cell population and one of the normal
cell populations come from one batch, whereas the other cell
populations come from two other studies. In principle the data look
like this (the number of replicates differs between cell populations,
so X stands for anything between 3 and 5):
batch 1: X repl. of tumor cells
         X repl. of cellpop A
batch 2: X repl. of cellpop B
         X repl. of cellpop C
         X repl. of cellpop D
batch 3: X repl. of cellpop E
         X repl. of cellpop F
         X repl. of cellpop G
The question is: which of the cell populations is most similar to the
tumor cell population?
Of course, without correction of the batch effect, the samples cluster
by batch.
Now if I remove the batch effect (I used e.g. the ComBat script of
Cheng Li and W.E. Johnson, Biostatistics (2007), 8(1): 118-127), the
samples no longer cluster by batch, so the tumor cells no longer
cluster with cellpop A. However, I think that removing the batch
effect in such an unbalanced "design" would also remove some of the
real biological differences and similarities, especially because the
cell populations within batches 2 and 3 are probably more similar to
each other than to populations from other studies (batches).
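To make the concern concrete, here is a minimal sketch (not part of ComBat or any package) that encodes the layout above as indicator matrices and checks whether batch can be separated from cell population at all. The replicate count of 3 per population is an assumption, since the real numbers vary between 3 and 5, and the helper `one_hot` is a hypothetical name used only for illustration.

```python
import numpy as np

# Layout taken from the post. Three replicates per population is an
# assumption; the post only says "between 3 and 5".
layout = {
    "batch1": ["tumor", "A"],
    "batch2": ["B", "C", "D"],
    "batch3": ["E", "F", "G"],
}
n_rep = 3

# One (batch, population) pair per array.
samples = [(b, p) for b, pops in layout.items() for p in pops for _ in range(n_rep)]
batches = sorted(layout)
pops = sorted({p for _, p in samples})

def one_hot(labels, levels):
    """Indicator matrix: one row per sample, one column per level."""
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels])

X_batch = one_hot([b for b, _ in samples], batches)
X_pop = one_hot([p for _, p in samples], pops)

# If adding the batch columns does not raise the rank, batch is fully
# nested in cell population and the two effects cannot be told apart.
rank_pop = np.linalg.matrix_rank(X_pop)
rank_both = np.linalg.matrix_rank(np.hstack([X_pop, X_batch]))

print("rank with populations only:", rank_pop)
print("rank with populations + batch:", rank_both)
```

In this layout each batch column is just the sum of the population columns inside that batch, so the rank does not increase when batch is added. Any shift between studies is therefore indistinguishable from a shared biological difference between the groups of populations, which is why I doubt the batch correction here.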
So my simple questions are:
Is it sensible at all to do this kind of comparison?
And if so, what would be the most appropriate method? I know that
there are different packages and methods (e.g. metaArray, RankProd),
but I would like to get an opinion on which, if any, would be most
appropriate in my particular case.
Thanks!
Max
--------------------------------------
Maximilian Kauer
CCRI - Children's Cancer Research Institute
Kinderspitalgasse 6
1090 Vienna Austria