Hi, we are trying to analyze a dataset with 3 cell lines, each treated with low pH and normal pH. We have three replicates for each group, and for some reason we have to treat them as technical replicates. The goal is to obtain the genes that are differentially expressed at low pH in all three cell lines.
names cell pH
Sample1 MDA low
Sample2 MDA low
Sample3 MDA low
Sample4 MDA normal
Sample5 MDA normal
Sample6 MDA normal
Sample7 Panc low
Sample8 Panc low
Sample9 Panc low
Sample10 Panc normal
Sample11 Panc normal
Sample12 Panc normal
Sample13 MCF7 low
Sample14 MCF7 low
Sample15 MCF7 low
Sample16 MCF7 normal
Sample17 MCF7 normal
Sample18 MCF7 normal
We want to use the function duplicateCorrelation() to account for the technical replicates, which requires an appropriate block vector. We are wondering what the difference is between the following two ways of modelling:
model.matrix(~pH)
with block vector:
MDA MDA MDA MDA MDA MDA Panc Panc Panc Panc Panc Panc MCF MCF MCF MCF MCF MCF
and
model.matrix(~cell+pH)
with block vector:
MDA1 MDA1 MDA1 MDA2 MDA2 MDA2
Panc1 Panc1 Panc1 Panc2 Panc2 Panc2
MCF1 MCF1 MCF1 MCF2 MCF2 MCF2
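To make the two options concrete, here is a minimal sketch of how we would set them up in R. The `targets` data frame, the factor level order (normal pH as the reference) and the object names are our own assumptions for illustration; the block labels produced by `interaction()` read `MDA.low`, `MDA.normal`, etc. rather than `MDA1`, `MDA2`, but they define the same six groups. The limma calls at the end are left as comments because `y` stands for a hypothetical normalized expression object that is not shown here.

```r
# Sample annotation rebuilt from the table above
targets <- data.frame(
  names = paste0("Sample", 1:18),
  cell  = factor(rep(c("MDA", "Panc", "MCF7"), each = 6),
                 levels = c("MDA", "Panc", "MCF7")),
  # Assumption: normal pH is the reference level
  pH    = factor(rep(rep(c("low", "normal"), each = 3), times = 3),
                 levels = c("normal", "low"))
)

# Option 1: only pH in the design, cell line as the block
design1 <- model.matrix(~ pH, data = targets)
block1  <- targets$cell

# Option 2: cell line and pH in the design,
# one block per cell line x pH group (the triplicates)
design2 <- model.matrix(~ cell + pH, data = targets)
block2  <- interaction(targets$cell, targets$pH, drop = TRUE)

# Compare the two setups
colnames(design1)
colnames(design2)
table(block1)
table(block2)

# With limma loaded and `y` a (hypothetical) expression object:
# corfit <- duplicateCorrelation(y, design2, block = block2)
# fit    <- lmFit(y, design2, block = block2,
#                 correlation = corfit$consensus.correlation)
```

In option 1 each block spans both pH levels of a cell line (6 samples per block), whereas in option 2 each block contains only the three replicates of one group.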
Thanks in advance