Hi BioC community,
I have proteomic data for two conditions, A and B: 15 patients in group A and 40 patients in group B. The proteome of each patient was measured at time 0 (before treatment) and at time 22 (after treatment), so we have 110 samples from 55 patients (2 samples per patient).
The questions are:
- What are the DE genes between A group and B group at time 0 (A.0 vs B.0)?
- What are the DE genes between A group time 22 and A group time 0 (A.22 vs A.0)?
- What are the DE genes between B group time 22 and B group time 0 (B.22 vs B.0)?
- What are the DE genes for the comparison (B.22 - B.0)-(A.22 - A.0)?
This design is fully described in section 9.7, "Multi-level Experiments", of the limma User's Guide.
My problem is that the samples were processed in batches before I took charge of this study, and I could only control the batch assignment for the last half of the samples.
So the design looks something like this (to be clear, I only show some of the samples):
   gp time patient batch
1  B  t1   p1      B1
2  B  t2   p1      B2
3  B  t1   p2      B2
4  B  t2   p2      B2
5  B  t1   p3      B2
6  B  t2   p3      B3
7  B  t1   p4      B4
8  B  t2   p4      B4
9  B  t1   p5      B5
10 B  t2   p5      B5
11 B  t1   p6      B6
12 B  t2   p6      B6
13 A  t1   p7      B5
14 A  t2   p7      B5
15 A  t1   p8      B5
16 A  t2   p8      B5
17 A  t1   p9      B5
18 A  t2   p9      B5
19 A  t1   p10     B6
20 A  t2   p10     B6
21 A  t1   p11     B6
22 A  t2   p11     B6
23 A  t1   p12     B6
24 A  t2   p12     B6
The last batches, B5 and B6, have the same B/A ratio (here 1/3) and contain 50% t1 and 50% t2 samples. However, some batches (such as B1 and B3) contain only one sample, and batch B2 contains only group B samples.
I account for the batch effect in my model:
Treat <- factor(paste(gp,time,sep="."))
design <- model.matrix(~0+batch+Treat)
Surprisingly, I did not get an error or warning message. When you block by batch, comparisons are made within each batch to reduce batch-related variability, but how does this work when a batch contains only one sample? Is that sample implicitly excluded from all analyses? And what happens when a batch contains only group B? Are those samples excluded from the comparison (B.22 - B.0)-(A.22 - A.0) but included in the comparison B.22 vs B.0?
If they are not included in the computations, how would you suggest "saving" these samples? By merging batches?
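In case it helps to reproduce the situation, here is a minimal sketch in base R that rebuilds only the 24 example rows shown above (not my full data) and inspects the resulting design. The data frame name `targets` is just a name I chose for this illustration, and the time labels are the t1/t2 codes from the table. It tabulates how many samples fall in each batch and compares the rank of the design matrix with its number of columns, which only tells me whether the coefficients are estimable, not how much each sample contributes to a given comparison.

```r
# Toy sample sheet: the 24 example rows from the table above (illustrative only)
targets <- data.frame(
  gp      = rep(c("B", "A"), each = 12),
  time    = rep(c("t1", "t2"), times = 12),
  patient = rep(paste0("p", 1:12), each = 2),
  batch   = c("B1", "B2", "B2", "B2", "B2", "B3",
              "B4", "B4", "B5", "B5", "B6", "B6",
              rep("B5", 6), rep("B6", 6)),
  stringsAsFactors = FALSE
)

# Same model as in my post, built from the toy sheet
Treat  <- factor(paste(targets$gp, targets$time, sep = "."))
batch  <- factor(targets$batch)
design <- model.matrix(~0 + batch + Treat)

# Number of samples per batch, and which group/time combinations each batch holds
batch_sizes <- table(batch)
print(batch_sizes)
print(table(batch, Treat))

# Compare the rank of the design with its number of columns:
# equal values mean every coefficient in the model is estimable
design_rank <- qr(design)$rank
print(c(rank = design_rank, columns = ncol(design)))
```

The per-batch table makes the single-sample batches and the batches holding only one group easy to spot before fitting anything.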
Thanks in advance for your suggestions and help,