DESeq2 - how to account for patient, batch and condition
@399a8e69

Hello,

I would like to compare the two conditions (Organoid vs. Tissue) in the following dataset.


Sample    Patient  Replicate  Batch  Condition
O3_BR1         3         1  run2    Organoid
O3_BR2         3         2  run2    Organoid
O3_BR3         3         3  run2    Organoid
O4_BR1         4         1  run2    Organoid
O4_BR2         4         2  run2    Organoid
O4_BR3         4         3  run2    Organoid
O5_BR1         5         1  run1    Organoid
O5_BR2         5         2  run1    Organoid
O5_BR3         5         3  run1    Organoid
O5_T_BR1       5         1  run1    Organoid
O5_T_BR2       5         2  run1    Organoid
O6_BR1         6         1  run2    Organoid
O6_BR2         6         2  run2    Organoid
O6_BR3         6         3  run2    Organoid
O8_BR1         8         1  run1    Organoid
O8_BR2         8         2  run1    Organoid
O9_BR1         9         1  run1    Organoid
O9_BR2         9         2  run1    Organoid
O9_BR3         9         2  run3    Organoid
T3_BR1         3         1  run2      Tissue
T4_BR1         4         1  run2      Tissue
T5_BR1         5         1  run3      Tissue
T6_BR1         6         1  run2      Tissue
T8_BR1         8         1  run1      Tissue
T9_BR1         9         1  run1      Tissue


The samples are paired by patient: each patient has several biological replicates in the Organoid group but only a single sample, with no biological replicates, in the Tissue group.

The Patient variable is nested within the Batch variable. I've seen the "Group-specific condition effects, individuals nested within groups" section in the vignette, but I'm unsure whether that approach can be applied in my case. If it can, I am also unsure how to generate the additional column (ind.n in the vignette example) for my dataset.
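
For reference, the ind.n column in that vignette section simply renumbers the individuals within each group, so that every group reuses the labels 1..k. A minimal sketch on toy data laid out like the vignette example (not the dataset above; using ave() for the renumbering is only one possible way to do it) might look like this:

```r
# Toy sample table in the layout of the vignette example (not the poster's data):
# two groups, three individuals nested in each group, two conditions per individual.
coldata <- data.frame(
  grp = factor(rep(c("X", "Y"), each = 6)),
  ind = factor(rep(1:6, each = 2)),
  cnd = factor(rep(c("A", "B"), times = 6))
)

# Renumber individuals within each group so that every group uses labels 1..k.
coldata$ind.n <- factor(ave(
  as.integer(coldata$ind),
  coldata$grp,
  FUN = function(x) as.integer(factor(x))
))

# Design from the vignette section for this nested layout.
design <- ~ grp + grp:ind.n + grp:cnd
mm <- model.matrix(design, coldata)

# A design is usable only if its matrix has full column rank.
c(columns = ncol(mm), rank = qr(mm)$rank)
```

This only works as intended when every individual belongs to exactly one group, which is worth checking before applying it to a real sample table.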

@james-w-macdonald-5106

Patient isn't nested in batch: patients 5 and 9 each have samples in both run1 and run3.

> table(d.f[,c(2,4)])
       Batch
Patient run1 run2 run3
      3    0    4    0
      4    0    4    0
      5    5    0    1
      6    0    4    0
      8    3    0    0
      9    3    0    1
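
The cross-tabulation above can be reproduced in base R. This is a sketch that assumes the Patient and Batch columns are transcribed from the sample table in the question; the object name runs_per_patient is made up for illustration.

```r
# Patient and Batch columns transcribed from the sample table in the question
# (19 Organoid samples followed by 6 Tissue samples).
d.f <- data.frame(
  Patient = c(rep(3, 3), rep(4, 3), rep(5, 5), rep(6, 3), rep(8, 2), rep(9, 3),
              3, 4, 5, 6, 8, 9),
  Batch = c(rep("run2", 6), rep("run1", 5), rep("run2", 3), rep("run1", 4), "run3",
            "run2", "run2", "run3", "run2", "run1", "run1")
)

# Patient-by-Batch counts.
tab <- table(d.f)
print(tab)

# A patient is nested in batch only if all of its samples fall in a single run,
# so count the number of runs each patient appears in.
runs_per_patient <- rowSums(tab > 0)

# Patients that appear in more than one run.
names(runs_per_patient)[runs_per_patient > 1]
```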


But anyway, you can just block on patient, provided the batches aren't too different. Because the patients are mostly nested in batch, you won't be able to block on both in the same design, but you could instead block on batch and not on patient. You should make a PCA plot to see what's driving the differences and use that to decide what to do. The remaining option would be to use the limma-voom pipeline and block on batch with a random intercept for patient. But you probably don't need anything that fancy.
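
One way to see why blocking on both factors fails is to compare the column count and rank of each candidate design matrix. The sketch below uses base R only and assumes the sample table is transcribed from the question; check_design is a hypothetical helper name, not a DESeq2 function.

```r
# Sample annotation transcribed from the question
# (19 Organoid samples followed by 6 Tissue samples).
coldata <- data.frame(
  Patient = factor(c(rep(3, 3), rep(4, 3), rep(5, 5), rep(6, 3), rep(8, 2), rep(9, 3),
                     3, 4, 5, 6, 8, 9)),
  Batch = factor(c(rep("run2", 6), rep("run1", 5), rep("run2", 3), rep("run1", 4), "run3",
                   "run2", "run2", "run3", "run2", "run1", "run1")),
  Condition = factor(c(rep("Organoid", 19), rep("Tissue", 6)))
)

# Hypothetical helper: report the number of columns and the rank of a design matrix.
# A rank below the column count means the design is not full rank.
check_design <- function(formula, data) {
  mm <- model.matrix(formula, data)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

full_design    <- check_design(~ Patient + Batch + Condition, coldata)  # block on both
patient_design <- check_design(~ Patient + Condition, coldata)          # block on patient
batch_design   <- check_design(~ Batch + Condition, coldata)            # block on batch

rbind(full_design, patient_design, batch_design)
```

A design that passes this check, for example ~ Patient + Condition, is what would be supplied as the design argument when building the DESeq2 dataset. For the limma-voom route, the usual pattern is to put batch in the design and pass patient as the block to duplicateCorrelation; that is not shown here.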