Dear DESeq2 users,
I have a couple of questions about study design. They are probably basic, but this kind of analysis is new to me.
Suppose I have RNA-seq data from three independent biological experiments (no technical replicates) on a single cell line cultured under sparse (S) and dense (D) conditions. The cells in each confluence state were treated with different siRNAs (two siRNAs per gene) and with some inducers. I also expect a batch effect between experiments. Since this may sound complicated, here is a simplified study design:
name | batch | condition |
1_D_NT | 1 | D_NT |
1_D_siCtrl_1 | 1 | D_siCtrl_1 |
1_D_siCtrl_2 | 1 | D_siCtrl_2 |
1_D_siTarget_1 | 1 | D_siTarget_1 |
1_D_siTarget_2 | 1 | D_siTarget_2 |
1_D_Inducer | 1 | D_Inducer |
1_S_NT | 1 | S_NT |
1_S_siCtrl_1 | 1 | S_siCtrl_1 |
1_S_siCtrl_2 | 1 | S_siCtrl_2 |
1_S_siTarget_1 | 1 | S_siTarget_1 |
1_S_siTarget_2 | 1 | S_siTarget_2 |
1_S_Inducer | 1 | S_Inducer |
2_D_NT | 2 | D_NT |
2_D_siCtrl_1 | 2 | D_siCtrl_1 |
2_D_siCtrl_2 | 2 | D_siCtrl_2 |
2_D_siTarget_1 | 2 | D_siTarget_1 |
2_D_siTarget_2 | 2 | D_siTarget_2 |
2_D_Inducer | 2 | D_Inducer |
2_S_NT | 2 | S_NT |
2_S_siCtrl_1 | 2 | S_siCtrl_1 |
2_S_siCtrl_2 | 2 | S_siCtrl_2 |
2_S_siTarget_1 | 2 | S_siTarget_1 |
2_S_siTarget_2 | 2 | S_siTarget_2 |
2_S_Inducer | 2 | S_Inducer |
... |
Initially I wrote down the following code:
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ batch + condition)
but I am now wondering whether, and how, I could improve my chances of detecting differentially expressed genes, both between treatments and for the same treatment across the two confluence states, by restructuring the colData file.
Do you think something like this might work:
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ batch + confluence + condition)
using the following coldata file:
name | batch | confluence | condition |
1_D_NT | 1 | D | NT |
1_D_siCtrl_1 | 1 | D | siCtrl_1 |
1_D_siCtrl_2 | 1 | D | siCtrl_2 |
1_D_siTarget_1 | 1 | D | siTarget_1 |
1_D_siTarget_2 | 1 | D | siTarget_2 |
1_D_Inducer | 1 | D | Inducer |
1_S_NT | 1 | S | NT |
1_S_siCtrl_1 | 1 | S | siCtrl_1 |
1_S_siCtrl_2 | 1 | S | siCtrl_2 |
1_S_siTarget_1 | 1 | S | siTarget_1 |
1_S_siTarget_2 | 1 | S | siTarget_2 |
1_S_Inducer | 1 | S | Inducer |
2_D_NT | 2 | D | NT |
2_D_siCtrl_1 | 2 | D | siCtrl_1 |
2_D_siCtrl_2 | 2 | D | siCtrl_2 |
2_D_siTarget_1 | 2 | D | siTarget_1 |
2_D_siTarget_2 | 2 | D | siTarget_2 |
2_D_Inducer | 2 | D | Inducer |
2_S_NT | 2 | S | NT |
2_S_siCtrl_1 | 2 | S | siCtrl_1 |
2_S_siCtrl_2 | 2 | S | siCtrl_2 |
2_S_siTarget_1 | 2 | S | siTarget_1 |
2_S_siTarget_2 | 2 | S | siTarget_2 |
2_S_Inducer | 2 | S | Inducer |
...
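One way to get from the first layout to the second is to split the sample names programmatically rather than editing the file by hand. The following is a minimal sketch in base R, assuming the names always follow the `batch_confluence_treatment` pattern shown in the tables; `treatments`, `sample_names` and the three-batch grid are illustrative stand-ins for the real sample sheet. Note that `batch` is converted to a factor, since a numeric column in the design formula would otherwise be treated as a continuous covariate.

```r
# Hypothetical sample names that follow the naming scheme in the tables above
treatments <- c("NT", "siCtrl_1", "siCtrl_2", "siTarget_1", "siTarget_2", "Inducer")
grid <- expand.grid(condition = treatments, confluence = c("D", "S"), batch = 1:3,
                    stringsAsFactors = FALSE)
sample_names <- paste(grid$batch, grid$confluence, grid$condition, sep = "_")

# Split e.g. "1_D_siCtrl_1" into batch, confluence and the remaining treatment label
parts <- regmatches(sample_names,
                    regexec("^([0-9]+)_([DS])_(.+)$", sample_names))

coldata <- data.frame(
  row.names  = sample_names,
  batch      = factor(sapply(parts, `[`, 2)),                      # factor, not numeric
  confluence = factor(sapply(parts, `[`, 3), levels = c("S", "D")),
  condition  = factor(sapply(parts, `[`, 4), levels = treatments)  # NT as reference
)

# Sanity check: every confluence/treatment combination should appear once per batch
table(coldata$confluence, coldata$condition)
```

The reference levels chosen here (`S` and `NT`) are an assumption; any other level can be made the reference by reordering `levels`.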
I have also considered constructing a model matrix myself, but I am not sure what it should look like.
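For reference, base R's `model.matrix()` shows which design matrix each candidate formula would produce. The sketch below builds a hypothetical three-batch sample sheet (all variable names are illustrative) and compares three options: the additive design, a design with a `confluence:condition` interaction, and a combined-label design (`~ batch + group`, equivalent to the first coldata layout). The additive formula has no term that lets a treatment effect differ between S and D, whereas the other two do.

```r
treatments <- c("NT", "siCtrl_1", "siCtrl_2", "siTarget_1", "siTarget_2", "Inducer")
coldata <- expand.grid(condition = treatments, confluence = c("S", "D"),
                       batch = c("1", "2", "3"), stringsAsFactors = FALSE)

# Explicit factors so that the reference levels are under our control
coldata$condition  <- factor(coldata$condition, levels = treatments)
coldata$confluence <- factor(coldata$confluence, levels = c("S", "D"))
coldata$batch      <- factor(coldata$batch)
coldata$group      <- factor(paste(coldata$confluence, coldata$condition, sep = "_"))

designs <- list(
  additive    = ~ batch + confluence + condition,
  interaction = ~ batch + confluence * condition,
  combined    = ~ batch + group
)
mats <- lapply(designs, model.matrix, data = coldata)

# Number of coefficients per design, and whether each matrix is full rank
n_coef <- sapply(mats, ncol)
ranks  <- sapply(mats, function(m) qr(m)$rank)
n_coef
ranks

colnames(mats$interaction)  # inspect the coefficient names
```

The interaction and combined designs span the same model and differ only in how the coefficients are parameterised, so the choice between them mostly affects how contrasts are written.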
Effectively, I would like to run the results function to get the desired comparisons, such as:
results(dds, contrast = list("D.NT", "S.NT"))
so that the batch effect is already accounted for.
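The D_NT versus S_NT comparison can be written either by factor level or as a numeric vector over the model coefficients. Below is a minimal base R sketch, assuming the combined-label `condition` column from the first coldata layout and three batches; it derives the numeric contrast from the design matrix. The `results()` calls appear only as comments because they need a fitted `dds`. They rely on the `contrast` argument accepting either a `c(factor, numerator, denominator)` character vector or a numeric vector with one entry per element of `resultsNames(dds)`; the list form instead expects coefficient names taken from `resultsNames(dds)`.

```r
treatments <- c("NT", "siCtrl_1", "siCtrl_2", "siTarget_1", "siTarget_2", "Inducer")
grid <- expand.grid(treatment = treatments, confluence = c("D", "S"), batch = 1:3,
                    stringsAsFactors = FALSE)

# Hypothetical sample sheet with the combined confluence_treatment label
coldata <- data.frame(
  batch     = factor(grid$batch),
  condition = factor(paste(grid$confluence, grid$treatment, sep = "_"))
)

mm <- model.matrix(~ batch + condition, coldata)

# Average design row of each group; their difference is the contrast vector
d_nt <- colMeans(mm[coldata$condition == "D_NT", , drop = FALSE])
s_nt <- colMeans(mm[coldata$condition == "S_NT", , drop = FALSE])
contrast_vec <- d_nt - s_nt

# Only the two condition coefficients remain; the batch terms cancel out
contrast_vec[contrast_vec != 0]

# With a fitted object (dds <- DESeq(dds)), either call should give D_NT vs S_NT:
# res <- results(dds, contrast = c("condition", "D_NT", "S_NT"))
# res <- results(dds, contrast = contrast_vec)
```

Because `batch` is in the design, it is adjusted for in the fit itself, so the contrast only needs to name the two condition levels. The numeric vector must be in the same order as `resultsNames(dds)`, which is worth checking before use.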
Many thanks for your help and feedback!
All the best,
Krzysztof