I want to check my understanding of the linear model behind the design matrix in DESeq2, and would appreciate it if someone could tell me whether all of this is correct. I have made a schematic of what I think is going on, described below.
design ~ cell_type + treatment + cell_type:treatment
Line 1: intercept of model (baseline expression)
Line 2: effect of treatment in cell type A
Line 3: intrinsic difference in expression in cell type B vs. A
Line 4: effect of treatment in cell type B
Line 5: interaction term (Line 4 - Line 2), i.e. the difference in the treatment effect in cell type B vs. cell type A
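So the linear model will be expression = beta1 + beta2*cell_type + beta3*treatment + beta4*interaction, where cell_type and treatment are indicator variables that take the value 0 or 1 (cell_type = 1 for cell type B, treatment = 1 for treated samples) and interaction = cell_type*treatment, so it is 1 only for treated samples of cell type B.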
If we first focus on cell type A, cell_type = 0, so interaction = 0 as well.
For A control: expression = beta1 + beta2*0 + beta3*0 + beta4*0 = beta1, i.e. Line 1.
For A treatment: expression = beta1 + beta2*0 + beta3*1 + beta4*0 = beta1 + beta3, i.e. Line 1 + Line 2.
Looking at cell type B, cell_type = 1, so interaction is 1 only when treatment = 1.
For B control: expression = beta1 + beta2*1 + beta3*0 + beta4*0 = beta1 + beta2, i.e. Line 1 + Line 3.
For B treatment: expression = beta1 + beta2*1 + beta3*1 + beta4*1 = beta1 + beta2 + beta3 + beta4, i.e. Line 1 + Line 3 + Line 4 (where Line 4 is composed of Line 2 plus the extra interaction effect, beta3 + beta4).
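To double-check the bookkeeping above, here is a minimal sketch in plain Python that builds the four design-matrix rows by hand and evaluates the model for each group. The coefficient values are made up purely for illustration (think of them as being on the log2 scale, since DESeq2 coefficients are log2 fold changes), and the names `design_row` and `predicted` are my own, not anything from DESeq2.

```python
from itertools import product

# Hypothetical coefficients, chosen only for illustration (not real estimates).
beta1 = 5.0   # intercept: A control (Line 1)
beta2 = 1.0   # cell_type: B vs. A in controls (Line 3)
beta3 = 2.0   # treatment: treated vs. control in A (Line 2)
beta4 = 0.5   # interaction: extra treatment effect in B (Line 5)
coefs = [beta1, beta2, beta3, beta4]


def design_row(cell_type, treatment):
    """Design-matrix row for one sample.

    cell_type: 0 for A, 1 for B; treatment: 0 for control, 1 for treated.
    The interaction column is the product of the two indicator columns.
    """
    return [1, cell_type, treatment, cell_type * treatment]


def predicted(cell_type, treatment):
    """Model value for a group: dot product of the design row and the coefficients."""
    return sum(x * b for x, b in zip(design_row(cell_type, treatment), coefs))


# Model value for each of the four groups, keyed by (cell_type, treatment).
means = {(c, t): predicted(c, t) for c, t in product((0, 1), repeat=2)}

effect_in_A = means[(0, 1)] - means[(0, 0)]   # Line 2 = beta3
effect_in_B = means[(1, 1)] - means[(1, 0)]   # Line 4 = beta3 + beta4
interaction = effect_in_B - effect_in_A       # Line 5 = beta4

for (c, t), value in means.items():
    label = f"{'AB'[c]} {'treated' if t else 'control'}"
    print(f"{label:10s} row={design_row(c, t)} value={value}")
```

With these made-up numbers, the treatment effect recovered for A is beta3, the effect for B is beta3 + beta4, and their difference is beta4, which matches the Line 2, Line 4 and Line 5 reading above.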
In what instances would you not include an interaction term and just use design ~ cell_type + treatment? Is it when you think the treatment has the same effect in each cell type, or when you don't care about cell-type-specific effects and are only interested in whether treatment has any effect vs. control?
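To make this last question concrete, here is a rough sketch of what I think the additive design does when the treatment effect actually differs between cell types. It fits intercept + cell_type + treatment to four made-up group means using ordinary least squares in plain Python. This is only a stand-in for illustration: DESeq2 fits a negative binomial GLM to counts, not least squares to means, and the helper names `solve` and `least_squares` and all the numbers are my own assumptions.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical log2 group means: A control, A treated, B control, B treated.
# The treatment effect is 2.0 in A and 2.5 in B, so there is a real interaction of 0.5.
y = [5.0, 7.0, 6.0, 8.5]

# Additive design: columns are intercept, cell_type (B = 1), treatment (treated = 1).
# There is no interaction column.
X_additive = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]

intercept, cell_effect, treatment_effect = least_squares(X_additive, y)
print("additive treatment effect:", treatment_effect)
```

In this balanced toy example the additive fit returns a single treatment effect of 2.25, the average of the effect in A (2.0) and the effect in B (2.5), so the cell-type-specific difference is absorbed into the residuals rather than estimated. If that picture is right, the additive design seems appropriate when an averaged treatment effect is what you want, or when you believe the effect is the same in both cell types.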