I have the following experimental design:
- four developmental
stage
s, effectively a nominal categorical variable - three
condition
s: two treatments and one control
The biggest source of variation in the data is stage
. The two treatments are conceptually close to each other and the genes responding to them are largely the same, although: one of the treatments has a stronger effect (higher fold changes) and both treatments also elicit specific responses.
What I'm mainly interested in is genes responding to both treatments and genes responding to one but not the other, at each stage.
The approach I have been exploring is to unite stage
and condition
in a single variable group
and do Wald tests for contrasts treatment1_vs_control
and treatment2_vs_control
for each stage separately. It feels though that this way I lose power to discover genes which respond to one of the treatments significantly according to my predefined FDR and sub-significantly to the other treatment.
What is the best approach to my design? How to extract lists of genes responding to both treatments and only to each one of them at each stage? Thanks!
EDIT:
I think one sensible approach to extracting the shared genes (genes responding to both treatments) might be via LRT with reference level coding: the condition
is split into two binary factors corresponding to the two treatments:
col_data$T1 <- factor(ifelse(col_data$condition == "treatment1", "yes", "no"))
col_data$T2 <- factor(ifelse(col_data$condition == "treatment2", "yes", "no"))
# For each stage separately:
dds <- DESeqDataSetFromMatrix(counts_for_stage_A, col_data, ~ T1 + T2) %>%
DESeq(parallel = T, test = "LRT", reduced = ~ 1)
res <- results(dds, alpha = alpha_threshold)
Does this make sense?