Hi!
I am using DESeq2 to look at differential expression of genes between patients and a healthy control group. However, the vast majority of the patients have received a drug treatment; only a few have not. Still, I would like to adjust for the effect of the treatment when comparing the patients with the controls. How could I achieve this properly, through the model design or by using contrasts? To me it does not seem to make sense to simply add a treatment variable, as I believe it would be confounded with disease status, since no controls received the drug.
I would greatly appreciate any input!
Best regards,
Oliver
Thank you for the valuable input, Michael. I think all these points are valid.
However, I guess I should have phrased my question differently. It is more 'theoretical' in nature: assume I have enough patients both with and without treatment among my samples to tell the two apart, so that receiving treatment and being a patient are not 'confounded' in this sense.
With a traditional 2x2 factorial experimental design, the two factors could be e.g. 'disease' and 'treatment', each with the levels 'yes' and 'no'. Then, to my understanding, I could pass a design along the lines of ~ treatment + disease to DESeq2 to adjust for the treatment effect.
What I am confused about is how I should adjust for the treatment effect in this case, where only one level of the disease factor (namely the patients) has received treatment. Could I use contrasts somehow? Maybe there is another way, or is it simply not possible? I am aware that I cannot resolve the interaction effect of the treatment (since only the patients receive the treatment), but that is not of interest.
You can't "control for treatment" using linear models here. Without introducing strong assumptions, you don't know what the treatment-naive state of the treated patients would have been (a counterfactual). You inherently have three groups of samples to compare: controls, untreated patients, and treated patients.
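To make the "three groups" point concrete, here is a minimal sketch in plain Python (not DESeq2 itself) that builds the design matrices by hand and checks their rank. The sample layout and all names (`samples`, `matrix_rank`, and so on) are made up for illustration. It only shows that a disease x treatment layout with no treated controls carries three estimable group means, not four, so in the additive parameterization the 'disease' coefficient can only reflect untreated patients versus controls.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical samples as (disease, treatment); no control is ever treated.
samples = [(0, 0), (0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (1, 1)]

# Columns: intercept, disease, treatment, disease:treatment.
interaction_design = [[1, d, t, d * t] for d, t in samples]

# Columns: intercept, disease, treatment (additive model).
additive_design = [[1, d, t] for d, t in samples]

# One indicator column per group: control, untreated patient, treated patient.
group_design = [
    [int(d == 0), int(d == 1 and t == 0), int(d == 1 and t == 1)]
    for d, t in samples
]

interaction_rank = matrix_rank(interaction_design)
additive_rank = matrix_rank(additive_design)
group_rank = matrix_rank(group_design)

print("interaction design: columns =", len(interaction_design[0]), "rank =", interaction_rank)
print("additive design:    columns =", len(additive_design[0]), "rank =", additive_rank)
print("three-group design: columns =", len(group_design[0]), "rank =", group_rank)
```

The interaction design has four columns but only rank 3, because the treatment and interaction columns are identical when no control is treated. The additive and three-group designs both have rank 3, so they describe the same three group means in different parameterizations; neither can say what the treated patients would have looked like without treatment.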