Let's start with the easy one: Effect of treatment, accounting for the sample pairing:
~ Patient.ID + Treatment
Here, you fit an individual base line for each patient, so that patient-to-patient differences in expression before treatment are absorbed and you only look at the differences On-Pre.
Very important: You need to put a "factor(...)
" around the Patient.ID when creating the data frame. Otherwise, R will consider the patient IDs as numeric value, thinking that Patient 4 is expected to have 4 times the expression of something as Patient 1.
Also, you may want to use "relevel
" to make "Pre" the first and "On" the second factor. (Default is alphabetic ordering.) Otehrwise, all the signs come out the wrong way round.
Now, let's add the "Response" factor. You want to see how Treatment effect depends on Response, i.e., see an interaction between the two:
~ Patient.ID + Treatment * Response
Now, you get coefficients for patient base line and for treatment as before, but now also one for "Treatment:Response". The "Treatment" effect is now the effect of Treatment for non-responders ('non' because 'N' is the first level of Response), and "Treatment:Response" is the additional effect that comes on top of that if the patient is a responder.
R will also try to estimate a coefficient for "Response", i.e., base line differences between responders and non-responders. However, this will fail, because you already have individual base lines for each patient, and this absorbs all differences between patients, including those between reponders and non-responders. Therefore, you will get NAs for these ones.
If you want genes that react to treatment only in responders, you could find all genes with significant difference for Treatment:Response
and significant non-difference (using the 'altHypothesis="lessAbs"'
option of DESeq's result
function) for Treatment
. Or you simply look for significance in Treatment:Response
and so get all genes for which reaction to treatment differs according to Responder status.
Plotting the log2 fold changes to treatment for non-responders and responders against each other will also be helpful.
Finally, Status: To block for these ones, you could use
~ Patient.ID + Treatment * ( Status + Response )
However, Status is heavily confounded with Response. All naive patients are non-responders and all pre-treated patients are responders, except for Patient 5. Hence, for any difference you may find in the treatment response between patients 6-8 versus patients 1-4, there is no way of saying whether they are due to Responder or due to pre-treatment Status. This actually a very serious issue, compromising your entire study.
On the other hand: Patient 5 switches pre-treatment status from Pre to On. Are you sure your table is correct?
Hi Simon,
I modified the code to make up for the mistake you caught up.
Thanks a lot!
Again, to make it easier to follow here is the sample information (using your updated code). You should include this table in your post in future questions to Bioconductor to help people in understanding your experimental design.