In my RNA-seq experiment, I have subjects at two doses (high and low) and three time-points (pre-treatment, after treatment 1, after treatment 2). I want a subject-level baseline to account for subject-to-subject variability.
Thus, my design matrix has three columns:
subject. The design i want is essentially
dose*time + subject. In practice, what I've done is concatenated dose and time into a single factor with levels like
dose0_time0, dose1_time2; now my design table has two columns, this
catfactor, and the same
subject as before. At this point you may have noticed my problem: the
subject column of my matrix is perfectly confounded by
catfactor. The two columns being independent, I get an error when I try and execute
DESeqDataSetFromMatrix(counts, designMat, design= ~catfactor + subject).
Love and colleagues have anticipated my difficulty in their user guide and describe a workaround in the subsection, "Group-specific condition effects, individuals nested within groups": https://www.bioconductor.org/packages/3.5/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups
As I understand it, the method they propose is to add a dummy variable that indicates the "within-group subject ID." In this scheme, one subject in the high dose cohort has the same "within-group ID" as another subject in the low-dose cohort. I then use
model.matrix(~ dose + dose:ind.n + dose:time). (For added fun, the number of subjects in each dose group are not the same resulting in levels with no samples, but here the instructions in the user guide seem tractable.)
My trouble is that I now I want to pull a non-trivial contrast out of my fit, and the
resultsNames I get are
The main contrast I want is to find what changes before and after treatment in the high-dose cohort that are different from the low dose cohort.
In the absence of the subject factor, I can do this. First, a contrast comparing pre- and post treatment in high dose is given by:
(dose1_time1 + dose1_time2)/2 - dose1_time0. Now, to compare the pre- to post- change in high dose vs. low dose, I write that contrast as
(dose1_time1 + dose1_time2)/2 - dose1_time0 - ((dose0_time1 + dose0_time2)/2 - dose0_time0)
This can be input into
results by giving a
vector value to the
However, given the inscrutable parameters introduced by the workaround, I am at a loss.