Hi, I'm analyzing an RNA-Seq data set and would like to perform a paired, multifactor test that also accounts for a batch effect. When I use the design formula
~ response + response:PID + response:timepoint
and then manually remove the columns of the model matrix that contain only zeroes, DESeq2 runs the analysis as expected. However, when I change the design to
~ response + response:PID + response:timepoint + batch
I get the familiar error message
Error in checkFullRank(full) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
This is what my colData(dds) looks like:
sample       PID  timepoint  batch  response
01-0603_on   P01  fiftysix   one    PD
01-0603_pre  P01  zero       one    PD
03-0601_on   P01  fiftysix   one    PR
03-0601_pre  P01  zero       one    PR
03-0606_on   P02  fiftysix   one    PR
03-0606_pre  P02  zero       one    PR
04-0605_on   P03  fiftysix   one    PR
04-0605_pre  P03  zero       one    PR
04-0632_on   P02  fiftysix   three  PD
04-0632_pre  P02  zero       three  PD
05-0614_on   P04  fiftysix   three  PR
05-0614_pre  P04  zero       three  PR
05-0621_on   P03  fiftysix   three  PD
05-0621_pre  P03  zero       three  PD
05-0624_on   P05  fiftysix   three  PR
05-0624_pre  P05  zero       three  PR
08-0631_on   P04  fiftysix   three  PD
08-0631_pre  P04  zero       three  PD
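In case it helps to reproduce this outside DESeq2, here is a minimal base R sketch that rebuilds the sample table above, drops the all-zero columns from the model matrix, and compares the number of remaining columns with the matrix rank for both designs. The names `patients`, `coldata` and `rank_summary` are made up for this illustration, and I am assuming `zero` is the reference level for timepoint while the other factors use R's default alphabetical levels.

```r
# Patient-level table rebuilt from the colData shown above; each row is one
# patient, who contributes a "pre" (zero) and an "on" (fiftysix) sample.
patients <- data.frame(
  id       = c("01-0603", "03-0601", "03-0606", "04-0605", "04-0632",
               "05-0614", "05-0621", "05-0624", "08-0631"),
  PID      = c("P01", "P01", "P02", "P03", "P02", "P04", "P03", "P05", "P04"),
  batch    = c("one", "one", "one", "one", "three",
               "three", "three", "three", "three"),
  response = c("PD", "PR", "PR", "PR", "PD", "PR", "PD", "PR", "PD"),
  stringsAsFactors = FALSE
)

# Expand to one row per sample (two timepoints per patient).
coldata <- patients[rep(seq_len(nrow(patients)), each = 2), ]
coldata$timepoint <- rep(c("fiftysix", "zero"), times = nrow(patients))
rownames(coldata) <- paste0(
  coldata$id, ifelse(coldata$timepoint == "zero", "_pre", "_on")
)
coldata$id <- NULL

# Assumption: "zero" is the reference timepoint; other factors keep the
# default alphabetical level order.
coldata$PID       <- factor(coldata$PID)
coldata$batch     <- factor(coldata$batch)
coldata$response  <- factor(coldata$response)
coldata$timepoint <- factor(coldata$timepoint, levels = c("zero", "fiftysix"))

# Hypothetical helper: build the model matrix, remove columns that are all
# zero, and report how many columns remain alongside the matrix rank.
rank_summary <- function(design, data) {
  mm <- model.matrix(design, data)
  mm <- mm[, colSums(mm != 0) > 0, drop = FALSE]
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

without_batch <- rank_summary(
  ~ response + response:PID + response:timepoint, coldata
)
with_batch <- rank_summary(
  ~ response + response:PID + response:timepoint + batch, coldata
)

print(rbind(without_batch, with_batch))
```

A design whose rank is smaller than its column count after this clean-up is one that would still be rejected by the full-rank check, so the printed comparison shows whether removing zero columns alone is enough for each formula.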
I realize that my samples aren't well balanced between batches (only one PD patient in batch one), but I was under the impression that I could still include batch in the design formula in this case. Could you please tell me what I'm missing here, and whether there is a way to incorporate batch into my analysis?