I am working with the complex experimental design shown below and have not been able to build an appropriate design matrix. I looked through the DESeq2 manual and this blog and tried several designs, but every time I get the error "Design matrix not of full rank".
Patients highlighted in orange are unpaired (i.e. they have only pre-treatment data). I also introduced an ind.n column as suggested by the DESeq2 manual, but it may not be necessary in the design.
I am interested in DE testing for:
1. Resistant vs Sensitive, accounting for the patient effect
2. Post vs Pre, accounting for the patient effect (all samples)
3. Post vs Pre, accounting for the patient effect (within the Resistant group)
4. Post vs Pre, accounting for the patient effect (within the Sensitive group)
Can you please suggest the best possible design matrix?
Also, a very naive question: I have seen design examples starting with "~1" and "~0 +". What is the difference between the two? I understand that (1) the dependent and independent variables go on either side of the ~ symbol and (2) interactions are written with the : symbol, but I am unsure about the "~1" and "~0 +" designs.
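For reference, the intercept question can be illustrated with a toy design matrix. The sketch below is plain Python rather than R, uses made-up sample labels (not the data in this post), and only mimics the kind of columns a model-matrix builder produces: with an intercept (what "~1 + Treatment" or simply "~Treatment" asks for), the first column is all ones and the remaining columns are differences from the reference level; with "~0 + Treatment", the intercept is dropped and each level gets its own column. A design of "~1" on its own fits only the intercept.

```python
# Toy illustration only: hypothetical labels, not the poster's samples.
treatment = ["pre", "post", "pre", "post"]
levels = ["pre", "post"]  # "pre" is treated as the reference level


def with_intercept(values, levels):
    """Columns: intercept, then one indicator per non-reference level."""
    return [[1] + [int(v == lev) for lev in levels[1:]] for v in values]


def without_intercept(values, levels):
    """Columns: one indicator per level, no intercept (cell-means coding)."""
    return [[int(v == lev) for lev in levels] for v in values]


design_1 = with_intercept(treatment, levels)     # like "~1 + Treatment"
design_0 = without_intercept(treatment, levels)  # like "~0 + Treatment"

for name, design in (("with intercept", design_1), ("no intercept", design_0)):
    print(name)
    for row in design:
        print("  ", row)
```

Both matrices describe the same fitted model; only the meaning of the coefficients changes. In the first, the second coefficient is already the post vs pre difference, while in the second, each coefficient is a group mean and the difference has to be requested as a contrast.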
What he said. For limma, use `duplicateCorrelation` while blocking on `Patient`, and use a one-way layout in your design matrix with a combined factor containing `Response` and `Treatment`. For edgeR, you would need to do different things depending on your contrast:
- For comparison 1, subset to the `pre` samples and then use a one-way layout on `Response` to test for differences within `pre`. Repeat for `post`.
- For comparisons 2-4, use `~Patient + Response:Treatment` (you will probably need to drop a few terms containing `Treatmentpre` in their names to get to full rank). This will give you two terms at the right of the matrix, representing the post/pre effect in each response type. Test that the average of these two terms is non-zero.

In 2-4, the unpaired samples effectively don't contribute anything to the analysis (their only effect is to contribute to the calculation of the average log-CPM), so you might as well drop them.
Dear Michael and Aaron,
Thank you for the responses. I am starting with DESeq2 and will work with limma next.
I removed the unpaired samples and was then able to get it running correctly. I am posting the code below as it may help someone.
Can you please confirm that my interpretations of the comparisons below are correct, as deduced from the resultsNames?
Next, I also want to compare Post vs Pre while accounting for the patient effect. I think I need to build another model matrix by adding a new pt.n column numbered according to the Treatment. Is this correct?
Thank you again for the help.
Rather than continuing to send alerts to the limma and edgeR maintainers, for additional questions specific to DESeq2 could you make a new post?