I have searched a lot about this, so apologies if I have missed an existing solution.
I am trying to run a paired analysis in DESeq2, i.e. a design with paired samples.
I have a dataset with two groups (these are my batches). The groups have different numbers of subjects, each subject belongs to only one group, and each subject has one or two samples (so one or both conditions). Note that my counts are estimated with kallisto. Here is a toy example of what my colData looks like:
     sample batch condition subject
1   sample1     1         A      S1
2   sample2     1         B      S1
3   sample3     1         A      S2
4   sample4     1         B      S2
5   sample5     1         A      S3
6   sample6     1         B      S3
7   sample7     1         A      S4
8   sample8     1         B      S4
9   sample9     1         A      S5
10 sample10     1         B      S5
11 sample11     2         A      S6
12 sample12     2         B      S6
13 sample13     2         A      S7
14 sample14     2         B      S7
15 sample15     2         A      S8
In this example, batch == 1 has 5 subjects with both conditions per subject, while batch == 2 has 3 subjects, one of which has only one condition. To simplify, I kept only the subjects that have both conditions, so I filtered out sample15.
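A minimal sketch of that filtering step in base R, to make the setup reproducible. The data frame is rebuilt from the toy colData above; `sample.summary` is a hypothetical name for the unfiltered table, and the `tapply`-based filter is only one possible way to drop unpaired subjects.

```r
# Toy colData as in the example above (sample.summary is an illustrative name)
sample.summary <- data.frame(
  sample    = paste0("sample", 1:15),
  batch     = factor(c(rep(1, 10), rep(2, 5))),
  condition = factor(c(rep(c("A", "B"), 7), "A")),
  subject   = factor(paste0("S", c(rep(1:7, each = 2), 8)),
                     levels = paste0("S", 1:8))
)

# Count the number of distinct conditions observed for each subject
n.cond <- tapply(sample.summary$condition, sample.summary$subject,
                 function(x) length(unique(x)))

# Keep only subjects with both conditions, then drop the unused factor levels
paired <- names(n.cond)[n.cond == 2]
sample.summary.balanced <- droplevels(
  sample.summary[sample.summary$subject %in% paired, ]
)
```

Calling `droplevels()` matters here: otherwise the removed subject would remain as an empty factor level and add an all-zero column to the model matrix.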
My goal is to test the condition effect while controlling for subject effects.
So initially I thought my model would be ~ batch + subject + condition, and that the results name to look at for the condition effect (after controlling for batch and subject effects) would be 'condition_B_vs_A'. However, this design leads to the "Model matrix not full rank" error:
dds <- DESeqDataSetFromMatrix(countData = counts.mat,
                              colData = sample.summary.balanced,
                              design = ~ batch + subject + condition)
dds <- DESeq(dds)
Error in checkFullRank(modelMatrix) :
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.
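The problem is the linear dependence between batch and subject: each subject belongs to exactly one batch, so the batch column of the model matrix is a linear combination of the subject columns. As far as I understand, the examples of linear dependence in the vignette do not exactly match my case.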
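The dependence can be checked without DESeq2 by building the model matrix directly. This is a small sketch in base R on the balanced toy data (14 samples, 7 subjects); `coldata`, `mm.full` and `mm.paired` are illustrative names. It only checks whether the coefficients are estimable, not whether a given design is the right one for the data.

```r
# Balanced toy colData: subjects S1-S5 in batch 1, S6-S7 in batch 2
coldata <- data.frame(
  batch     = factor(c(rep(1, 10), rep(2, 4))),
  subject   = factor(paste0("S", rep(1:7, each = 2)),
                     levels = paste0("S", 1:7)),
  condition = factor(rep(c("A", "B"), 7))
)

# Design from the failing call: compare the number of columns with the rank
mm.full <- model.matrix(~ batch + subject + condition, data = coldata)
ncol(mm.full)       # coefficients requested by the design
qr(mm.full)$rank    # coefficients that can actually be estimated

# The batch indicator equals the sum of the indicators of the batch-2 subjects
all(mm.full[, "batch2"] ==
      mm.full[, "subjectS6"] + mm.full[, "subjectS7"])

# Same check for the design without the batch term
mm.paired <- model.matrix(~ subject + condition, data = coldata)
qr(mm.paired)$rank == ncol(mm.paired)
```

If the rank is smaller than the number of columns, at least one coefficient cannot be separated from the others, which is the condition `checkFullRank` reports.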
I have tried a few ideas motivated by conversations here, but nothing has worked. As a last resort, I applied batch correction first, converted the corrected values to positive integers, and then ran DESeq2 with ~ subject + condition, which gave zero DEGs.
By the way, if I drop the paired design and just use ~ batch + condition, the model works nicely. But I would like to take advantage of the fact that I have both conditions per subject.
Any insight would be greatly appreciated!