I want to make sure I am doing things correctly in DESeq2.
Below is my design. It comprises 33 samples across 9 conditions. Each condition is associated with other factors, such as age (old or young), and the 9 conditions can also be grouped into types (queen, foundress, guard, worker, and gyne).
I am interested in pairwise differences between the 9 conditions, as well as differences between ages or states. However, I am unsure whether I should take into account the extra information associated with the conditions (age, state). I thought I would have to, but given the diversity of conditions, I am beginning to wonder whether it is necessary. Could anyone let me know? Also, what would be the next step in accounting for everything in the design formula itself?
expt_design <- data.frame(
  samples = colnames(total_counts),
  condition = c("AB", "AB", "AB",
                "FN", "FN", "FN",
                "FB", "FB", "FB",
                "GD", "GD", "GD", "GD",
                "FD", "FD", "FD", "FD",
                "FM", "FM", "FM", "FM", "FM",
                "GM", "GM", "GM", "GM", "GM",
                "MOM_MB", "MOM_MB", "MOM_MB",
                "D_MB", "D_MB", "D_MB"),
  age = c("old", "old", "old",
          "old", "old", "old",
          "old", "old", "old",
          "young", "young", "young", "young",
          "young", "young", "young", "young",
          "old", "old", "old", "old", "old",
          "old", "old", "old", "old", "old",
          "old", "old", "old",
          "young", "young", "young"),
  type = c("queen", "queen", "queen",
           "foundress", "foundress", "foundress",
           "queen", "queen", "queen",
           "guard", "guard", "guard", "guard",
           "worker", "worker", "worker", "worker",
           "queen", "queen", "queen", "queen", "queen",
           "queen", "queen", "queen", "queen", "queen",
           "queen", "queen", "queen",
           "gyne", "gyne", "gyne")
)
###
dds <- DESeqDataSetFromMatrix(
  countData = total_counts,
  colData = expt_design,
  design = ~ condition + ? ? ?
)
Thanks,
Mike
So I receive this error when using ~ age + type + condition as the design:
Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
One or more variables or interaction terms in the design formula are linear
combinations of the others and must be removed.
Please read the vignette section 'Model matrix not full rank':
vignette('DESeq2')
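The rank problem reported in this error can be checked on the sample table alone, without DESeq2. The following is a minimal sketch in base R: it rebuilds the table from the post (the sample names are placeholders, since total_counts is not shown) and compares the rank of each model matrix with its number of columns, which is roughly the comparison the error message describes.

```r
# Sample table rebuilt from the post. The sample names are hypothetical
# placeholders because total_counts is not available here.
condition <- rep(c("AB", "FN", "FB", "GD", "FD", "FM", "GM", "MOM_MB", "D_MB"),
                 times = c(3, 3, 3, 4, 4, 5, 5, 3, 3))
age <- rep(c("old", "young", "old", "young"), times = c(9, 8, 13, 3))
type <- rep(c("queen", "foundress", "queen", "guard", "worker", "queen", "gyne"),
            times = c(3, 3, 3, 4, 4, 13, 3))
expt_design <- data.frame(
  samples = paste0("S", seq_along(condition)),
  condition = factor(condition),
  age = factor(age),
  type = factor(type)
)

# Design that triggered the error: age and type on top of condition.
mm_full <- model.matrix(~ age + type + condition, data = expt_design)
rank_full <- qr(mm_full)$rank

# Design with condition only.
mm_cond <- model.matrix(~ condition, data = expt_design)
rank_cond <- qr(mm_cond)$rank

# A design is full rank when the rank equals the number of columns.
cat("age + type + condition:", rank_full, "of", ncol(mm_full), "columns\n")
cat("condition only:        ", rank_cond, "of", ncol(mm_cond), "columns\n")
```

If the first design reports a rank lower than its column count while the second does not, the extra age and type columns add no information beyond what condition already provides.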
I think I understand the problem, and I think it is fundamental to my design. Can I not include age or type, since they basically describe the condition as well? Can I only do contrasts between conditions, and between combinations of conditions?
Thanks
I didn't compute the confounding in the design above, but yes: if the variables are confounded, you can only add variables to the design that are "linearly independent" of the ones already there. Basically, this means variables that separate the samples in ways other than the variables already in the design do.
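As an illustration of the point about confounding, here is a rough sketch in base R. It first checks whether each condition falls into exactly one age group, then builds a numeric contrast that compares the old conditions with the young ones, as one way to express "combinations of conditions against each other". It assumes a design of ~ 0 + condition and gives every condition equal weight within its age group; both are choices made for this example, not something stated in the thread.

```r
# Condition and age labels as given in the original post.
condition <- factor(rep(
  c("AB", "FN", "FB", "GD", "FD", "FM", "GM", "MOM_MB", "D_MB"),
  times = c(3, 3, 3, 4, 4, 5, 5, 3, 3)
))
age <- rep(c("old", "young", "old", "young"), times = c(9, 8, 13, 3))

# Cross-tabulate: if every condition has samples in only one age column,
# age is fully determined by condition (confounded with it).
tab <- table(condition, age)
nested <- all(rowSums(tab > 0) == 1)
print(tab)

# One coefficient per condition (no intercept), in factor-level order.
mm <- model.matrix(~ 0 + condition)

# Age group of each condition level, taken from its first sample.
age_of_condition <- age[match(levels(condition), condition)]
is_old <- age_of_condition == "old"

# Contrast: mean of the old conditions minus mean of the young conditions.
contrast_age <- ifelse(is_old, 1 / sum(is_old), -1 / sum(!is_old))
names(contrast_age) <- colnames(mm)
print(round(contrast_age, 3))

# Assumption: with a DESeqDataSet `dds` built with design = ~ 0 + condition
# and run through DESeq(), comparisons could then be requested along these lines:
#   results(dds, contrast = c("condition", "GD", "FD"))   # pairwise
#   results(dds, contrast = unname(contrast_age))         # old vs young
# The numeric vector has to follow the order of resultsNames(dds).
```

Weighting each condition equally means the old-versus-young comparison is an average over condition means rather than over samples; a different weighting may suit other questions better.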