Question: DESeq2 Model matrix not full rank error despite fully-crossed experimental design
0
gravatar for maherr
22 months ago by
maherr0
maherr0 wrote:

I have a phyloseq class with an otu_table, sample_data, tax_table, and phy_tree. I want to use the formula:

dds_int <- phyloseq_to_deseq2(qd, ~ temp:nutrient:corallivory)

When I run this, I get the error:

Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

 

However, after checking the model matrix vignette, I cannot find the issue. The three factors nutrient, temp, and corallivory are fully-crossed so there is no nesting in the design. There is also no co-linearity and no samples are missing. Nutrients has 3 levels, temp has 2 and corallivory has 2 making 12 combinations. Not all of the combinations have the same number of replicates (n of 4-6). But this should not affect the model. Am I missing something?

         X.SampleID tank nutrient    temp corallivory colony

m.ch.15     m.ch.15    D  Control Control     Control     C5

m.ch.11     m.ch.11    D  Control Control     Control     C1

m.ch.18     m.ch.18    L  Control Control     Control     C8

m.ch.17     m.ch.17    L  Control Control     Control     C7

m.ch.14     m.ch.14    D  Control Control     Control     C4

m.ch.99     m.ch.99    K     NH4+ Control     Control     C9

m.ch.95     m.ch.95    E     NH4+ Control     Control     C5

m.ch.91     m.ch.91    E     NH4+ Control     Control     C1

m.ch.97     m.ch.97    K     NH4+ Control     Control     C7

m.ch.94     m.ch.94    E     NH4+ Control     Control     C4

m.ch.98     m.ch.98    K     NH4+ Control     Control     C8

m.ch.55     m.ch.55    A     NO3- Control     Control     C5

m.ch.51     m.ch.51    A     NO3- Control     Control     C1

m.ch.57     m.ch.57    H     NO3- Control     Control     C7

m.ch.59     m.ch.59    H     NO3- Control     Control     C9

m.ch.58     m.ch.58    H     NO3- Control     Control     C8

m.ch.8       m.ch.8    L  Control Control     Scarred     C8

m.ch.7       m.ch.7    L  Control Control     Scarred     C7

m.ch.5       m.ch.5    D  Control Control     Scarred     C5

m.ch.4       m.ch.4    D  Control Control     Scarred     C4

m.ch.9       m.ch.9    L  Control Control     Scarred     C9

m.ch.1       m.ch.1    D  Control Control     Scarred     C1

m.ch.88     m.ch.88    K     NH4+ Control     Scarred     C8

m.ch.81     m.ch.81    E     NH4+ Control     Scarred     C1

m.ch.84     m.ch.84    E     NH4+ Control     Scarred     C4

m.ch.89     m.ch.89    K     NH4+ Control     Scarred     C9

m.ch.85     m.ch.85    E     NH4+ Control     Scarred     C5

m.ch.87     m.ch.87    K     NH4+ Control     Scarred     C7

m.ch.41     m.ch.41    A     NO3- Control     Scarred     C1

m.ch.44     m.ch.44    A     NO3- Control     Scarred     C4

m.ch.49     m.ch.49    H     NO3- Control     Scarred     C9

m.ch.48     m.ch.48    H     NO3- Control     Scarred     C8

m.ch.45     m.ch.45    A     NO3- Control     Scarred     C5

m.ch.47     m.ch.47    H     NO3- Control     Scarred     C7

m.ch.34     m.ch.34    F  Control    high     Control     C4

m.ch.31     m.ch.31    F  Control    high     Control     C1

m.ch.38     m.ch.38    G  Control    high     Control     C8

m.ch.37     m.ch.37    G  Control    high     Control     C7

m.ch.114   m.ch.114    C     NH4+    high     Control     C4

m.ch.119   m.ch.119    J     NH4+    high     Control     C9

m.ch.115   m.ch.115    C     NH4+    high     Control     C5

m.ch.118   m.ch.118    J     NH4+    high     Control     C8

m.ch.75     m.ch.75    B     NO3-    high     Control     C5

m.ch.74     m.ch.74    B     NO3-    high     Control     C4

m.ch.77     m.ch.77    I     NO3-    high     Control     C7

m.ch.79     m.ch.79    I     NO3-    high     Control     C9

m.ch.78     m.ch.78    I     NO3-    high     Control     C8

m.ch.24     m.ch.24    F  Control    high     Scarred     C4

m.ch.25     m.ch.25    F  Control    high     Scarred     C5

m.ch.28     m.ch.28    G  Control    high     Scarred     C8

m.ch.29     m.ch.29    G  Control    high     Scarred     C9

m.ch.27     m.ch.27    G  Control    high     Scarred     C7

m.ch.109   m.ch.109    J     NH4+    high     Scarred     C9

m.ch.105   m.ch.105    C     NH4+    high     Scarred     C5

m.ch.101   m.ch.101    C     NH4+    high     Scarred     C1

m.ch.104   m.ch.104    C     NH4+    high     Scarred     C4

m.ch.108   m.ch.108    J     NH4+    high     Scarred     C8

m.ch.61     m.ch.61    B     NO3-    high     Scarred     C1

m.ch.69     m.ch.69    I     NO3-    high     Scarred     C9

m.ch.64     m.ch.64    B     NO3-    high     Scarred     C4

m.ch.67     m.ch.67    I     NO3-    high     Scarred     C7

m.ch.65     m.ch.65    B     NO3-    high     Scarred     C5
ADD COMMENTlink modified 22 months ago by Michael Love26k • written 22 months ago by maherr0
Answer: DESeq2 Model matrix not full rank error despite fully-crossed experimental desig
0
gravatar for Michael Love
22 months ago by
Michael Love26k
United States
Michael Love26k wrote:

Here you can use a simple example data frame to see what the design matrix looks like:

df <- data.frame(lapply(expand.grid(x=0:2,y=0:1,z=0:1), factor))
model.matrix(~x:y:z, df)

Stripping off the names:

> unname(model.matrix(~x:y:z, df))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
 [1,]    1    1    0    0    0    0    0    0    0     0     0     0     0
 [2,]    1    0    1    0    0    0    0    0    0     0     0     0     0
 [3,]    1    0    0    1    0    0    0    0    0     0     0     0     0
 [4,]    1    0    0    0    1    0    0    0    0     0     0     0     0
 [5,]    1    0    0    0    0    1    0    0    0     0     0     0     0
 [6,]    1    0    0    0    0    0    1    0    0     0     0     0     0
 [7,]    1    0    0    0    0    0    0    1    0     0     0     0     0
 [8,]    1    0    0    0    0    0    0    0    1     0     0     0     0
 [9,]    1    0    0    0    0    0    0    0    0     1     0     0     0
[10,]    1    0    0    0    0    0    0    0    0     0     1     0     0
[11,]    1    0    0    0    0    0    0    0    0     0     0     1     0
[12,]    1    0    0    0    0    0    0    0    0     0     0     0     1

Try instead ~x*y*z.

ADD COMMENTlink written 22 months ago by Michael Love26k

Thank you Michael!

I had one more questions. When I run DESeq with the formula ~ temp*nutrient*corallivory I get the following results:

> resultsNames(dds_test_int)
 [1] "Intercept"                                "temp_high_vs_Control"                    
 [3] "nutrient_NH4._vs_Control"                 "nutrient_NO3._vs_Control"                
 [5] "corallivory_Scarred_vs_Control"           "temphigh.nutrientNH4."                   
 [7] "temphigh.nutrientNO3."                    "temphigh.corallivoryScarred"             
 [9] "nutrientNH4..corallivoryScarred"          "nutrientNO3..corallivoryScarred"         
[11] "temphigh.nutrientNH4..corallivoryScarred" "temphigh.nutrientNO3..corallivoryScarred"

While all the single factors are compared to the Control (i.e. temp_high_vs_Control), what are the interactions (temphigh.corallivoryScarred) being compared to? In other words, if i extract these results,

res_int = results(dds_test_int, name = "temphigh.corallivoryScarred", cooksCutoff = FALSE)

are the upregulated taxa (OTUs) upregulated compared to Controls (temp_Control, corallivory_Control) or to the rest of the combinations of temp/corallivory (temp_Control, corallivory_Control and temp_Control, corallivory_Scarred and temp_high, corallivory_Control)? My understanding is that all results are compared to the base level (temp_Control, corallivory_Control).

ADD REPLYlink written 21 months ago by maherr0

This is complex design, and you will need to understand the meaning of all the interaction terms. I don't have time to provide long form statistical support on the site right now, only software support for my packages. Of course I try to answer as much as I can in my off hours, but for this design you should really find a statistical collaborator who can interpret all the coefficients and help you build contrasts of interest. There is nothing specific to DESeq2 about the three-way interaction design and the meanings of the coefficients produced.

ADD REPLYlink written 21 months ago by Michael Love26k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 228 users visited in the last hour