DESeq2 Model matrix not full rank error despite fully-crossed experimental design
1
0
Entering edit mode
maherr • 0
@maherr-14877
Last seen 6.2 years ago

I have a phyloseq class with an otu_table, sample_data, tax_table, and phy_tree. I want to use the formula:

dds_int <- phyloseq_to_deseq2(qd, ~ temp:nutrient:corallivory)

When I run this, I get the error:

Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

 

However, after checking the model matrix vignette, I cannot find the issue. The three factors nutrient, temp, and corallivory are fully-crossed so there is no nesting in the design. There is also no co-linearity and no samples are missing. Nutrients has 3 levels, temp has 2 and corallivory has 2 making 12 combinations. Not all of the combinations have the same number of replicates (n of 4-6). But this should not affect the model. Am I missing something?

         X.SampleID tank nutrient    temp corallivory colony

m.ch.15     m.ch.15    D  Control Control     Control     C5

m.ch.11     m.ch.11    D  Control Control     Control     C1

m.ch.18     m.ch.18    L  Control Control     Control     C8

m.ch.17     m.ch.17    L  Control Control     Control     C7

m.ch.14     m.ch.14    D  Control Control     Control     C4

m.ch.99     m.ch.99    K     NH4+ Control     Control     C9

m.ch.95     m.ch.95    E     NH4+ Control     Control     C5

m.ch.91     m.ch.91    E     NH4+ Control     Control     C1

m.ch.97     m.ch.97    K     NH4+ Control     Control     C7

m.ch.94     m.ch.94    E     NH4+ Control     Control     C4

m.ch.98     m.ch.98    K     NH4+ Control     Control     C8

m.ch.55     m.ch.55    A     NO3- Control     Control     C5

m.ch.51     m.ch.51    A     NO3- Control     Control     C1

m.ch.57     m.ch.57    H     NO3- Control     Control     C7

m.ch.59     m.ch.59    H     NO3- Control     Control     C9

m.ch.58     m.ch.58    H     NO3- Control     Control     C8

m.ch.8       m.ch.8    L  Control Control     Scarred     C8

m.ch.7       m.ch.7    L  Control Control     Scarred     C7

m.ch.5       m.ch.5    D  Control Control     Scarred     C5

m.ch.4       m.ch.4    D  Control Control     Scarred     C4

m.ch.9       m.ch.9    L  Control Control     Scarred     C9

m.ch.1       m.ch.1    D  Control Control     Scarred     C1

m.ch.88     m.ch.88    K     NH4+ Control     Scarred     C8

m.ch.81     m.ch.81    E     NH4+ Control     Scarred     C1

m.ch.84     m.ch.84    E     NH4+ Control     Scarred     C4

m.ch.89     m.ch.89    K     NH4+ Control     Scarred     C9

m.ch.85     m.ch.85    E     NH4+ Control     Scarred     C5

m.ch.87     m.ch.87    K     NH4+ Control     Scarred     C7

m.ch.41     m.ch.41    A     NO3- Control     Scarred     C1

m.ch.44     m.ch.44    A     NO3- Control     Scarred     C4

m.ch.49     m.ch.49    H     NO3- Control     Scarred     C9

m.ch.48     m.ch.48    H     NO3- Control     Scarred     C8

m.ch.45     m.ch.45    A     NO3- Control     Scarred     C5

m.ch.47     m.ch.47    H     NO3- Control     Scarred     C7

m.ch.34     m.ch.34    F  Control    high     Control     C4

m.ch.31     m.ch.31    F  Control    high     Control     C1

m.ch.38     m.ch.38    G  Control    high     Control     C8

m.ch.37     m.ch.37    G  Control    high     Control     C7

m.ch.114   m.ch.114    C     NH4+    high     Control     C4

m.ch.119   m.ch.119    J     NH4+    high     Control     C9

m.ch.115   m.ch.115    C     NH4+    high     Control     C5

m.ch.118   m.ch.118    J     NH4+    high     Control     C8

m.ch.75     m.ch.75    B     NO3-    high     Control     C5

m.ch.74     m.ch.74    B     NO3-    high     Control     C4

m.ch.77     m.ch.77    I     NO3-    high     Control     C7

m.ch.79     m.ch.79    I     NO3-    high     Control     C9

m.ch.78     m.ch.78    I     NO3-    high     Control     C8

m.ch.24     m.ch.24    F  Control    high     Scarred     C4

m.ch.25     m.ch.25    F  Control    high     Scarred     C5

m.ch.28     m.ch.28    G  Control    high     Scarred     C8

m.ch.29     m.ch.29    G  Control    high     Scarred     C9

m.ch.27     m.ch.27    G  Control    high     Scarred     C7

m.ch.109   m.ch.109    J     NH4+    high     Scarred     C9

m.ch.105   m.ch.105    C     NH4+    high     Scarred     C5

m.ch.101   m.ch.101    C     NH4+    high     Scarred     C1

m.ch.104   m.ch.104    C     NH4+    high     Scarred     C4

m.ch.108   m.ch.108    J     NH4+    high     Scarred     C8

m.ch.61     m.ch.61    B     NO3-    high     Scarred     C1

m.ch.69     m.ch.69    I     NO3-    high     Scarred     C9

m.ch.64     m.ch.64    B     NO3-    high     Scarred     C4

m.ch.67     m.ch.67    I     NO3-    high     Scarred     C7

m.ch.65     m.ch.65    B     NO3-    high     Scarred     C5
deseq2 multiple factor design • 1.1k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 43 minutes ago
United States

Here you can use a simple example data frame to see what the design matrix looks like:

df <- data.frame(lapply(expand.grid(x=0:2,y=0:1,z=0:1), factor))
model.matrix(~x:y:z, df)

Stripping off the names:

> unname(model.matrix(~x:y:z, df))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
 [1,]    1    1    0    0    0    0    0    0    0     0     0     0     0
 [2,]    1    0    1    0    0    0    0    0    0     0     0     0     0
 [3,]    1    0    0    1    0    0    0    0    0     0     0     0     0
 [4,]    1    0    0    0    1    0    0    0    0     0     0     0     0
 [5,]    1    0    0    0    0    1    0    0    0     0     0     0     0
 [6,]    1    0    0    0    0    0    1    0    0     0     0     0     0
 [7,]    1    0    0    0    0    0    0    1    0     0     0     0     0
 [8,]    1    0    0    0    0    0    0    0    1     0     0     0     0
 [9,]    1    0    0    0    0    0    0    0    0     1     0     0     0
[10,]    1    0    0    0    0    0    0    0    0     0     1     0     0
[11,]    1    0    0    0    0    0    0    0    0     0     0     1     0
[12,]    1    0    0    0    0    0    0    0    0     0     0     0     1

Try instead ~x*y*z.

ADD COMMENT
0
Entering edit mode

Thank you Michael!

I had one more questions. When I run DESeq with the formula ~ temp*nutrient*corallivory I get the following results:

> resultsNames(dds_test_int)
 [1] "Intercept"                                "temp_high_vs_Control"                    
 [3] "nutrient_NH4._vs_Control"                 "nutrient_NO3._vs_Control"                
 [5] "corallivory_Scarred_vs_Control"           "temphigh.nutrientNH4."                   
 [7] "temphigh.nutrientNO3."                    "temphigh.corallivoryScarred"             
 [9] "nutrientNH4..corallivoryScarred"          "nutrientNO3..corallivoryScarred"         
[11] "temphigh.nutrientNH4..corallivoryScarred" "temphigh.nutrientNO3..corallivoryScarred"

While all the single factors are compared to the Control (i.e. temp_high_vs_Control), what are the interactions (temphigh.corallivoryScarred) being compared to? In other words, if i extract these results,

res_int = results(dds_test_int, name = "temphigh.corallivoryScarred", cooksCutoff = FALSE)

are the upregulated taxa (OTUs) upregulated compared to Controls (temp_Control, corallivory_Control) or to the rest of the combinations of temp/corallivory (temp_Control, corallivory_Control and temp_Control, corallivory_Scarred and temp_high, corallivory_Control)? My understanding is that all results are compared to the base level (temp_Control, corallivory_Control).

ADD REPLY
0
Entering edit mode

This is complex design, and you will need to understand the meaning of all the interaction terms. I don't have time to provide long form statistical support on the site right now, only software support for my packages. Of course I try to answer as much as I can in my off hours, but for this design you should really find a statistical collaborator who can interpret all the coefficients and help you build contrasts of interest. There is nothing specific to DESeq2 about the three-way interaction design and the meanings of the coefficients produced.

ADD REPLY

Login before adding your answer.

Traffic: 640 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6