Question: DESeq2: how to construct the design matrix?
gravatar for taf
4.1 years ago by
taf0 wrote:


I have some questions concerning the construction of design matrices within the DESeq2 package.\\
I constructed the following design matrix, which contains 3 different treatments (T1,T2,T3) in duplicates and 4 controls (ctrl).


designMat <- rbind( c(1,0,0,0), c(1,0,0,0), c(0,1,0,0), c(0,1,0,0), c(0,0,1,0), c(0,0,1,0), c(0,0,0,1), c(0,0,0,1), c(0,0,0,1), c(0,0,0,1) ) 
rownames(designMat) <- 1:10 
colnames(designMat) <- c("T1","T2","T3","ctrl") 


I know that it is easily possible to convert it into a "colData" data frame, which DESeq2 "understands".:

colData <- mat.or.vec(nrow(designMat), 1)
colData <-
colnames(colData) <- "condition"
for(i in 1:nrow(designMat))
    colData[i,"condition"] <- colnames(designMat)[which(designMat[i,]==1)]


And, indeed, if I apply the DESeq work flow to some sampled data and I have a look at the model matrix of the DESeq2 object I get "my" design matrix.



lambda0 <- 1000
T1 <- matrix(rpois(2000,lambda=lambda0), ncol=2)
T2 <- matrix(rpois(2000,lambda=2*lambda0), ncol=2)
T3 <-  matrix(rpois(2000,lambda=4*lambda0), ncol=2)
ctrl <-  matrix(rpois(4000,lambda=0.5*lambda0), ncol=4)

data <- cbind(T1,T2,T3,ctrl)
colnames(data) <- c(rep("T1",2),rep("T2",2),rep("T3",2),rep("ctrl",4))

## create DESeq data sets and check the model matrices
dds1 <- DESeqDataSetFromMatrix(countData = data, colData =colData , design =  ~ condition  ) 
dds1 <- estimateSizeFactors(dds1)
dds1 <- estimateDispersions(dds1)
dds1 <- nbinomWaldTest(dds1)
print(attr(dds1, "modelMatrix"))


With the R package limma I could deal with that, too. And if I would apply model.matrix() function to my scenario I would come up with

print(model.matrix(~T1 +T2 + T3 +ctrl , designMat))

But in DESeq I get an error message, if I try to apply that formula.:

dds2 <- DESeqDataSetFromMatrix(countData = data, colData , design =  ~ T1 +T2 + T3 + ctrl ) 

My Problem is, that my design is more complex than that.
It also contains combinations of treatments T1, T2 and T3.
And I would also like to "use" these combinations within the linear model to improve my estimates through a better estimation of variance (like limma() does).

Is there any possibility to pass the design matrix directly to a DESeq2 object?
Is it possible in general to deal with designs like, e.g.:

print(rbind(designMat.f,c(1,1,0,0), c(0,1,1,0)))

in DESeq2?

Thanks in advance for your help!

Best regards,


R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

[1] C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] DESeq2_1.6.1            RcppArmadillo_0.4.500.0 Rcpp_0.11.3            
[4] GenomicRanges_1.18.1    GenomeInfoDb_1.2.2      IRanges_2.0.0          
[7] S4Vectors_0.4.0         BiocGenerics_0.12.0    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1 BBmisc_1.8           BatchJobs_1.5       
 [4] Biobase_2.26.0       BiocParallel_1.0.0   DBI_0.3.1           
 [7] Formula_1.1-2        Hmisc_3.14-5         MASS_7.3-35         
[10] RColorBrewer_1.0-5   RSQLite_1.0.0        XML_3.98-1.1        
[13] XVector_0.6.0        acepack_1.3-3.3      annotate_1.44.0     
[16] base64enc_0.1-2      brew_1.0-6           checkmate_1.5.0     
[19] cluster_1.15.3       codetools_0.2-9      colorspace_1.2-4    
[22] digest_0.6.4         fail_1.2             foreach_1.4.2       
[25] foreign_0.8-61       genefilter_1.48.1    geneplotter_1.44.0  
[28] ggplot2_1.0.0        grid_3.1.2           gtable_0.1.2        
[31] iterators_1.0.7      lattice_0.20-29      latticeExtra_0.6-26 
[34] locfit_1.5-9.1       munsell_0.4.2        nnet_7.3-8          
[37] plyr_1.8.1           proto_0.3-10         reshape2_1.4        
[40] rpart_4.1-8          scales_0.2.4         sendmailR_1.2-1     
[43] splines_3.1.2        stringr_0.6.2        survival_2.37-7     
[46] tools_3.1.2          xtable_1.7-4       





ADD COMMENTlink modified 4.1 years ago by Michael Love23k • written 4.1 years ago by taf0
Answer: DESeq2: how to construct the design matrix?
gravatar for Michael Love
4.1 years ago by
Michael Love23k
United States
Michael Love23k wrote:

Hi Franziska,

Maybe this solves your problem most easily: in DESeq2 v1.8 which was released this week, we provided support for users to use a custom model matrix / design matrix, which you can provide to the DESeq() function. See the top NEWS item:

You would provide the matrix to the 'full' argument of DESeq(). You need to specify betaPrior=FALSE, though because the routines which calculate the prior use the information in the design formula.


You suggested you want to use this design:

"print(model.matrix(~T1 +T2 + T3 +ctrl , designMat))"

If you wrap designMat in data.frame(), then you get a rank-deficient design matrix. The one you likely would supply to limma also has a + 0, to remove the intercept term. You can use this design with DESeq2 (but you need to turn off the betaPrior as well, because it doesn't make sense to shrink the betas zero in this case -- they do not represent group fold changes anymore.)

But maybe the point above will allow you to work with the design matrices you want to use.

ADD COMMENTlink written 4.1 years ago by Michael Love23k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 313 users visited in the last hour