Question: Use surrogate variables on DESeq2
0
gravatar for aristotele_m
3.9 years ago by
aristotele_m30
Italy
aristotele_m30 wrote:

Hi!

I have this problem. I want to model a surrogate variable on my data. I have two  two type of tumors but the PCA can not  show the separation between that.

I try to use sva as describe in vignette  but I found this error:

 

dat <- counts(dds, normalized=TRUE)
idx <- rowMeans(dat) > 1
dat <- dat[idx,]
mod <- model.matrix(~ condition, colData(dds))
mod0 <- model.matrix(~ 1, colData(dds))
n.sv = num.sv(dat, mod0,method="leek")
svseq <- svaseq(dat, mod, mod0, n.sv=3)
png("Sva1.png")
plot(svseq$sv[,1], svseq$sv[,2], col=dds$condition, pch=16)
dev.off()
n.sv
svseq <- svaseq(dat, mod, mod0, n.sv=8)

This is the error and this is the previous image:

> dat <- counts(dds, normalized=TRUE)
> idx <- rowMeans(dat) > 1
> dat <- dat[idx,]
> mod <- model.matrix(~ condition, colData(dds))
> mod0 <- model.matrix(~ 1, colData(dds))
> n.sv = num.sv(dat, mod0,method="leek")
> svseq <- svaseq(dat, mod, mod0, n.sv=3)
Number of significant surrogate variables is:  3
Iteration (out of 5 ):1  2  3  4  5  
> png("Sva1.png")
> plot(svseq$sv[,1], svseq$sv[,2], col=dds$condition, pch=16)
> dev.off()
RStudioGD
        2
> n.sv
[1] 8
> svseq <- svaseq(dat, mod, mod0, n.sv=8)
Number of significant surrogate variables is:  8
Iteration (out of 5 ):
Error in solve.default(t(mod) %*% mod) :
  system is computationally singular: reciprocal condition number = 5.89573e-19

 session_info(
+ )
Session info ----------------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.2.1 (2015-06-18)
 system   i686, linux-gnu             
 ui       RStudio (0.99.441)          
 language en_US                       
 collate  en_US.UTF-8                 
 tz       <NA>                        

Packages --------------------------------------------------------------------------------------------------------------------------
 package        * version     date       source                           
 acepack          1.3-3.3     2014-11-24 CRAN (R 3.2.0)                   
 annotate       * 1.46.1      2015-08-06 Bioconductor                     
 AnnotationDbi  * 1.30.1      2015-06-05 Bioconductor                     
 Biobase        * 2.28.0      2015-06-05 Bioconductor                     
 BiocGenerics   * 0.14.0      2015-06-05 Bioconductor                     
 BiocParallel     1.2.20      2015-08-17 Bioconductor                     
 BioJob         * 0.1         2015-08-19 local                            
 biomaRt        * 2.24.0      2015-06-05 Bioconductor                     
 bitops           1.0-6       2013-08-17 CRAN (R 3.0.2)                   
 calibrate      * 1.7.2       2013-09-10 CRAN (R 3.2.1)                   
 caTools          1.17.1      2014-09-10 CRAN (R 3.2.0)                   
 cluster          2.0.3       2015-07-21 CRAN (R 3.2.1)                   
 colorspace       1.2-6       2015-03-11 CRAN (R 3.1.3)                   
 curl             0.9.2       2015-08-08 CRAN (R 3.2.1)                   
 DBI            * 0.3.1       2014-09-24 CRAN (R 3.1.1)                   
 DESeq2         * 1.8.1       2015-06-05 Bioconductor                     
 devtools       * 1.8.0       2015-05-09 CRAN (R 3.2.1)                   
 digest           0.6.8       2014-12-31 CRAN (R 3.1.3)                   
 edgeR          * 3.10.2      2015-08-18 Bioconductor                     
 foreign          0.8-66      2015-08-19 CRAN (R 3.2.1)                   
 Formula          1.2-1       2015-04-07 CRAN (R 3.2.0)                   
 futile.logger    1.4.1       2015-04-20 CRAN (R 3.2.0)                   
 futile.options   1.0.0       2010-04-06 CRAN (R 3.2.0)                   
 gdata            2.17.0      2015-07-04 CRAN (R 3.2.1)                   
 genefilter     * 1.50.0      2015-06-05 Bioconductor                     
 geneplotter      1.46.0      2015-06-05 Bioconductor                     
 GenomeInfoDb   * 1.4.2       2015-08-17 Bioconductor                     
 GenomicRanges  * 1.20.5      2015-06-22 Bioconductor                     
 ggplot2        * 1.0.1       2015-03-17 CRAN (R 3.1.3)                   
 git2r            0.11.0      2015-08-12 CRAN (R 3.2.1)                   
 GO.db          * 3.1.2       2015-06-05 Bioconductor                     
 gplots         * 2.17.0      2015-05-02 CRAN (R 3.2.0)                   
 gridExtra        2.0.0       2015-07-14 CRAN (R 3.2.1)                   
 gtable           0.1.2       2012-12-05 CRAN (R 3.1.0)                   
 gtools           3.5.0       2015-05-29 CRAN (R 3.2.0)                   
 Hmisc            3.16-0      2015-04-30 CRAN (R 3.2.0)                   
 IRanges        * 2.2.7       2015-08-17 Bioconductor                     
 KernSmooth       2.23-15     2015-06-29 CRAN (R 3.2.1)                   
 labeling         0.3         2014-08-23 CRAN (R 3.1.1)                   
 lambda.r         1.1.7       2015-03-20 CRAN (R 3.2.0)                   
 lattice          0.20-33     2015-07-14 CRAN (R 3.2.1)                   
 latticeExtra     0.6-26      2013-08-15 CRAN (R 3.2.0)                   
 limma          * 3.24.15     2015-08-17 Bioconductor                     
 locfit           1.5-9.1     2013-04-20 CRAN (R 3.1.0)                   
 magrittr         1.5         2014-11-22 CRAN (R 3.1.3)                   
 MASS           * 7.3-43      2015-07-16 CRAN (R 3.2.1)                   
 Matrix           1.2-2       2015-07-08 CRAN (R 3.2.1)                   
 memoise          0.2.1       2014-04-22 CRAN (R 3.2.1)                   
 mgcv           * 1.8-7       2015-07-23 CRAN (R 3.2.1)                   
 munsell          0.4.2       2013-07-11 CRAN (R 3.1.0)                   
 nlme           * 3.1-122     2015-08-19 CRAN (R 3.2.1)                   
 nnet             7.3-10      2015-06-29 CRAN (R 3.2.1)                   
 Nozzle.R1      * 1.1-1       2015-07-31 local                            
 org.Hs.eg.db   * 3.1.2       2015-06-05 Bioconductor                     
 pheatmap       * 1.0.7       2015-07-02 CRAN (R 3.2.1)                   
 plyr             1.8.3       2015-06-12 CRAN (R 3.2.0)                   
 proto            0.3-10      2012-12-22 CRAN (R 3.1.0)                   
 rafalib        * 1.0.0       2015-08-19 Github (ririzarr/rafalib@99a13e7)
 RColorBrewer   * 1.1-2       2014-12-07 CRAN (R 3.1.2)                   
 Rcpp           * 0.12.0      2015-07-25 CRAN (R 3.2.1)                   
 RcppArmadillo  * 0.5.400.2.0 2015-08-17 CRAN (R 3.2.1)                   
 RCurl            1.95-4.7    2015-06-30 CRAN (R 3.2.1)                   
 reshape2         1.4.1       2014-12-06 CRAN (R 3.1.2)                   
 roxygen2       * 4.1.1       2015-04-15 CRAN (R 3.2.1)                   
 rpart            4.1-10      2015-06-29 CRAN (R 3.2.1)                   
 RSQLite        * 1.0.0       2014-10-25 CRAN (R 3.1.1)                   
 rversions        1.0.2       2015-07-13 CRAN (R 3.2.1)                   
 S4Vectors      * 0.6.3       2015-08-06 Bioconductor                     
 scales           0.2.5       2015-06-12 CRAN (R 3.2.0)                   
 stringi          0.5-5       2015-06-29 CRAN (R 3.2.1)                   
 stringr          1.0.0       2015-04-30 CRAN (R 3.1.3)                   
 survival         2.38-3      2015-07-02 CRAN (R 3.2.1)                   
 sva            * 3.14.0      2015-06-22 Bioconductor                     
 XML            * 3.98-1.3    2015-06-30 CRAN (R 3.2.1)                   
 xml2             0.1.1       2015-06-02 CRAN (R 3.2.1)                   
 xtable           1.7-4       2014-09-12 CRAN (R 3.1.1)                   
 XVector          0.8.0       2015-06-05 Bioconductor 

Any idea what can I do?

ADD COMMENTlink modified 3.1 years ago by szenitha0 • written 3.9 years ago by aristotele_m30
Answer: Use surrogate variables on DESeq2
0
gravatar for Jeff Leek
3.9 years ago by
Jeff Leek560
United States
Jeff Leek560 wrote:

It looks like when you use the estimated number of surrogate variables (8) rather than the number you tried originally (3) the model matrix becomes singular. This often happens for one of two reasons:

(1) The sample size is relatively small so you end up with too many surrogate variables leading to a computationally singular matrix. This appears likely to be the case because based on the PCA plot you have something like a sample size of 9 and 8 surrogate variables are getting estimated. Here you can either just go with a smaller number (like the 3 you chose or some other number based on "elbow" in the scree plot) or you could try method="be" which is often a bit more conservative. 

(2) This probably isn't the reason for your case, but when unmodeled confounders like batch and the tumor groups are highly confounded you can often see that the model matrix is poorly conditioned leading to that error when the surrogate variables are estimated. 

 

Best

Jeff

ADD COMMENTlink written 3.9 years ago by Jeff Leek560
Answer: Use surrogate variables on DESeq2
0
gravatar for szenitha
3.1 years ago by
szenitha0
szenitha0 wrote:

When  the model matrix becomes singular how do we know which of the two reasons to consider? Also should we always consider all the estimted surrogates or the number of surrogates suggested when we use method=be?is there a way to know which surrogates explain most of the variation?

ADD COMMENTlink modified 3.1 years ago • written 3.1 years ago by szenitha0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 316 users visited in the last hour