DESeq2 interacting with agricolae
2
0
Entering edit mode
@mikaellenzstrube-11247
Last seen 8.4 years ago

Hi,

(i know its not best practice to post a question in multiple fora but I was encouraged to post here instead of Biostars: https://www.biostars.org/p/205585/ )

My issue is that im getting an error when running the DESeq-command while having my own custom-made package loaded. It works fine without my own package, but it need it to load the data. After a little digging around the code, I found out that the package 'agricolae', on which my package depends, is the culprit, specifically that the command 

model.matrix( object  = design(dds),  data = data.frame( colData(dds) ) )

behaves differently when agricolae is loaded, causing an error when DESeq calls it.

The following code runs when agricolae is not loaded:

library(MyPackage) 

countsTable <- aBunchOfMyFunctions
Group <- factor(SomeGroupNames)
coldata1 <- data.frame(Group=Group, row.names = someRowNames )

dds <- DESeqDataSetFromMatrix(countData = countsTable, colData = coldata1, design = ~ Group  )
dds <- DESeq(dds)

using pre-existing size factors
estimating dispersions
found already estimated dispersions, replacing these
gene-wise dispersion estimates
Error in model.matrix.formula(design(object), data = colData(object)) : 
  data must be a data.frame

 

Is agricolae redefining a core function? How do i deal with this?

 

SessionInfo:

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] AnalyseBION_0.2            ggplot2_2.1.0              DESeq2_1.10.1              RcppArmadillo_0.7.200.2.0  Rcpp_0.12.6                SummarizedExperiment_1.0.2
 [7] Biobase_2.30.0             GenomicRanges_1.22.4       GenomeInfoDb_1.6.3         IRanges_2.4.8              S4Vectors_0.8.11           BiocGenerics_0.16.1       
[13] BiocInstaller_1.20.3      

loaded via a namespace (and not attached):
 [1] splines_3.3.1        gtools_3.5.0         Formula_1.2-1        sp_1.2-3             latticeExtra_0.6-28  LearnBayes_2.15      RSQLite_1.0.0       
 [8] lattice_0.20-33      chron_2.3-47         RColorBrewer_1.1-2   XVector_0.10.0       naturalsort_0.1.2    colorspace_1.2-6     sandwich_2.3-4      
[15] Matrix_1.2-6         plyr_1.8.4           klaR_0.6-12          XML_3.98-1.4         gmodels_2.16.2       genefilter_1.52.1    zlibbioc_1.16.0     
[22] xtable_1.8-2         mvtnorm_1.0-5        scales_0.4.0         gdata_2.17.0         VennDiagram_1.6.17   BiocParallel_1.4.3   combinat_0.0-8      
[29] annotate_1.48.0      mgcv_1.8-13          TH.data_1.0-7        nnet_7.3-12          agricolae_1.2-4      magrittr_1.5         survival_2.39-5     
[36] deldir_0.1-12        mclust_5.2           maptools_0.8-39      nlme_3.1-128         MASS_7.3-45          gplots_3.0.1         foreign_0.8-66      
[43] beeswarm_0.2.3       vegan_2.4-0          tools_3.3.1          data.table_1.9.6     multcomp_1.4-6       stringr_1.0.0        munsell_0.4.3       
[50] locfit_1.5-9.1       cluster_2.0.4        AnnotationDbi_1.32.3 lambda.r_1.1.9       caTools_1.17.1       futile.logger_1.4.3  grid_3.3.1          
[57] bitops_1.0-6         boot_1.3-18          codetools_0.2-14     gtable_0.2.0         DBI_0.4-1            reshape2_1.4.1       AlgDesign_1.1-7.3   
[64] gridExtra_2.2.1      zoo_1.7-13           Hmisc_3.17-4         spdep_0.6-6          futile.options_1.0.0 KernSmooth_2.23-15   permute_0.9-0       
[71] stringi_1.1.1        geneplotter_1.48.0   rpart_4.1-10         acepack_1.3-3.3      scatterplot3d_0.3-37 coda_0.18-1         
deseq2 software error • 1.8k views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 2 days ago
United States

Can you upgrade to the release version of DESeq2 which is 1.12? This may fix it.

Update [8/7/2016] I've updated the code in DESeq2 v1.12.4 which should fix this conflict:

> library(agricolae)
> dds <- makeExampleDESeqDataSet()
> dds <- DESeq(dds)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> packageVersion("DESeq2")
[1] ‘1.12.4’
ADD COMMENT
0
Entering edit mode

No such luck, unfortunately

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] AnalyseBION_0.2            agricolae_1.2-4            DESeq2_1.12.3              SummarizedExperiment_1.2.3 Biobase_2.32.0             GenomicRanges_1.24.2      
 [7] GenomeInfoDb_1.8.3         IRanges_2.6.1              S4Vectors_0.10.2           BiocGenerics_0.18.0        BiocInstaller_1.22.3      

loaded via a namespace (and not attached):
 [1] splines_3.3.1        gtools_3.5.0         Formula_1.2-1        sp_1.2-3             latticeExtra_0.6-28  LearnBayes_2.15      RSQLite_1.0.0       
 [8] lattice_0.20-33      chron_2.3-47         RColorBrewer_1.1-2   XVector_0.12.1       naturalsort_0.1.2    colorspace_1.2-6     sandwich_2.3-4      
[15] Matrix_1.2-6         plyr_1.8.4           klaR_0.6-12          XML_3.98-1.4         gmodels_2.16.2       genefilter_1.54.2    zlibbioc_1.18.0     
[22] mvtnorm_1.0-5        xtable_1.8-2         scales_0.4.0         gdata_2.17.0         VennDiagram_1.6.17   BiocParallel_1.6.3   combinat_0.0-8      
[29] annotate_1.50.0      mgcv_1.8-13          ggplot2_2.1.0        TH.data_1.0-7        nnet_7.3-12          magrittr_1.5         survival_2.39-5     
[36] deldir_0.1-12        mclust_5.2           maptools_0.8-39      nlme_3.1-128         MASS_7.3-45          gplots_3.0.1         foreign_0.8-66      
[43] beeswarm_0.2.3       vegan_2.4-0          tools_3.3.1          data.table_1.9.6     multcomp_1.4-6       stringr_1.0.0        munsell_0.4.3       
[50] locfit_1.5-9.1       cluster_2.0.4        AnnotationDbi_1.34.4 lambda.r_1.1.9       caTools_1.17.1       futile.logger_1.4.3  grid_3.3.1          
[57] bitops_1.0-6         boot_1.3-18          codetools_0.2-14     gtable_0.2.0         DBI_0.4-1            reshape2_1.4.1       AlgDesign_1.1-7.3   
[64] gridExtra_2.2.1      zoo_1.7-13           Hmisc_3.17-4         spdep_0.6-6          futile.options_1.0.0 KernSmooth_2.23-15   permute_0.9-0       
[71] stringi_1.1.1        Rcpp_0.12.6          geneplotter_1.50.0   rpart_4.1-10         acepack_1.3-3.3      scatterplot3d_0.3-37 coda_0.18-1         
> 
ADD REPLY
0
Entering edit mode
What does agricolae export? Check the NAMESPACE file within the package.
ADD REPLY
0
Entering edit mode

it uses 

exportPattern("^[[:alpha:]]+")

so its not entirely clear.

 

Is there any clever way to list what is actually exported?

ADD REPLY
0
Entering edit mode

hi Mikael, 

Thanks for providing an example where I could track down this bug. I've just pushed version 1.12.4 to release (and a corresponding version to devel) which should solve the error. It will be available in 1-2 days.

ADD REPLY
0
Entering edit mode

Awesome, thanks for the effort.

ADD REPLY
2
Entering edit mode
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…

I think the culprit is here:

> library("DESeq2")
> methods("model.matrix")
[1] model.matrix.Formula* model.matrix.coxph*   model.matrix.default
[4] model.matrix.lm       model.matrix.survreg*
see '?methods' for accessing help and source code
> library("AlgDesign")  ## agricolae depends on this
> methods("model.matrix")
[1] model.matrix.Formula* model.matrix.coxph*   model.matrix.default
[4] model.matrix.formula* model.matrix.lm       model.matrix.survreg*
[7] model.matrix.terms*  

And AlgDesign's model.matrix.formula function explicitly checks whether its second argument inherits from data.frame (type AlgDesign:::model.matrix.formula to check), whereas I believe that before AlgDesign was loaded, model.matrix.default would be called, which simply requires that the second argument behaves like a data.frame (colData is a Bioconductor DataFrame).

A possible fix would be for Mike to change all occurrences of model.matrix in his package into, say, stats:model.matrix.default. Although this seems neither satisfactory nor scalable.

This seems like another instance (C: arrayQualityMetrics does not complete successfully) of global side effects when loading packages and defining and using S3 methods, in a way that collides with R's namespaces. I wonder (i.e., it's knowable, but I don't) whether this was like this always in R, or whether it's a recent feature.

 

 

ADD COMMENT
0
Entering edit mode

I've  already committed a change to devel which wraps the second argument in as.data.frame, and then am about to commit this to release as well. This removes the error. Then the hope is that AlgDesign:::model.matrix.formula behaves the same. I was already specifically importing model.matrix from stats, but then this is not enough apparently?

ADD REPLY
0
Entering edit mode

I think the more robust solution is really to just call explicitly the function you want (e.g. stats::model.matrix.default) rather than letting S3 method dispatch lead you to random functions depending on what other packages a user happens to have attached.
 

ADD REPLY
0
Entering edit mode

That's some solid detective work, I hope the update solves the issue.

ADD REPLY

Login before adding your answer.

Traffic: 593 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6