I am trying to replicate the workflow from library(DirichletMultinomial). I have a large matrix containing counts, count, and a vector of factors, pheno. I made a subset of count into countp based on pheno (division into groups), as the example in the vignette shows:
countp <- count[pheno %in% c("group1", "group2"), ]
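The subsetting step can be sketched on toy data as follows. This is only an illustration under assumptions: the toy values, the third level "group3", and the names keep and phenop are hypothetical and not part of my real data. It assumes the grouping factor should be subset with the same index as the count matrix, so that it has one entry per row of countp.

```r
# Toy data standing in for the real inputs (hypothetical values).
set.seed(1)
count <- matrix(rpois(24, lambda = 5), nrow = 6,
                dimnames = list(paste0("sample", 1:6), paste0("taxon", 1:4)))
pheno <- factor(c("group1", "group2", "group3", "group1", "group2", "group3"))

# Keep only the samples that belong to the two groups of interest.
keep <- pheno %in% c("group1", "group2")
countp <- count[keep, , drop = FALSE]

# Subset the grouping factor with the same index and drop the unused level,
# so that it lines up row by row with countp (phenop is a hypothetical name).
phenop <- droplevels(pheno[keep])

# Sanity check: one group label per remaining sample.
stopifnot(nrow(countp) == length(phenop))
table(phenop)
```

Whether a mismatch between the full pheno and the reduced countp plays any role in the error below is not established here; the sketch only shows one way to keep the two objects aligned.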
After executing the next step, which fits a Dirichlet-multinomial model for each group:
bestgrp <- dmngroup(countp, pheno, k=1:15, simplify = FALSE, verbose=TRUE, .lapply = parallel::mclapply)
the process ran for 8-12 hours and then crashed, reporting:
invalid class “DMNGroup” object: undefined class for slot "elementMetadata" ("DataTable_OR_NULL")
I know for sure that the process ran for several hours before it crashed, because I have the following code
if (exists("bestgrp")) {
  save(bestgrp, file = "NRR_DMM_bestgrp.Rda")
} else {
  write(geterrmessage(), file = "!!!ERROR.txt")
}
and I have the time at which the error message was saved. Since verbose=TRUE does not print anything in parallel multi-core runs, I cannot say more precisely when it crashed. The whole dataset, count, runs smoothly when processed with dmn. I am not a bioinformatician and I am new to R, so, as Google returns NULL results for the error message, I am totally lost. Can anyone help?
Best regards, Marcin
P.S. I have updated R to 4.0.3, but the error persists.
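The error-capture step described above could also be written with tryCatch, so that the message and the time of failure end up in the same file. This is a minimal sketch, not what I actually ran: fit_model and log_file are hypothetical names, the simulated failure stands in for the long-running call, and the files are written to a temporary directory.

```r
# Hypothetical stand-in for the long-running fit; replace with the real call.
fit_model <- function() stop("simulated failure")

# Hypothetical log location (a temporary directory for this sketch).
log_file <- file.path(tempdir(), "dmm_error_log.txt")

result <- tryCatch(
  fit_model(),
  error = function(e) {
    # Write the time the error was caught together with its message.
    msg <- sprintf("[%s] %s",
                   format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                   conditionMessage(e))
    writeLines(msg, log_file)
    NULL  # return NULL so the caller can tell that the fit failed
  }
)

# Save the result only if the fit succeeded.
if (!is.null(result)) {
  save(result, file = file.path(tempdir(), "result.Rda"))
}
```

Compared with checking exists() afterwards, this keeps the error message tied to the call that produced it, and the timestamp is recorded explicitly rather than inferred from the file's modification time.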
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] xtable_1.8-4 lattice_0.20-41 readr_1.4.0 BiocParallel_1.24.1
[5] magrittr_2.0.1 reshape2_1.4.4 survival_3.2-7 ggsci_2.9
[9] dplyr_1.0.2 ggpubr_0.4.0 knitr_1.30 microbiome_1.10.0
[13] ggplot2_3.3.3 phyloseq_1.32.0 DirichletMultinomial_1.30.0 IRanges_2.24.1
[17] S4Vectors_0.28.1 BiocGenerics_0.36.0
loaded via a namespace (and not attached):
[1] Rtsne_0.15 colorspace_2.0-0 ggsignif_0.6.0 ellipsis_0.3.1
[5] rio_0.5.16 XVector_0.28.0 GenomicRanges_1.40.0 rstudioapi_0.13
[9] bit64_4.0.5 AnnotationDbi_1.50.3 codetools_0.2-16 splines_4.0.3
[13] geneplotter_1.66.0 ade4_1.7-16 jsonlite_1.7.2 broom_0.7.3
[17] annotate_1.66.0 cluster_2.1.0 compiler_4.0.3 backports_1.2.1
[21] Matrix_1.2-18 prettyunits_1.1.1 tools_4.0.3 igraph_1.2.6
[25] gtable_0.3.0 glue_1.4.2 GenomeInfoDbData_1.2.3 Rcpp_1.0.5
[29] carData_3.0-4 Biobase_2.48.0 cellranger_1.1.0 vctrs_0.3.6
[33] Biostrings_2.56.0 rhdf5filters_1.2.0 multtest_2.44.0 ape_5.4-1
[37] nlme_3.1-149 iterators_1.0.13 xfun_0.20 stringr_1.4.0
[41] openxlsx_4.2.3 lifecycle_0.2.0 rstatix_0.6.0 XML_3.99-0.5
[45] zlibbioc_1.34.0 MASS_7.3-53 scales_1.1.1 hms_1.0.0
[49] MatrixGenerics_1.2.0 SummarizedExperiment_1.18.2 biomformat_1.16.0 rhdf5_2.34.0
[53] RColorBrewer_1.1-2 curl_4.3 memoise_1.1.0 stringi_1.5.3
[57] RSQLite_2.2.2 genefilter_1.70.0 foreach_1.5.1 permute_0.9-5
[61] zip_2.1.1 GenomeInfoDb_1.24.2 rlang_0.4.10 pkgconfig_2.0.3
[65] matrixStats_0.57.0 bitops_1.0-6 purrr_0.3.4 Rhdf5lib_1.12.0
[69] bit_4.0.4 tidyselect_1.1.0 plyr_1.8.6 DESeq2_1.28.1
[73] R6_2.5.0 generics_0.1.0 DelayedArray_0.16.0 DBI_1.1.0
[77] pillar_1.4.7 haven_2.3.1 foreign_0.8-80 withr_2.3.0
[81] mgcv_1.8-33 abind_1.4-5 RCurl_1.98-1.2 tibble_3.0.4
[85] crayon_1.3.4 car_3.0-10 progress_1.2.2 locfit_1.5-9.4
[89] grid_4.0.3 readxl_1.3.1 data.table_1.13.6 blob_1.2.1
[93] vegan_2.5-7 forcats_0.5.0 digest_0.6.27 tidyr_1.1.2
[97] munsell_0.5.0