I am trying to generate small files to include in my package for regression tests. One of them is a small DESeqDataSet object
(dds_small below, the first 50 features of a complete analysis stored in the object
dds). However, when I save the small object, its size remains very large:
> dds <- readRDS("2018-02-12_all_tissues/dds.rds")
> object.size(dds)
12579016 bytes
> dds_small <- dds[1:50,]
> object.size(dds_small)
111056 bytes
> length(serialize(dds_small, NULL))
45625706
The serialized size of the small object is even larger than the in-memory size reported for the original object! The
design slot seems to be what takes up so much space, as there appears to be an environment attached to it:
> dds_small@design
~(Tissue/Age)/Genotype
<environment: 0x3e64708>
> object.size(dds_small@design)
1344 bytes
> length(serialize(dds_small@design, NULL))
45353218
This environment probably stores a bunch of packages that were in use when the original object was created, because
sessionInfo() (below) reports many loaded packages, although I only ran a
readRDS command in a fresh R session.
As I am familiar neither with environments nor with
DESeqDataSet internals, my question is: what should I do to keep the size of my subset object small?
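A minimal base R sketch of the mechanism suspected above, independent of DESeq2: a formula keeps a reference to the environment in which it was created, object.size() does not follow that reference, but serialize() writes out everything the environment contains. The names make_formula and big are hypothetical and only serve the illustration; whether the design formula of the real object was created in a comparable way is an assumption.

```r
# Hypothetical helper: builds a formula inside a function whose local
# environment also happens to hold a large, unrelated object.
make_formula <- function() {
  big <- rnorm(1e6)   # about 8 MB of doubles living in the function's environment
  ~ a + b             # the formula captures that environment
}

f <- make_formula()

# object.size() ignores the attached environment, so the formula looks tiny.
size_in_memory <- as.numeric(object.size(f))

# serialize() follows the environment and therefore includes `big`.
size_serialized <- length(serialize(f, NULL))

# Re-pointing a copy of the formula at the global environment drops the
# captured objects; the global environment is serialized as a reference only.
f_clean <- f
environment(f_clean) <- globalenv()
size_clean <- length(serialize(f_clean, NULL))

print(c(in_memory = size_in_memory,
        serialized = size_serialized,
        serialized_clean = size_clean))
```

In this toy case, resetting the environment is enough to shrink the serialized formula. It is only safe when the formula refers to columns of the data and not to variables that live in the captured environment.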
Thanks for your help,
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /home/eblanc/R/R-3.5.1/lib/libRblas.so
LAPACK: /home/eblanc/R/R-3.5.1/lib/libRlapack.so

locale:
  C

attached base packages:
  stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
  Biobase_2.40.0 bit64_0.9-7 splines_3.5.1
  Formula_1.2-3 assertthat_0.2.0 stats4_3.5.1
  latticeExtra_0.6-28 blob_1.1.1 GenomeInfoDbData_1.1.0
  pillar_1.3.0 RSQLite_2.1.1 backports_1.1.2
  lattice_0.20-35 glue_1.3.0 digest_0.6.17
  GenomicRanges_1.32.7 RColorBrewer_1.1-2 XVector_0.20.0
  checkmate_1.8.5 colorspace_1.3-2 htmltools_0.3.6
  Matrix_1.2-14 plyr_1.8.4 DESeq2_1.20.0
  XML_3.98-1.16 pkgconfig_2.0.2 rseqCP_0.1.0
  genefilter_1.62.0 zlibbioc_1.26.0 purrr_0.2.5
  xtable_1.8-3 scales_1.0.0 BiocParallel_1.14.2
  htmlTable_1.12 tibble_1.4.2 annotate_1.58.0
  IRanges_2.14.12 ggplot2_3.0.0 SummarizedExperiment_1.10.1
  nnet_7.3-12 BiocGenerics_0.26.0 lazyeval_0.2.1
  survival_2.42-3 magrittr_1.5 crayon_1.3.4
  memoise_1.1.0 foreign_0.8-70 tools_3.5.1
  data.table_1.11.6 matrixStats_0.54.0 stringr_1.3.1
  S4Vectors_0.18.3 locfit_1.5-9.1 munsell_0.5.0
  cluster_2.0.7-1 DelayedArray_0.6.6 AnnotationDbi_1.42.1
  bindrcpp_0.2.2 compiler_3.5.1 GenomeInfoDb_1.16.0
  rlang_0.2.2 grid_3.5.1 RCurl_1.95-4.11
  rstudioapi_0.7 htmlwidgets_1.2 bitops_1.0-6
  base64enc_0.1-3 gtable_0.2.0 DBI_1.0.0
  R6_2.2.2 gridExtra_2.3 knitr_1.20
  dplyr_0.7.6 bit_1.1-14 bindr_0.1.1
  Hmisc_4.1-1 stringi_1.2.4 parallel_3.5.1
  Rcpp_0.12.18 geneplotter_1.58.0 rpart_4.1-13
  acepack_1.4.1 tidyselect_0.2.4