Question: Memory problem using openWorkspace()
Silke Zachariae (Germany) wrote 5.0 years ago:

Hi,

I seem to have a problem with openWorkspace() and closeWorkspace().

openWorkspace() occupies a lot of RAM on my system, about 20 MB. This memory is not freed when I close the workspace with closeWorkspace() or when I remove the object with rm().

As a consequence, more and more memory is occupied until my computer finally crashes ;( I am using RStudio with R 3.1.1 on Windows 7.

Best regards, Silke

> i <- wsp_file
> ws <- openWorkspace(i)
> closeWorkspace(ws)

Tags: flowworkspace

What package is this openWorkspace function defined in?

Reply written 5.0 years ago by Steve Lianoglou

I'm guessing it comes from flowWorkspace but it would be best if the original poster would provide this information along with the complete code needed to reproduce the problem (how is wsp_file defined?) and the output of sessionInfo().

Reply written 5.0 years ago by Dan Tenenbaum

The openWorkspace() function is defined in the flowWorkspace package. wsp_file is a character string giving the path to the FlowJo workspace file that should be "opened" and afterwards parsed using parseWorkspace(ws). The FlowJo workspace file is an XML file. The sessionInfo() output is posted below. My current assumption is that the memory leak is caused by the XML package, which seems to be used by openWorkspace(). Currently, I have libxml2 2.9.1 installed. There is an update (2.9.2) from 2 weeks ago, which is not yet available for Windows.

The details in the closeWorkspace help page say:

Open an XML flowJo workspace file and return a flowJoWorkspace object. The workspace is represented using a XMLInternalDocument object. Close a flowJoWorkspace after finishing with it. This is necessary to explicitly clean up the C-based representation of the XML tree. (See the XML package).

However, this cleanup does not seem to work, since the 20 MB are not freed afterwards. gc() does not help either, since R seems to have no control over the C-based representation. I can open the wsp file, parse the workspace, and read the necessary information without errors or warnings, but the memory keeps filling up.

 

library(flowWorkspace)
i <- wsp_file
ws <- openWorkspace(i)
closeWorkspace(ws)
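
On the point that gc() has no control over the C-based representation: gc() only reports memory handled by R's own allocator. The following base-R sketch illustrates what gc() can see and reclaim. The helper name r_heap_mb and the vector size are illustrative choices, not something taken from flowWorkspace or XML.

```r
# Memory currently used by R objects, in MB, as reported by gc().
# Column 2 of the gc() matrix holds the "used" amount in MB (Ncells and Vcells).
r_heap_mb <- function() sum(gc()[, 2])

base_mb <- r_heap_mb()
x <- numeric(5e6)            # 5 million doubles, allocated by R itself
with_x_mb <- r_heap_mb()
rm(x)
after_rm_mb <- r_heap_mb()   # the gc() call inside r_heap_mb() reclaims the vector

cat("growth while x exists:", round(with_x_mb - base_mb), "MB\n")
cat("growth after rm(x):   ", round(after_rm_mb - base_mb), "MB\n")
```

Memory that a C library such as libxml2 allocates for an XMLInternalDocument is not part of these figures, so a leak there can only be seen in a process-level reading such as the one shown by the operating system.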


> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xlsx_0.5.7                xlsxjars_0.6.1            rJava_0.9-6               dplyr_0.3.0.2            
 [5] plyr_1.8.1                car_2.0-21                stringr_0.6.2             flowWorkspace_3.12.0     
 [9] gridExtra_0.9.1           ncdfFlow_2.12.0           BH_1.54.0-4               RcppArmadillo_0.4.450.1.0
[13] flowViz_1.30.0            lattice_0.20-29           flowCore_1.32.0          

loaded via a namespace (and not attached):
 [1] assertthat_0.1      Biobase_2.26.0      BiocGenerics_0.12.0 chron_2.3-45        corpcor_1.6.7       data.table_1.9.4   
 [7] DBI_0.3.1           DEoptimR_1.0-2      graph_1.44.0        hexbin_1.27.0       IDPmisc_1.1.17      KernSmooth_2.23-13 
[13] latticeExtra_0.6-26 magrittr_1.0.1      MASS_7.3-35         mvtnorm_1.0-0       nnet_7.3-8          parallel_3.1.1     
[19] pcaPP_1.9-50        RColorBrewer_1.0-5  Rcpp_0.11.3         reshape2_1.4        Rgraphviz_2.10.0    robustbase_0.91-1  
[25] rrcov_1.3-4         stats4_3.1.1        tools_3.1.1         XML_3.98-1.1        zlibbioc_1.12.0 

Reply written 5.0 years ago by Silke Zachariae

To verify your assumption, try this and see whether the 20 MB still leak:

library(XML)
doc <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
free(doc)
rm(doc)
 
Reply written 4.8 years ago by Jiang, Mike
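
A check like the one suggested above is easier to judge when it is repeated over several files and the memory reading is recorded around each call. A minimal sketch of such a harness follows. measure_growth(), rss_mb() and fake_reader() are hypothetical helper names, not part of XML or flowWorkspace. The reading has to come from the operating system to include C-level allocations; the ps-based reader shown here assumes a Unix-like system, so on Windows a different reader would have to be supplied.

```r
# Hypothetical helper: call f() on each input and record the memory
# reading returned by read_mem() before and after every call.
measure_growth <- function(inputs, f, read_mem) {
  before <- numeric(length(inputs))
  after <- numeric(length(inputs))
  for (k in seq_along(inputs)) {
    before[k] <- read_mem()
    f(inputs[[k]])
    after[k] <- read_mem()
  }
  data.frame(input = as.character(inputs), before = before, after = after,
             growth = after - before, stringsAsFactors = FALSE)
}

# Resident set size of this R process in MB.
# Assumption: a Unix-like system where the ps command is available.
rss_mb <- function() {
  out <- system2("ps", c("-o", "rss=", "-p", Sys.getpid()), stdout = TRUE)
  as.numeric(out) / 1024
}

# Demonstration with a simulated reader that rises by 2 on every reading,
# so the sketch runs without any workspace files.
counter <- 0
fake_reader <- function() {
  counter <<- counter + 2
  counter
}
demo <- measure_growth(c("a.wsp", "b.wsp"), function(path) invisible(NULL), fake_reader)
print(demo)
```

With real files the call might look like measure_growth(wsp_files, check_one, rss_mb), where check_one is a hypothetical function wrapping the xmlTreeParse() and free() lines above.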

I tried a systematic evaluation of the problem: I wrote 4 different functions and applied each of them with lapply to 10 workspace files. After each lapply statement I noted the RAM occupied before and after the run, and the difference in MB. It seems that the more XML objects are generated (even implicitly), the more RAM is occupied.

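library(XML)            # xmlTreeParse(), xmlRoot(), xmlAttrs(), free()
library(flowWorkspace)  # openWorkspace(), closeWorkspace()

# 1: parse the XML file and free the document right away
extract_WSP_info_1 <- function(wsp_file) {
  doc <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  free(doc)
  rm(doc)
}

# 2: open and close the workspace with flowWorkspace
extract_WSP_info_2 <- function(wsp_file) {
  ws <- openWorkspace(wsp_file)
  closeWorkspace(ws)
}

# 3: read two root attributes, calling xmlRoot() twice without storing the node
extract_WSP_info_3 <- function(wsp_file) {
  xml_content <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  wsp_version <- xmlAttrs(xmlRoot(xml_content))["version"]
  fj_version <- xmlAttrs(xmlRoot(xml_content))["flowJoVersion"]
  free(xml_content)
  rm(xml_content)
}

# 4: same as 3, but store the root node in a variable and remove it afterwards
extract_WSP_info_4 <- function(wsp_file) {
  xml_content <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  xml_root <- xmlRoot(xml_content)
  wsp_version <- xmlAttrs(xml_root)["version"]
  fj_version <- xmlAttrs(xml_root)["flowJoVersion"]
  free(xml_content)
  rm(xml_content, xml_root)
}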

lapply(wsp_files[1:10], extract_WSP_info_1)
# RAM: 4.228 -> 4.237 (+9 MB)
lapply(wsp_files[1:10], extract_WSP_info_2)
# RAM: 4.237 -> 4.323 (+86 MB)
lapply(wsp_files[1:10], extract_WSP_info_3)
# RAM: 4.323 -> 4.419 (+96 MB)
lapply(wsp_files[1:10], extract_WSP_info_4)
# RAM: 4.784 -> 4.846 (+62 MB)
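
The growth figures noted after each lapply() call can be recomputed and broken down per file. The sketch below assumes, since the post does not state it, that the readings are in GB and the differences in MB with 1 GB counted as 1000 MB. The vector names are illustrative.

```r
# Readings before and after each lapply() run, copied from the post.
# Assumption: the readings are in GB.
before <- c(info_1 = 4.228, info_2 = 4.237, info_3 = 4.323, info_4 = 4.784)
after  <- c(info_1 = 4.237, info_2 = 4.323, info_3 = 4.419, info_4 = 4.846)

growth_mb <- round((after - before) * 1000)  # growth per run over 10 files
per_file_mb <- growth_mb / 10                # average growth per workspace file

print(rbind(growth_mb, per_file_mb))
```

Under that assumption the plain parse-and-free variant grows least, and the fourth run starts from a higher reading than the third one ended at, so memory was also allocated between those two runs.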

Reply written 4.7 years ago by Silke Zachariae