Question: mzR unable to open uncorrupted mzXML file
7 months ago, abrsoule wrote:

Hello,

I am fairly new to R and to Bioconductor and have been running some code (previously written by another member of our cohort) to analyze our mass spec data. I've run into a couple of issues which I now can't seem to fix.

1) Our peak-picking code loops over a large number of files, nearly 1500 in this most recent project, so we often run out of memory. For instance, today we got through about 300 files before it failed with the message: Biocparallel error: cannot allocate vector of size 1.2 mb

To tackle this issue, I restarted our Windows computer to give the R session a clean slate of memory, intending to pick up where we left off. After the restart, the code ran into a new issue: mzR will not open or read our mzXML files. I've isolated the source of the error to the following code:

files_1 <- list.files(paste(mzxml_location, sites[i], "/", species[j], sep = ""),
                      recursive = TRUE, full.names = TRUE)
files <- files_1[endsWith(files_1, "mzXML")]

raw_data <- readMSData(files, msLevel. = 1, mode = "onDisk")

The function finding the files works perfectly fine, but the last line results in the error message:

Error in pwizModule$open(filename) : 
  [MSDataFile::readFile()] Unsupported file format.

along with warning messages stating that the directory names for each file are invalid. The files all exist at the specified directories, are uncorrupted, and are in .mzXML format. This code ran without this error before I restarted the computer, so I'm not sure what could have changed or how to proceed. I'm not sure whether it's important, but I did receive this warning message when loading mzR after the restart:

 In fun(libname, pkgname) :
      mzR has been built against a different Rcpp version (1.0.0)
    than is installed on your system (1.0.1). This might lead to errors
    when loading mzR. If you encounter such issues, please send a report,
    including the output of sessionInfo() to the Bioc support forum at 
    https://support.bioconductor.org/. For details see also
    https://github.com/sneumann/mzR/wiki/mzR-Rcpp-compiler-linker-issue.

For additional information, here is the output of the sessionInfo():

R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.4.3        CAMERA_1.38.1         splitstackshape_1.4.6 vegan_2.5-4           lattice_0.20-38      
 [6] permute_0.9-5         pvclust_2.0-0         muma_1.4              rrcov_1.4-7           pcaPP_1.9-73         
[11] caTools_1.17.1.2      bitops_1.0-6          gtools_3.8.1          robustbase_0.93-4     mvtnorm_1.0-10       
[16] gplots_3.0.3          pls_2.7-1             pdist_1.2             car_3.0-3             carData_3.0-2        
[21] snow_0.4-3            xcms_3.4.4            MSnbase_2.8.3         ProtGenerics_1.14.0   S4Vectors_0.20.1     
[26] mzR_2.16.2            Rcpp_1.0.1            BiocParallel_1.16.6   multtest_2.38.0       Biobase_2.42.0       
[31] BiocGenerics_0.28.0  

loaded via a namespace (and not attached):
 [1] nlme_3.1-137           doParallel_1.0.14      RColorBrewer_1.1-2     backports_1.1.3        tools_3.5.2           
 [6] affyio_1.52.0          rpart_4.1-13           KernSmooth_2.23-15     Hmisc_4.2-0            lazyeval_0.2.2        
[11] mgcv_1.8-28            colorspace_1.4-1       nnet_7.3-12            gridExtra_2.3          curl_3.3              
[16] compiler_3.5.2         MassSpecWavelet_1.48.1 preprocessCore_1.44.0  graph_1.60.0           htmlTable_1.13.1      
[21] checkmate_1.9.1        scales_1.0.0           DEoptimR_1.0-8         affy_1.60.0            RBGL_1.58.2           
[26] stringr_1.4.0          digest_0.6.18          foreign_0.8-71         rio_0.5.16             htmltools_0.3.6       
[31] base64enc_0.1-4        pkgconfig_2.0.2        limma_3.38.3           htmlwidgets_1.3        rlang_0.3.2           
[36] readxl_1.3.1           rstudioapi_0.10        impute_1.56.0          BiocInstaller_1.32.1   mzID_1.20.1           
[41] acepack_1.4.1          zip_2.0.1              magrittr_1.5           Formula_1.2-3          MALDIquant_1.19.2     
[46] Matrix_1.2-15          munsell_0.5.0          abind_1.4-7            vsn_3.50.0             stringi_1.4.3         
[51] MASS_7.3-51.1          zlibbioc_1.28.0        plyr_1.8.4             grid_3.5.2             gdata_2.18.0          
[56] forcats_0.4.0          crayon_1.3.4           haven_2.1.0            splines_3.5.2          hms_0.4.2             
[61] knitr_1.22             pillar_1.3.1           igraph_1.2.4           codetools_0.2-16       XML_4.0-0             
[66] latticeExtra_0.6-28    pcaMethods_1.74.0      data.table_1.12.0      BiocManager_1.30.4     foreach_1.5.1         
[71] cellranger_1.1.0       gtable_0.3.0           RANN_2.6.1             ggplot2_3.1.0          xfun_0.5              
[76] openxlsx_4.1.0         ncdf4_1.16.1           survival_2.43-3        tibble_2.1.1           iterators_1.0.11      
[81] IRanges_2.16.0         cluster_2.0.7-1

Any help or advice would be greatly appreciated!

Thanks, Abbey

Tags: xcms, mzr, software error

Before digging into the error message I would make sure that you have all the most recent package versions and that all files are really there.

So, please run BiocManager::install() to update potentially outdated packages, and then check that all(file.exists(files)) returns TRUE.
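
If the existence check comes back FALSE, it helps to see which paths are affected. Below is a minimal sketch in base R; check_paths is a hypothetical helper name (not part of MSnbase or mzR), and the temporary files only stand in for real mzXML paths:

```r
# Hypothetical helper (not from MSnbase/mzR): report, per path, whether the
# file exists and whether it is empty (size 0).
check_paths <- function(files) {
    size <- file.size(files)  # NA for paths that do not exist
    data.frame(
        file = files,
        exists = file.exists(files),
        empty = !is.na(size) & size == 0,
        stringsAsFactors = FALSE
    )
}

# Stand-in demo: one file that exists and one path that does not.
present <- tempfile(fileext = ".mzXML")
writeLines("<mzXML/>", present)
missing <- tempfile(fileext = ".mzXML")  # never created

demo <- check_paths(c(present, missing))
print(demo[!demo$exists | demo$empty, ])
```

With the real data, calling check_paths(files) and filtering the result the same way would list the paths worth inspecting by hand.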

You say you have ~1500 files, so it might also be worth checking whether each of them can be read individually:

sapply(files, function(z) {
    cat("reading", z, "...")
    tmp <- readMSData(z, mode = "onDisk")
    cat("OK\n")
    TRUE
})

That way we can figure out which of your files might be problematic and then check what is causing the error (or, if it really is a problem with MSnbase/mzR, we can test this on a single file).
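
Note that the sapply loop above stops at the first file that throws an error. If several files might be affected, a variant that catches the error and carries on could look like the sketch below. find_unreadable and fake_reader are hypothetical names used for illustration; the reader is passed in as a function so the example runs without MSnbase installed:

```r
# Hypothetical helper: try to read every file and collect the error message
# for each one that fails, instead of stopping at the first failure.
find_unreadable <- function(files, reader) {
    msgs <- vapply(files, function(z) {
        tryCatch({
            reader(z)
            NA_character_            # NA marks a file that was read fine
        }, error = function(e) conditionMessage(e))
    }, character(1))
    msgs[!is.na(msgs)]               # named by file path
}

# Stand-in reader for the demo: fails for any path containing "bad".
fake_reader <- function(z) {
    if (grepl("bad", z)) stop("Unsupported file format.")
    invisible(TRUE)
}

failed <- find_unreadable(c("a.mzXML", "bad.mzXML"), fake_reader)
print(failed)
```

On the real data, the reader would presumably be something like function(z) readMSData(z, mode = "onDisk"), matching the call used above; the names of the returned vector are then the files that could not be read.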

written 7 months ago by Johannes Rainer