Hello,
I am trying to create a netCDF file compatible with ANDI-MS. The history is as follows:
I am trying to convert mass spec data from Waters .raw to netCDF in order to import it into Bruker Compass/Data Analysis.
Waters has an export tool (DataBridge) that does this, and the Bruker software can read the resulting file. However, DataBridge either centroids the data or flags it as centroid in the cdf output. This is unacceptable, because too much information is lost for the deconvolution of intact protein masses. Waters is aware of the problem and is working on it.
I was steered towards ProteoWizard to convert to mzXML/mzML, and then to use Bioconductor XCMS to open the file and save it as netCDF. Unfortunately this was unsuccessful: the XCMS manual explicitly states that XCMS-created cdf files can only be opened by XCMS and are expressly incompatible with ANDI-MS conventions.
I have been able to use the R package RNetCDF to successfully read the Waters output cdf and create a new, empty cdf file with the same variables and attributes.
I can use the mzR package to read mzXML output from ProteoWizard.
The goal:
As a first step, I am attempting to (using RNetCDF):
- Open a DataBridge-exported cdf file
- Create a new cdf file with the same variables and attributes
- Read the data from the source cdf file into R and then write it to the variables in the new cdf file
- Save the file and verify it can be opened by the target software
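The steps above can be sketched roughly as follows. This is only a minimal illustration, not a tested ANDI-MS converter: the small source file is a stand-in built on the spot, copy_numeric_var is a hypothetical helper name, and only one numeric variable is copied. na.mode = 3 is used so that values such as -9999 pass through without any missing-value conversion.

```r
library(RNetCDF)

# Build a tiny stand-in for the DataBridge export (hypothetical contents).
src_path <- tempfile(fileext = ".cdf")
dst_path <- tempfile(fileext = ".cdf")
src <- create.nc(src_path)
dim.def.nc(src, "scan_number", 5)
var.def.nc(src, "a_d_sampling_rate", "NC_DOUBLE", "scan_number")
var.put.nc(src, "a_d_sampling_rate", rep(-9999, 5))
close.nc(src)

# Hypothetical helper: copy one numeric variable together with its dimensions.
# It defines every dimension afresh, so it would need a check for dimensions
# that already exist before being used on several variables in a row.
copy_numeric_var <- function(src, dst, varname) {
  info <- var.inq.nc(src, varname)
  dim_names <- character(0)
  for (id in info$dimids) {
    d <- dim.inq.nc(src, id)
    dim.def.nc(dst, d$name, d$length, unlim = d$unlim)
    dim_names <- c(dim_names, d$name)
  }
  var.def.nc(dst, varname, info$type, dim_names)
  values <- var.get.nc(src, varname, na.mode = 3)
  stopifnot(is.numeric(values))  # guard against writing text to a numeric variable
  var.put.nc(dst, varname, values, na.mode = 3)
  invisible(values)
}

src <- open.nc(src_path)
dst <- create.nc(dst_path)
copy_numeric_var(src, dst, "a_d_sampling_rate")
close.nc(src)
close.nc(dst)

# Reopen the destination file and read the copied values back.
check <- open.nc(dst_path)
copied <- var.get.nc(check, "a_d_sampling_rate", na.mode = 3)
close.nc(check)
```

Attributes and NC_CHAR variables are left out of this sketch; they would need separate handling.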
The problem:
I can read data from the source file into an R object and verify that the data is numeric, but when I write it to the destination variable/file, I get this error:
Error: R character data can only be written to NC_CHAR variable
I have appended the command lines I used. I suspect the problem is my lack of knowledge of how to handle data in R, so any help would be appreciated. Also, if anyone knows an alternate or better way to get data from an mzXML file into an ANDI-MS compatible netCDF file, please let me know. I still have to verify that the cdf file I create with these tools can be opened by the target software.
Regards,
Ray
APPENDED COMMAND LINES:
CDF1 refers to the SOURCE cdf file
test1 refers to the DESTINATION cdf file
> var.inq.nc(CDF1,"a_d_sampling_rate")
$id
[1] 1
$name
[1] "a_d_sampling_rate"
$type
[1] "NC_DOUBLE"
$ndims
[1] 1
$dimids
[1] 11
$natts
[1] 0
> dim.def.nc(test1,"scan_number",2064)
> var.def.nc(test1,"a_d_sampling_rate","NC_DOUBLE","scan_number")
> var.inq.nc(test1,"a_d_sampling_rate")
$id
[1] 1
$name
[1] "a_d_sampling_rate"
$type
[1] "NC_DOUBLE"
$ndims
[1] 1
$dimids
[1] 11
$natts
[1] 0
> data2 <- var.get.nc(CDF1,"a_d_sampling_rate",start=NA,count=NA,na.mode=0,collapse=TRUE,unpack=TRUE,rawchar=TRUE)
> str(data2)
num [1:2064(1d)] -9999 -9999 -9999 -9999 -9999 ...
> var.put.nc(test1,"a_d_sampling_rate",data1,start=NA,count=NA,na.mode=0,pack=TRUE)
Error: R character data can only be written to NC_CHAR variable
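One detail in the transcript may be relevant: the values are read into data2, but the var.put.nc call passes data1. If data1 happens to be a character object left over from an earlier step (an assumption, since the transcript does not show it), that alone could explain the message. A minimal sketch of a guard that catches this before writing is shown here; put_numeric is a hypothetical helper name, and the character vector stands in for whatever data1 actually held.

```r
library(RNetCDF)

# Hypothetical helper: refuse to write anything that is not numeric.
put_numeric <- function(nc, varname, values) {
  if (!is.numeric(values)) {
    stop("expected numeric data for '", varname, "', got ", class(values)[1])
  }
  var.put.nc(nc, varname, values)
}

path <- tempfile(fileext = ".cdf")
nc <- create.nc(path)
dim.def.nc(nc, "scan_number", 3)
var.def.nc(nc, "a_d_sampling_rate", "NC_DOUBLE", "scan_number")

data1 <- c("-9999", "-9999", "-9999")  # assumed stale character object
data2 <- c(-9999, -9999, -9999)        # numeric values, as str(data2) reports

# Passing the wrong object is caught by the guard and reported as a message.
wrong <- tryCatch(put_numeric(nc, "a_d_sampling_rate", data1),
                  error = function(e) conditionMessage(e))

# Passing the numeric object writes normally.
put_numeric(nc, "a_d_sampling_rate", data2)
written <- var.get.nc(nc, "a_d_sampling_rate")
close.nc(nc)
```

Running str() on the exact object handed to var.put.nc, rather than on the one that was read, would confirm or rule this out.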
Hello,
Sorry, I also read this in the manual for RNetCDF but am unsure how to apply it:
However, text represented by R types raw and character can only be written to NetCDF type NC_CHAR. The dimensions of R raw variables map directly to NetCDF dimensions, but character variables have an implied dimension corresponding to the string length. This implied dimension must be defined explicitly as the fastest-varying dimension of the NC_CHAR variable, and it must be included as the first element of arguments start and count taken by this function.
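As an illustration of what the manual passage describes, here is a minimal sketch of writing R character data to an NC_CHAR variable. The dimension and variable names (station, max_string_length, name) and the strings themselves are made up for the example; the point is only that the string-length dimension is defined explicitly, listed first in the variable's dimensions, and given as the first element of start and count.

```r
library(RNetCDF)

path <- tempfile(fileext = ".cdf")
nc <- create.nc(path)

station_names <- c("alfa", "bravo", "charlie")  # illustrative strings

# The implied string-length dimension has to exist as a real dimension.
dim.def.nc(nc, "station", length(station_names))
dim.def.nc(nc, "max_string_length", 32)

# The string-length dimension comes first (fastest-varying).
var.def.nc(nc, "name", "NC_CHAR", c("max_string_length", "station"))

# start and count also list the string-length dimension first.
var.put.nc(nc, "name", station_names,
           start = c(1, 1), count = c(32, length(station_names)))

names_back <- var.get.nc(nc, "name")
close.nc(nc)
```

None of this applies to an NC_DOUBLE variable such as a_d_sampling_rate, which takes a plain numeric vector.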