Unexpected total ion count (TIC) with xcms package
Johannes Rainer ★ 1.9k
@johannes-rainer-6987
Italy

Dear all,

I was trying various things with metabolomics data in xcms. In particular, I wanted to look at the total ion count (TIC), which, following http://www.ncbi.nlm.nih.gov/pubmed/25078324, is the "sum of all signals across all m/z" for a given retention time (RT). A TIC can be plotted with the plotTIC function in xcms but, to get a feeling for the data, I wanted to generate the plot from the data myself. So I extracted the raw data matrix and summed the intensity values per time point. To my surprise the plots look different, with plotTIC giving higher intensities.

The code to generate the plots was:

> library(xcms)
> cdfpath <- system.file("cdf", package="faahKO")
> cdffiles <- list.files(cdfpath, recursive=TRUE, full.names=TRUE)
> xraw <- xcmsRaw(cdffiles[1], profmethod="bin", profstep=0.1)
> ## get the raw matrix and sum up the intensities per time point
> rawmat <- rawMat(xraw)
> aggr <- aggregate(rawmat, by=list(rawmat[, 1]), FUN=sum)
> ## plot the TIC
> plotTIC(xraw)
> points(aggr[, 1], aggr[, 4], col="red", type="l")

I can to some extent understand that plotTIC and plotChrom generate different plots, as plotTIC is based on the raw data and plotChrom on the profile data, but it puzzles me why there is a difference between plotTIC and the sum of intensities as I calculated it.

I think I am missing something here...

Any help is very much appreciated.

my session info:

> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-apple-darwin14.3.0/x86_64 (64-bit)
Running under: OS X 10.10.3 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] lattice_0.20-31     xcms_1.45.0         ProtGenerics_1.0.0
[4] mzR_2.2.0           Rcpp_0.11.5         ascii_2.1
[7] RColorBrewer_1.1-2  Biobase_2.28.0      BiocGenerics_0.14.0

loaded via a namespace (and not attached):
[1] compiler_3.2.0   tools_3.2.0      codetools_0.2-11

xcms metabolomics • 1.4k views
@thomas-lin-pedersen-5941
Copenhagen, Denmark

I can only give you my best guess as I'm not near a computer to test it out.

I'm quite sure this is true for mzXML/mzML files but uncertain with regard to CDF. Here goes: the file you're working with has a hard-coded TIC value for each scan that gets read in when you create an xcmsRaw object. This value comes from the instrument and is probably based on profile data. When you calculate it yourself you're working directly with the processed data, and the results differ because of this.

Best, Thomas


Maybe I should have added some more information. Actually, I first came across this on one of my own files, which is an mzML file in centroid mode; the test file used above is also in centroid mode. I briefly looked at the code in the xcms package (actually the C code) and, as far as I understood it, it is also just summing the signal. It's puzzling...


I just had a look at the source, and it is indeed reading hard-coded values if they are present. You can check whether these values are there by looking at object@tic. If your object has content in the tic slot, then that's the answer to your question...
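
For anyone wanting to run that check, here is a minimal sketch that compares the stored values with a per-scan sum. Note that sum_per_scan is a hypothetical helper written for this post, not an xcms function. It assumes the xcmsRaw layout in which @env$intensity holds the intensities of all scans concatenated and @scanindex holds the 0-based start offset of each scan. The xcms part only runs if xcms and faahKO are installed.

```r
## Hypothetical helper (not part of xcms): sum the intensities of each scan,
## given the concatenated intensity vector and the 0-based start offset of
## each scan within that vector.
sum_per_scan <- function(intensity, scanindex) {
    n_per_scan <- diff(c(scanindex, length(intensity)))
    scan_id <- factor(rep(seq_along(scanindex), n_per_scan),
                      levels = seq_along(scanindex))
    res <- tapply(intensity, scan_id, sum)
    res[is.na(res)] <- 0   # scans without any data point
    as.numeric(res)
}

## Toy check without any file: three scans with 2, 0 and 3 data points.
toy <- sum_per_scan(c(1, 2, 10, 20, 30), scanindex = c(0L, 2L, 2L))
print(toy)

## Same comparison on the faahKO example file, if the packages are available.
if (requireNamespace("xcms", quietly = TRUE) &&
    requireNamespace("faahKO", quietly = TRUE)) {
    library(xcms)
    cdffiles <- list.files(system.file("cdf", package = "faahKO"),
                           recursive = TRUE, full.names = TRUE)
    xraw <- xcmsRaw(cdffiles[1])
    stored <- xraw@tic                       # values read from the file
    calculated <- sum_per_scan(xraw@env$intensity, xraw@scanindex)
    print(summary(stored - calculated))      # all zero only if they agree
    plot(xraw@scantime, stored, type = "l",
         xlab = "Retention time", ylab = "Intensity")
    lines(xraw@scantime, calculated, col = "red")
}
```

If the summary shows non-zero differences, the stored values were not obtained by simply summing the data points kept in the object.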


Out of curiosity - what C code? plotTIC is pure R and all internal parsing of raw data is handled by mzR...


plotTIC calls rawEIC, which uses .Call to invoke the getEIC C function in mzROI.c. I'll try to find time next week to investigate that further. It really bugs me when I don't understand what's going on...


But only if the tic slot of your xcmsRaw object is empty, which it shouldn't be in the case of mzML/mzXML (again - I have never used netCDF, so I wouldn't know about that). Have you checked the content of the tic slot?


Yes, you're right. That's indeed the case: the @tic slot contains values that are quite different from the per-scan sum over all intensities that I get from the rawMat matrix. I guess that has something to do with the centroiding? As far as I understand, the values for the @tic slot are extracted from the scan header parameters in the mzML file.

I'm just wondering now what is more representative... the total ion current reported in the mzML file or the sum of all intensities per scan across all m/z values calculated for the actual raw data that is available in the xcmsRaw object.


As I said, yes: these values represent the true, unprocessed total ion count as reported by the instrument, that is, before any processing of the spectra is done. As to which one to use, it depends. If you've made substantial changes to the spectra, I would recalculate. Otherwise the differences between the two are mostly in scale and the general pattern should be the same, in which case it's simply faster to extract the hard-coded values...
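
To illustrate the "mostly in scale" point, here is a toy sketch on simulated data, so nothing in it is read from a real file. It rests on two assumptions of mine: peaks are Gaussian in profile mode, and centroiding keeps only the apex intensity of each peak (real centroiding algorithms may do something different, for example report peak areas). All peak positions, widths and scale factors are made up.

```r
## Toy illustration only: simulated data, not read from any file.
mz <- seq(100, 110, by = 0.01)
peak_mz <- c(102, 105, 108)   # made-up peak positions
sigma <- 0.05                 # assumed peak width

## One scan in profile mode: three Gaussian peaks, all scaled by `scale`.
profile_scan <- function(scale) {
    rowSums(sapply(peak_mz, function(m) {
        scale * exp(-(mz - m)^2 / (2 * sigma^2))
    }))
}

## Assumption: centroiding keeps one point per peak, the apex intensity.
centroid_scan <- function(prof) {
    sapply(peak_mz, function(m) max(prof[abs(mz - m) < 0.5]))
}

## Made-up elution profile over four scans.
scales <- c(100, 500, 1000, 300)
tic_profile  <- sapply(scales, function(s) sum(profile_scan(s)))
tic_centroid <- sapply(scales, function(s) sum(centroid_scan(profile_scan(s))))

## Ratio of the two TICs per scan.
ratio <- tic_profile / tic_centroid
print(ratio)
```

Under these assumptions the ratio is the same in every scan and larger than 1: the profile-based TIC is higher, but both traces have the same shape over time. With real data the ratio will not be exactly constant, since peak widths and noise vary between scans.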


Nice; now I feel quite comfortable with the data. Thanks a lot for your explanations and thoughts.