Dear all!
I just started using the xcms package for metabolomics data analysis. However, I was puzzled by the logic behind it. Intuitively, I expected the xcmsRaw class to represent the raw data and the xcmsSet class the preprocessed data. While that does seem to be the case, there seems to be no way to get from a list of xcmsRaw objects to an xcmsSet object. I expected that a function like findPeaks would somehow return an xcmsSet, but that was not the case. As I am used to from microarray and sequencing data, I would like to first look at the raw data, perform some quality controls, and then process that raw data into the final data (which I thought might be the xcmsSet).
Is there a simple way to get from the raw data to the peak list data? I find it quite cumbersome to first load the raw data, do quality controls, and then basically re-load and process the raw data again (using the xcmsSet function) to generate the xcmsSet object.
thanks in advance for any help, suggestions etc
jo
Your package sounds promising; is there any timeline for when you expect it to be more or less usable? I was about to implement some stuff for the xcms package, but eventually I should do that for MSsary ;)
Is the data really that big? I wonder if it shouldn't be possible to reduce the size of the data using special data types like Rle or the like...
With regard to data size, it really depends on your field. Proteomics, where I come from, uses instruments with ridiculous resolution and therefore huge files; metabolomics studies sometimes use instruments with unit resolution, resulting in much smaller files. Anyway, it is not so much the encoding that is the culprit of the data size. It’s just the immense amount of raw numbers…
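To illustrate the encoding point: a run-length encoding (the idea behind Rle) only saves space where neighbouring values repeat. Below is a minimal Python sketch of that effect. The intensity values are made up and the helper name is hypothetical; nothing here is taken from xcms or MSsary.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


# Made-up intensities with long runs of zeros: these collapse into few runs.
sparse = [0.0] * 6 + [1250.5, 3310.2] + [0.0] * 4
# Made-up intensities where no two neighbours are equal: nothing collapses.
dense = [101.3, 98.7, 1250.5, 3310.2, 870.1, 99.4]

sparse_runs = run_length_encode(sparse)
dense_runs = run_length_encode(dense)

print(len(sparse), "values ->", len(sparse_runs), "runs")
print(len(dense), "values ->", len(dense_runs), "runs")
```

When most measured values are distinct, the number of runs equals the number of values, so the encoding brings no saving.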
What MSsary does is write changes to the underlying raw data into an SQLite database and automatically figure out where to look for them. Thus it never really has everything in memory at the same time, only the relevant pieces. The idea, though, is that the user shouldn’t really have to care about all these underlying details… As for the timeline of the package, I’m afraid my development has been stifled a bit by changes to the scope of my PhD, so it is currently on hold. I’m quite invested in it though, so it will be taken up again (and contributions are very welcome), but until then you would be better off sticking to xcms.
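The general pattern described here (keep the raw data where it is, store only modifications in SQLite, and look each piece up on demand) could be sketched roughly as follows. This is an illustrative Python sketch and not MSsary's actual schema or code: the table layout, the function names and the in-memory stand-in for the raw file are all assumptions.

```python
import json
import sqlite3

# Stand-in for the raw file on disk (assumption); a real tool would read
# individual scans lazily from the file instead of holding them in a dict.
RAW_SCANS = {1: [520.0, 80.0], 2: [610.0, 45.0], 3: [30.0, 12.0]}

# Hypothetical store for modified scans; a file path would replace ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modified (scan INTEGER PRIMARY KEY, intensities TEXT)")


def save_modified(scan, intensities):
    """Record the changed intensities of one scan in the database."""
    conn.execute(
        "INSERT OR REPLACE INTO modified VALUES (?, ?)",
        (scan, json.dumps(intensities)),
    )
    conn.commit()


def get_scan(scan):
    """Return the modified version of a scan if one exists, else the raw one."""
    row = conn.execute(
        "SELECT intensities FROM modified WHERE scan = ?", (scan,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    return RAW_SCANS[scan]


# Example: subtract a made-up baseline of 40 from scan 2 only.
save_modified(2, [value - 40.0 for value in RAW_SCANS[2]])
untouched = get_scan(1)
corrected = get_scan(2)
print(untouched, corrected)
```

Only the scan that is asked for is fetched, and the caller does not need to know whether it came from the raw data or from the database.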
Thanks for the explanation. So I'll stick with xcms for now and keep an eye on MSsary on GitHub.