The scientists who do the bench work and acquire flow cytometry data at my institute export FCS files from FlowJo after both compensating the data and removing dead cells and populations that are not of interest. This data is then delivered to our computational group for additional analysis. I'm a relative novice when it comes to flow data, and I have a few general questions about moving the data from FlowJo to R and performing the pre-processing steps.
The biexponential transformation is applied to some of the data when our wet-lab folks open and start gating the data. The exported FCS files then often end up with some markers being transformed (sometimes this affects the most recently opened marker alone, sometimes this affects nearly all of the markers). When the analysts receive the data, we generally want one transformation (typically either logicle or asinh with b=1/150) applied to each of the markers. Clearly, we do not want to double-transform any of the data.
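To make the double-transformation concern concrete, here is a minimal sketch in base R of the asinh transform with b = 1/150, applied once and then (mistakenly) a second time. The function name asinh_cofactor and the intensity values are made up for illustration; I am assuming this matches what flowCore's arcsinhTransform does with a = 0, b = 1/150, c = 0.

```r
# Arcsinh transform with a scale factor: asinh(b * x), with b = 1/150 as above.
# asinh_cofactor is a hypothetical helper name, not a flowCore function.
asinh_cofactor <- function(x, b = 1 / 150) {
  asinh(b * x)
}

# Made-up raw intensities, for illustration only (not real data).
raw <- c(-50, 0, 150, 1500, 15000)

# Transform applied once, as intended.
once <- asinh_cofactor(raw)

# Transform applied again to already-transformed values (the case to avoid).
twice <- asinh_cofactor(once)

print(round(once, 3))
print(round(twice, 3))
```

The second pass compresses the values much further, which is why we need to know whether a given marker arrives already transformed.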
1. Does transformation = "linearize" in read.FCS only act on data that are stored with different exponentiation? I've received some compensated and compensated + transformed files that are identical when read in with transformation = F. These files' expression values differ when imported using transformation = "linearize", but the output of keyword($P<n>E) looks the same for both files.
2. Is there a way to remove FlowJo's transformations from the exported FCS files? Alternatively, is there a way to select the portion of the data for export and then remove all transformations in FlowJo?
3. Are there best practices when exporting from FlowJo before importing with flowCore or analyzing with the core cytofkit function (i.e. options that should always be checked, or methods for cleaning/prepping data that won't change just a subset of the data)? We'd like to avoid bringing in partially transformed data with boundary events and other noise, but we would like to be able to home in on our population of interest without asking a whole new set of people to apply gates to the data (particularly since we occasionally work with rare populations that aren't amenable to automatic gating).
4. Can we trust that the compensation has been appropriately applied to all of the markers in FlowJo, or should we (re)apply compensation once the data is in R?
5. I've read that, at least for CyTOF data, it's important to standardize marker values to z-scores or something similar in order to prevent markers with a higher dynamic range from dominating downstream clustering/dimension reduction. Is this also true for flow data?
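For question 5, the kind of per-marker standardization I have in mind could be sketched as follows. This is only an illustration in base R on simulated values: the matrix expr, the marker names, and the distributions are all assumptions, and it presumes the values have already been transformed (e.g. asinh or logicle) exactly once.

```r
set.seed(1)

# Simulated, already-transformed expression matrix (hypothetical data):
# rows are events, columns are markers with different dynamic ranges.
expr <- cbind(
  CD3 = rnorm(1000, mean = 2, sd = 0.5),
  CD4 = rnorm(1000, mean = 5, sd = 2)
)

# Per-marker z-scores: subtract each column's mean and divide by its
# standard deviation, so every marker ends up on a comparable scale.
z <- scale(expr, center = TRUE, scale = TRUE)

print(round(colMeans(z), 3))
print(round(apply(z, 2, sd), 3))
```

After scaling, each marker has mean 0 and standard deviation 1, so a marker with a wide range such as the simulated CD4 no longer outweighs the others in distance-based clustering.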