Reading multiple data segments from FCS 3.0 file
alavy

Hi, I am trying to read an FCS 3.0 file that contains multiple samples taken from several plates using read.ncdfFlowSet. The warning message I get is:

1: The file contains 48 additional data segments.
The default is to read the first segment only.
Please consider setting the 'dataset' argument.


After reading the documentation of flowCore I understood that I need to specify which data segment I want to read. However, I know that I can read multiple flowframes if each of them is saved as a file on its own.

So I was wondering, is there a way to either:
1. Load all the samples that are in the FCS 3.0 file at once into a flow set?
2. If not, split the FCS file into individual files, so that they can be read into a single flow set?

Thank you,

multiple data segments in a single file

flowCore flow cytometry FCS 3.0 ncdfFlowSet • 1.2k views
Jake Wagner

Are you using the most recent versions of the packages (available from Bioconductor 3.11 or GitHub)? The easiest way is probably your second approach using load_cytoframe_from_fcs from flowWorkspace. This is just a sketch using an FCS file with 40 data segments. Replace multi_data_fcs_path and outpath to suit your needs.

library(flowWorkspace)  # load_cytoframe_from_fcs, load_cytoset_from_fcs
library(flowCore)       # write.FCS

outpath <- tempdir()
ndataset <- 40
for (i in seq_len(ndataset)) {
  # Load each dataset as a cytoframe and write it out as its own FCS file
  cf <- load_cytoframe_from_fcs(multi_data_fcs_path, dataset = i)
  write.FCS(cf, file.path(outpath, paste0("dataset_", i, ".fcs")))
}
# Read the split FCS files back into a single cytoset
# (pattern is a regular expression, so anchor it on the file extension)
cs <- load_cytoset_from_fcs(list.files(outpath, pattern = "\\.fcs$", full.names = TRUE))


You can also use ncdfFlowSet, but cytoframe and cytoset basically replace it.


You could also of course add more descriptive names than dataset_n to the FCS for each data segment.


Thank you for the quick reply, I will try it out. And thank you for mentioning that cytoset replaces ncdfFlowSet. I've been going through the packages and I find it hard to understand which package (and command) replaces which older packages and commands.

I've installed the packages via BiocManager. But I had to use R 3.3.3, which probably means I am not using the latest Bioconductor. I will look into upgrading to R 4.0.

The current versions that are installed:

flowCore 1.52.1
flowWorkspace 3.34.1
flowStats 3.44.0
openCyto 1.24.0


One more question, please. I currently cannot update to R 4.0, so instead I am using R 3.6.3 with Bioconductor 3.10. I also updated all the packages associated with this script, but load_cytoset_from_fcs is not available in flowWorkspace 3.34.1. Is there an alternative?


Sure, instead you can use read.FCS to split up the data segments and read.flowSet or read.ncdfFlowSet to read in the set of split files.

library(flowCore)  # read.FCS, write.FCS, read.flowSet
library(ncdfFlow)  # read.ncdfFlowSet

outpath <- tempdir()
ndataset <- 40
for (i in seq_len(ndataset)) {
  # Load each dataset as a flowFrame and write it out as its own FCS file
  fr <- read.FCS(multi_data_fcs_path, dataset = i)
  write.FCS(fr, file.path(outpath, paste0("dataset_", i, ".fcs")))
}
# Read the split FCS files back into a single flowSet or ncdfFlowSet
# (pattern is a regular expression, so anchor it on the file extension)
split_files <- list.files(outpath, pattern = "\\.fcs$", full.names = TRUE)
fs <- read.flowSet(split_files)
nc <- read.ncdfFlowSet(split_files)


And apologies for the lack of clarity about deprecation. It's sort of a big transition period for our packages and we are still working on disentangling some old dependencies and documenting the deprecations. This is also complicated by the dependencies of other Bioconductor packages on flowCore for example.


Thank you for clarifying Jake!

With the suggested code I get the following error:

outpath <- tempdir()
ndataset <- 40
for(i in 1:ndataset){
  # Load in each dataset as a flowFrame and write it out as its own FCS
  fr <- read.FCS("2019-10-18_at_01-23-01pm.fcs", dataset = i)
  write.FCS(fr, file.path(outpath, paste0("dataset_", i, ".fcs")))
}

Error in fcsTextParse(txt, emptyValue = emptyValue) :
Empty keyword name detected!If it is due to the double delimiters in keyword value, please set emptyValue to FALSE and try again!


I can set the emptyValue to FALSE, but was wondering what are the larger implications of using this setting?

Thank you again for all your help!!


It's there as a workaround option to change the behavior of handling adjacent delimiters. Default behavior (emptyValue=TRUE) treats them as surrounding an empty keyword value. But adjacent delimiters can occur for other reasons, for example where \ is used as both a delimiter and to denote an escape sequence. There is logic to handle restoring appropriate double delimiters on empty keyword values when necessary even in the case of emptyValue=FALSE, however. It's just a slower process. Anyway, if parsing completes successfully with emptyValue=FALSE, you should be fine, but you can always check the output of keyword to make sure everything is being parsed in appropriately.
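As a rough illustration of "check the output of keyword", here is a minimal sketch. The helper find_suspect_keywords is hypothetical (it is not part of flowCore) and only flags entries whose name or value is empty, on the assumption that a mis-parsed TEXT segment tends to show up that way. The read.FCS call is guarded so the block still runs without flowCore or the file present.

```r
# Hypothetical helper (not a flowCore function): flag keyword entries whose
# name or value is empty, which may hint at a parsing problem.
find_suspect_keywords <- function(kw) {
  nm <- names(kw)
  if (is.null(nm)) nm <- rep("", length(kw))
  empty_name <- is.na(nm) | !nzchar(trimws(nm))
  empty_value <- vapply(kw, function(v) {
    length(v) == 0L ||
      (is.character(v) && length(v) == 1L && !nzchar(trimws(v)))
  }, logical(1))
  list(
    empty_names  = which(empty_name),                # positions with no keyword name
    empty_values = nm[empty_value & !empty_name]     # named keywords with no value
  )
}

# File name taken from the post above; adjust to your own file.
fcs_path <- "2019-10-18_at_01-23-01pm.fcs"
if (requireNamespace("flowCore", quietly = TRUE) && file.exists(fcs_path)) {
  fr <- flowCore::read.FCS(fcs_path, dataset = 1, emptyValue = FALSE)
  print(find_suspect_keywords(flowCore::keyword(fr)))
}
```

Empty values are not necessarily wrong (some instruments write them on purpose), so treat the result as a list of things to eyeball, not as an error report.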


Is there a way to combine the fcs files back into a single file as individual data segments?


It would be better to start a new question, but here are some clues. While this operation is usually referred to as concatenation, you can find it under coerce in flowCore: http://rglab.github.io/flowCore/reference/coerce.html There is also other code, such as the concatenate_fcs_files function from Premessa: https://github.com/ParkerICI/premessa/blob/467d64150297d83832c3960750ac4792c99153fd/R/fcs_io.R#L129 Or custom code at https://gist.github.com/soh Best


I was looking to automate this in a function that can figure out ndataset from the .fcs file directly.

The solution in this post worked for me! - Getting number of datasets with flowCore
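For anyone who cannot follow the link, one possible approach in base R is sketched below. This is my own assumption about how it could be done, not necessarily the solution from the linked post. It relies on the FCS convention that each data set's TEXT segment carries a $NEXTDATA keyword giving the byte offset from the start of the current data set to the next one (0 for the last). The function name count_fcs_datasets is made up, and the sketch assumes the keyword is written in upper case and that the file writes $NEXTDATA correctly, which not every instrument does.

```r
# Hypothetical helper: count data sets by following the $NEXTDATA chain.
count_fcs_datasets <- function(path, max_datasets = 10000L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  start <- 0   # byte offset of the current data set's HEADER
  n <- 0L
  while (n < max_datasets) {
    seek(con, where = start)
    header <- readBin(con, what = "raw", n = 58L)
    if (length(header) < 58L || !identical(header[1:3], charToRaw("FCS"))) break
    n <- n + 1L
    # HEADER bytes 11-18 and 19-26 (1-based) hold the TEXT start/end offsets
    text_start <- suppressWarnings(as.numeric(trimws(rawToChar(header[11:18]))))
    text_end   <- suppressWarnings(as.numeric(trimws(rawToChar(header[19:26]))))
    if (is.na(text_start) || is.na(text_end) || text_end < text_start) break
    seek(con, where = start + text_start)
    text_raw <- readBin(con, what = "raw", n = text_end - text_start + 1)
    # Replace NUL and non-ASCII bytes so string functions are safe
    bad <- text_raw == as.raw(0) | text_raw > as.raw(127)
    text_raw[bad] <- charToRaw("?")
    text <- rawToChar(text_raw)
    delim <- substr(text, 1, 1)   # first TEXT character is the delimiter
    key <- paste0(delim, "$NEXTDATA", delim)
    pos <- as.integer(regexpr(key, text, fixed = TRUE))
    if (pos < 0L) break
    rest <- substring(text, pos + nchar(key))
    end <- as.integer(regexpr(delim, rest, fixed = TRUE))
    value <- if (end > 0L) substr(rest, 1, end - 1L) else rest
    next_offset <- suppressWarnings(as.numeric(trimws(value)))
    if (is.na(next_offset) || next_offset <= 0) break
    start <- start + next_offset
  }
  n
}
```

The result could then replace the hard-coded value in the loops above, e.g. ndataset <- count_fcs_datasets(multi_data_fcs_path).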