Reading multiple data segments from FCS 3.0 file
alavy ▴ 20

Hi, I am trying to read an FCS 3.0 file that contains multiple samples taken from several plates using read.ncdfFlowSet. The warning message I get is:

1: The file contains 48 additional data segments.
The default is to read the first segment only.
Please consider setting the 'dataset' argument.

After reading the flowCore documentation, I understand that I need to specify which data segment I want to read. However, I also know that I can read multiple flowFrames at once if each of them is saved as its own file.

So I was wondering, is there a way to either:
1. Load all the samples in the FCS 3.0 file at once into a flowSet?
2. If not, split the FCS file into individual files, so that they can be read into a single flowSet?

Thank you,

multiple data segments in a single file

Jake Wagner ▴ 280

Are you using the most recent versions of the packages (available from Bioconductor 3.11 or GitHub)? The easiest way is probably your second approach using load_cytoframe_from_fcs from flowWorkspace. This is just a sketch using an FCS file with 40 data segments. Replace multi_data_fcs_path and outpath to suit your needs.

library(flowWorkspace)  # load_cytoframe_from_fcs(), load_cytoset_from_fcs()
library(flowCore)       # write.FCS()

outpath <- tempdir()
ndataset <- 40
for (i in seq_len(ndataset)) {
  # Load each data segment as a cytoframe and write it out as its own FCS file
  cf <- load_cytoframe_from_fcs(multi_data_fcs_path, dataset = i)
  write.FCS(cf, file.path(outpath, paste0("dataset_", i, ".fcs")))
}
# Read the split FCS files back into a single cytoset
# (list.files() takes a regular expression, not a glob)
cs <- load_cytoset_from_fcs(list.files(outpath, pattern = "\\.fcs$", full.names = TRUE))

You can also use ncdfFlowSet, but cytoframe and cytoset basically replace it.


You could also of course add more descriptive names than dataset_n to the FCS for each data segment.
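As a rough sketch of that idea, the helper below builds a file name from the keywords of each data segment. The function name `segment_filename` and the choice of `$SMNO` and `$SRC` as candidate fields are assumptions for illustration; which keyword actually identifies a sample depends on the instrument that wrote the file.

```r
# Hypothetical helper (not part of flowCore): build a file name for data
# segment `i` from a named keyword list such as keyword(fr).
# The candidate fields are an assumption; adjust them to your instrument.
segment_filename <- function(kw, i, fields = c("$SMNO", "$SRC")) {
  values <- unlist(lapply(fields, function(f) {
    v <- kw[[f]]
    if (is.null(v)) return(NULL)
    v <- trimws(as.character(v)[1])
    if (is.na(v) || !nzchar(v)) NULL else v
  }))
  # Fall back to a generic label when none of the fields is usable
  label <- if (length(values) > 0L) values[[1]] else "dataset"
  # Keep only characters that are safe in file names
  label <- gsub("[^A-Za-z0-9_-]+", "_", label)
  # Zero-pad the index so the files sort in segment order
  sprintf("%03d_%s.fcs", i, label)
}

# Possible use inside the loop (requires flowCore):
# write.FCS(fr, file.path(outpath, segment_filename(keyword(fr), i)))
```

Zero-padding the index also keeps list.files() from returning segment 10 before segment 2.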


Thank you for the quick reply, I will try it out. And thank you for mentioning that cytoset replaces ncdfFlowSet. I've been going through the packages and I find it hard to understand which package (and command) replaces which older packages and commands.

I installed the packages via BiocManager, but I had to use R 3.3.3, which probably means I am not using the latest Bioconductor. I will look into upgrading to R 4.0.

The current versions that are installed:

flowCore 1.52.1
flowWorkspace 3.34.1
flowStats 3.44.0
openCyto 1.24.0


One more question, please. I currently cannot update to R 4.0, so instead I am using R 3.6.3 with Bioconductor 3.10. I also updated all the packages associated with this script, but load_cytoset_from_fcs is not available in flowWorkspace 3.34.1. Is there an alternative?


Sure, instead you can use read.FCS to split up the data segments and read.flowSet or read.ncdfFlowSet to read in the set of split files.

library(flowCore)  # read.FCS(), write.FCS(), read.flowSet()
library(ncdfFlow)  # read.ncdfFlowSet()

outpath <- tempdir()
ndataset <- 40
for (i in seq_len(ndataset)) {
  # Load each data segment as a flowFrame and write it out as its own FCS file
  fr <- read.FCS(multi_data_fcs_path, dataset = i)
  write.FCS(fr, file.path(outpath, paste0("dataset_", i, ".fcs")))
}
# Read the split FCS files back into a single flowSet or ncdfFlowSet
# (list.files() takes a regular expression, not a glob)
split_files <- list.files(outpath, pattern = "\\.fcs$", full.names = TRUE)
fs <- read.flowSet(split_files)
nc <- read.ncdfFlowSet(split_files)

And apologies for the lack of clarity about deprecation. It's sort of a big transition period for our packages and we are still working on disentangling some old dependencies and documenting the deprecations. This is also complicated by the dependencies of other Bioconductor packages on flowCore for example.


Thank you for clarifying Jake!

With the suggested code I get the following error:

outpath <- tempdir()
ndataset <- 40
for(i in 1:ndataset){
  # Load in each dataset as a flowFrame and write it out as its own FCS
  fr <- read.FCS("2019-10-18_at_01-23-01pm.fcs", dataset = i)
  write.FCS(fr, file.path(outpath, paste0("dataset_", i, ".fcs")))
}

Error in fcsTextParse(txt, emptyValue = emptyValue) : 
  Empty keyword name detected!If it is due to the double delimiters in keyword value, please set emptyValue to FALSE and try again!

I can set the emptyValue to FALSE, but was wondering what are the larger implications of using this setting?

Thank you again for all your help!!


It's there as a workaround option to change how adjacent delimiters are handled. The default behavior (emptyValue=TRUE) treats them as surrounding an empty keyword value. But adjacent delimiters can occur for other reasons, for example where \ is used both as the delimiter and to denote an escape sequence. Even with emptyValue=FALSE, there is logic to restore the appropriate double delimiters on empty keyword values when necessary; it's just a slower process. Anyway, if parsing completes successfully with emptyValue=FALSE, you should be fine, but you can always check the output of keyword to make sure everything is being parsed appropriately.
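One way to check the keyword output is a quick sanity test on a few standard keywords. The sketch below is only an illustration: `missing_keywords` is a hypothetical helper, not a flowCore function, and it assumes that `$TOT`, `$PAR` and the `$PnN` parameter names are the keywords worth checking first.

```r
# Hypothetical helper (not part of flowCore): return the standard keywords
# that are absent or blank in a named keyword list such as keyword(fr).
missing_keywords <- function(kw) {
  present <- function(k) {
    v <- kw[[k]]
    !is.null(v) && nzchar(trimws(as.character(v)[1]))
  }
  required <- c("$TOT", "$PAR")
  if (present("$PAR")) {
    n_par <- suppressWarnings(as.integer(kw[["$PAR"]]))
    if (!is.na(n_par)) {
      # One $PnN (parameter name) keyword is expected per parameter
      required <- c(required, sprintf("$P%dN", seq_len(n_par)))
    }
  }
  required[!vapply(required, present, logical(1))]
}

# Possible use (requires flowCore):
# fr <- read.FCS("2019-10-18_at_01-23-01pm.fcs", dataset = 1, emptyValue = FALSE)
# missing_keywords(keyword(fr))   # character(0) means nothing obvious is missing
```

An empty result does not prove that every value was parsed correctly, so it is still worth reading through keyword(fr) for at least one data segment.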


If I would like to automate this in a function, is there any way or function that can figure out ndataset from the .fcs file directly? I guess this is a precursor to the first approach mentioned in the question - some kind of vectorized way to read in multiple data segments into a flowSet.
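The thread does not name a function for this. One possible approach, sketched here in base R without flowCore, is to walk the file's HEADER and TEXT segments and follow the `$NEXTDATA` keyword until it is 0. The function name `count_fcs_datasets` is hypothetical, and the sketch assumes an ASCII TEXT segment, the FCS 3.0 header layout (TEXT start and end in bytes 10 to 25), and that `$NEXTDATA` is an offset relative to the start of the current data set. It has only been reasoned through on a synthetic file, not on real instrument output.

```r
# Hypothetical helper (not part of flowCore): count the data sets in an FCS
# file by following the $NEXTDATA keyword from one data set to the next.
count_fcs_datasets <- function(path, max_datasets = 10000L) {
  con <- file(path, open = "rb")
  on.exit(close(con), add = TRUE)
  offset <- 0   # byte offset of the current data set
  n <- 0L
  while (n < max_datasets) {
    seek(con, where = offset, origin = "start")
    header <- readChar(con, nchars = 58, useBytes = TRUE)
    if (length(header) == 0L || !startsWith(header, "FCS")) {
      stop("No FCS header found at byte offset ", offset)
    }
    n <- n + 1L
    # Header fields are 8-character, right-justified ASCII integers
    text_start <- as.numeric(substr(header, 11, 18))
    text_end   <- as.numeric(substr(header, 19, 26))
    seek(con, where = offset + text_start, origin = "start")
    txt <- readChar(con, nchars = text_end - text_start + 1, useBytes = TRUE)
    # The first character of the TEXT segment is the keyword delimiter
    delim <- substr(txt, 1, 1)
    tokens <- strsplit(txt, delim, fixed = TRUE)[[1]]
    idx <- which(toupper(tokens) == "$NEXTDATA")
    next_offset <- if (length(idx) > 0L) as.numeric(tokens[idx[1] + 1L]) else 0
    if (is.na(next_offset) || next_offset == 0) break
    offset <- offset + next_offset
  }
  n
}

# Possible use: replace the hard-coded value in the loops above
# ndataset <- count_fcs_datasets(multi_data_fcs_path)
```

For the file in the original question, the warning about 48 additional data segments implies 49 data sets in total, which is a useful cross-check for whatever count this returns.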
