Josef Spidlen
Hi Mike,
I agree that empty keyword values are illegal according to the FCS data
file standard. Unfortunately, several vendors break this rule (e.g.,
CELLQuest/FACSCalibur, Partec, Applied Biosystems / Attune).
Consequently, I agree with Kieran that it would be better if flowCore
"closed one eye" and allowed reading of those files.
Technically, I believe this can still be done while distinguishing
whether the <delimiter_char> is an actual delimiter or part of the
keyword value. When starting to read a keyword value, your parser could
distinguish the following states.
The stream with the keyword value right after reading the initiating
<delimiter_char> starts with:
1) <delimiter_char><delimiter_char>
means that the actual keyword value starts with <delimiter_char>
For example: "|$COM||| Delimiter starts my comment|"
(| is the <delimiter_char> in my examples)
2) <delimiter_char>x
where x is not a <delimiter_char> means that the vendor broke the
standard and saved a keyword with an empty value.
For example: "|$COM||$CYT|Partec PAS|"
I know, this only works assuming that no keyword name includes the
<delimiter_char> as part of the name. I believe that this is a safe
assumption after having seen many, many FCS files. In the example, this
"relaxed" interpretation would mean that there are two keywords, "$COM"
(empty value) and "$CYT" (value "Partec PAS"). A strict FCS-compatible
implementation reads this as a single keyword named "$COM|$CYT" with a
value of "Partec PAS".
3) x
where x is not a <delimiter_char> simply means that the keyword value
starts with character x.
For example: "|$COM|My comment|"
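The three states above can be sketched in a few lines of Python. This is
only an illustration of the idea, not flowCore's actual code: the names
read_escaped, parse_strict and parse_relaxed are made up for this
example, the input is assumed to be a non-empty TEXT segment whose first
character is the <delimiter_char>, and the "strict" variant assumes a
simple left-to-right reading in which a doubled delimiter is always an
escaped one.

```python
def read_escaped(text, i, delim):
    """Read one token starting at index i.

    A doubled delimiter is taken as a literal delimiter character.
    Returns (token, index just past the closing delimiter).
    """
    chars = []
    n = len(text)
    while i < n:
        if text[i] == delim:
            if i + 1 < n and text[i + 1] == delim:
                chars.append(delim)      # escaped delimiter
                i += 2
                continue
            return "".join(chars), i + 1  # closing delimiter
        chars.append(text[i])
        i += 1
    raise ValueError("no end found")


def parse_strict(text):
    """Strict reading: names and values may both contain escaped delimiters."""
    delim = text[0]
    i, pairs = 1, {}
    while i < len(text):
        key, i = read_escaped(text, i, delim)
        value, i = read_escaped(text, i, delim)
        pairs[key] = value
    return pairs


def parse_relaxed(text):
    """Relaxed reading: keyword names are assumed never to contain the delimiter."""
    delim = text[0]
    i, pairs = 1, {}
    n = len(text)
    while i < n:
        end = text.find(delim, i)
        if end == -1:
            raise ValueError("no end found")
        key, i = text[i:end], end + 1    # i is now right after the initiating delimiter
        is_delim = i < n and text[i] == delim
        next_is_delim = i + 1 < n and text[i + 1] == delim
        if is_delim and not next_is_delim:
            pairs[key] = ""              # state 2: empty value (vendor broke the standard)
            i += 1
        else:
            # state 1 (value starts with an escaped delimiter) or state 3 (ordinary value)
            pairs[key], i = read_escaped(text, i, delim)
    return pairs


print(parse_relaxed("|$COM||| Delimiter starts my comment|"))
print(parse_relaxed("|$COM||$CYT|Partec PAS|"))
print(parse_relaxed("|$COM|My comment|"))
print(parse_strict("|$COM||$CYT|Partec PAS|"))
```

With the second example, parse_relaxed yields two keywords ("$COM" with
an empty value and "$CYT" with "Partec PAS"), while parse_strict yields
the single keyword "$COM|$CYT".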
It comes down to the question of whether it is good practice to read
broken files, which essentially sends vendors the message that it is OK
to generate broken files. I hate that message, but in the end I think it
is even more important to make users happy, which is why I would argue
for changing flowCore to make it more relaxed as described.
FlowJo and some other tools took this path, which is greatly appreciated
by their users.
Best regards,
Josef
Btw., a minor correction to Kieran's note from another email: I have
only been involved in the FCS 3.1 revision and wasn't around in the 90s
when the FCS 3.0 standard was developed :-)
On 12-06-15 03:00 AM, bioconductor-request at r-project.org wrote:
> Date: Thu, 14 Jun 2012 13:32:37 -0700
> From: "Jiang, Mike" <wjiang2 at fhcrc.org>
> To: <bioconductor at r-project.org>
> Subject: Re: [BioC] [Bioc-devel] flowCore 1.22.0 broken for some FCS
> files (which it previously read without errors)
> Message-ID: <d780eac3ada31f488bca74eccd5b717e0875fb79 at isis.fhcrc.org>
> Content-Type: text/plain
>
> Kieran,
>
> I looked at your FCS, it has empty keyword value which does not
> conform to FCS 3.0 standard:
> "3.2.9 Keywords and keyword values must have lengths greater than
> zero." (http://murphylab.cbi.cmu.edu/FCSAPI/FCS3.html).
>
> Particularly, this occurs at $ENDSTEXT keyword-value pairs:
> "\\$ENDSTEXT\\\\$ETIM..."
> Which is "byte offset to end of the supplemental TEXT segment" and
> really shouldn't be empty (normally it is put as "0")
>
> And "\\" is used as delimiter here, FCS 3.0 allows delimiter appears
> in the keyword value or keyword name as long as it is "immediately
> followed by a second delimiter". So the characters "\\\\" after
> "$ENDSTEXT" keyword is misunderstood as part of "$ETIM" by the parser
> here, which further messed up the parsing of subsequent string. That
> is why the parser is reporting error.
>
> Originally, flowCore did not handle this delimiter issue properly. It
> might read FCS successfully with the incorrect keyword values without
> notifying the user. Now, we thought it may be helpful to throw the
> error and let user know the issue with the TEXT segment of FCS.
>
> I have attached the TEXT Segment of your FCS file.
>
> Let me know if you have questions.
>
> Thanks,
> Mike
>> >From: Kieran O'Neill <koneill at bccrc.ca>
>> >Subject: [Bioc-devel] flowCore 1.22.0 broken for some FCS files
>> >(which it previously read without errors)
>> >Date: June 13, 2012 3:53:17 PM PDT
>> >To: bioc-devel at r-project.org
>> >Hi all
>> >
>> >I just recently came back to a project I was previously working on,
>> >and found that the most recent version of flowCore, 1.22.0, no longer
>> >reads some of my FCS files (those generated by one instrument in
>> >particular).
>> >
>> >The error it gives is:
>> >
>> >Error in fcs_text_parse(txt) : ERROR! no end found
>> >
>> >Previous versions of flowCore had no trouble reading these files, and
>> >the current version seems to read most other FCS files I have from
>> >other instruments. However, since parsing FCS files into something
>> >usable in R is probably the most important functionality in the
>> >package, having it broken is rather bad.
>> >
>> >It is also quite frustrating for me, in that no previous version of
>> >flowCore works in the current version of R (2.15.0), so I would need
>> >to downgrade the whole of R in order to downgrade to a working version
>> >of flowCore to analyse these files.
>> >
>> >I would be happy to send a sample file for debugging if needed.
>> >
>> >Thanks,
>> >Kieran
>> >
>> >_______________________________________________
>> >Bioc-devel at r-project.org mailing list
>> >https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Josef Spidlen, Ph.D.
Terry Fox Laboratory, BC Cancer Agency
675 West 10th Avenue, V5Z 1L3 Vancouver, BC, Canada
Tel: +1 (604) 675-8000 x 7755
http://www.terryfoxlab.ca/people/rbrinkman/josef.aspx