read.FCS fails on BioRad FCS files
Robert Baer ▴ 70
@robert-baer-4660
United States

We recently got a BioRad S3 cell sorter which saves files in .fcs format.  I used the following R code in R 3.2.2 to try to read this file:

library('flowCore')  #version 1.11.20
# Set directory
setwd('R:/Publishing/Baer Lab/FACS/2015-09-18')

x = read.FCS(filename = "CG sort.fcs" , transformation = FALSE)

------------------------------------------

The result is an error that reads as follows:

Error in readFCSdata(con, offsets, txt, transformation, which.lines, scale,  : 
  $PnR is larger than the integer limit: 4294967296
In addition: Warning message:
In readFCSdata(con, offsets, txt, transformation, which.lines, scale,  :
  NAs introduced by coercion to integer range

-------------------------------------------

For what it's worth, the FCS version (judging from opening the file in a browser :( ) appears to be 3.1.

 

It is not clear to me what might be happening.  Is this a coding problem?  a flowCore problem?  an FCS integrity or version problem?

I would appreciate any insight into how to debug or better yet overcome this situation.  If this is a lost cause are there other packages for working with FCS data?

 

Thanks,

 

Rob Baer

> R.Version()

$platform
[1] "x86_64-w64-mingw32"

$arch
[1] "x86_64"

$os
[1] "mingw32"

$system
[1] "x86_64, mingw32"

$status
[1] ""

$major
[1] "3"

$minor
[1] "2.2"

$year
[1] "2015"

$month
[1] "08"

$day
[1] "14"

$`svn rev`
[1] "69053"

$language
[1] "R"

$version.string
[1] "R version 3.2.2 (2015-08-14)"

$nickname
[1] "Fire Safety"

Jiang, Mike ★ 1.3k
@jiang-mike-4886

As the error states, the TEXT section of this FCS file declares an integer range ($PnR) that is too big for R's integer type (type '.Machine$integer.max' to see the largest integer R can represent).

Try to instruct the sorter to save FCS in 'numeric' (or 'float/double') instead of 'integer'.
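
The overflow can be reproduced in base R without the FCS file. This is only a minimal sketch of the coercion that fails, not flowCore's actual code; the variable names are made up for illustration, and the string is the $PnR value quoted in the error message.

```r
# Largest value R's 32-bit signed integer type can hold (2^31 - 1)
int_max <- .Machine$integer.max

# The $PnR value from the error message, as text (keyword values are stored as text)
pnr_text <- "4294967296"

# Coercing to integer overflows; R returns NA and normally warns
# "NAs introduced by coercion to integer range"
pnr_int <- suppressWarnings(as.integer(pnr_text))

print(int_max)
print(is.na(pnr_int))
```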


Thanks.  That is at least a start toward a possible resolution.  

I don't know yet what the save options are in the Biorad (ProSort) software, but I'll check it out.  You seem to be telling me that the solution lies on the export side, not the import side.  This implies that the FCS format saves variable types as well as variable values.  I guess that I'll have to delve deeper into the FCS specification too :-(

FYI on Windows 8.1 on an I7 processor with 64-bit R 3.2.2, it seems my integers are indeed too small:

> .Machine$integer.max
[1] 2147483647

 

 


Right, I suppose Biorad should allow you to select the value type (between 'int' and 'float') when you are exporting/saving your FCS.

Jiang, Mike ★ 1.3k
@jiang-mike-4886

Robert,

I've pushed the fix to flowCore 1.34.11.  It now supports uint32 values larger than 2^31-1. Let me know if it works.

Mike

Robert Baer ▴ 70
@robert-baer-4660
United States

Follow-up

The ProSort software does not support float/double saving.

I have investigated the headers of these ProSort FCS files and can provide more detail on what I think is happening. flowCore and read.FCS() apparently have no difficulty reading the data; the difficulty is in reading the header.  As per the FCS 3.1 spec, integers are "unsigned integers", not signed integers. Biorad ProSort therefore specifies the max possible value in $PnR as 2^32, not 2^31-1.  This value appears in the header section in three places, and the error seems to come from converting that ASCII header value (an unsigned integer) into R's internal signed integer representation.

Is there a reason that this ASCII value is converted to integer rather than numeric?   I'm wondering why the data type used by an export program should affect R's internal representation of the data.  Couldn't that always be numeric without violating the FCS 3.1 specification or altering user expectations?
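
To illustrate the question: R's numeric (double) type represents every whole number up to 2^53 exactly, so a $PnR of 2^32 survives a text-to-numeric conversion unchanged. The helper below, parse_pnr, is hypothetical and written only for this post; it is not a flowCore function.

```r
# Hypothetical helper: parse a $PnR keyword value as double instead of integer
parse_pnr <- function(txt) {
  val <- as.numeric(txt)
  # Reject anything that is not a whole number exactly representable as a double
  if (is.na(val) || val != floor(val) || val > 2^53) {
    stop("not a usable $PnR value: ", txt)
  }
  val
}

r <- parse_pnr("4294967296")
print(r == 2^32)
# 2^32 and 2^32 - 1 remain distinct values, so no precision is lost here
print(r - 1 == 4294967295)
```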

# ====================================================================================================
#  From the header section of Biorad ProSort FCS file (ver 3.1)
#
# $DATATYPE|I|
#   $MODE|L|
#   ...
# $P1N|TIME_MSW|$P1S|TIME-|$P1B|32|$P1R|4294967296|$P1E|0,0|$P1G|0|$P1CALIBATION|41974.00,|
# $P2N|TIME_LSW|$P2S|TIME-|$P2B|32|$P2R|4294967296|$P2E|0,0|$P2G|0|$P2CALIBATION|41974.00,|
#   ...
# $P21N|SORT|$P21S|SORT|$P21B|32|$P21R|4294967296|$P21E|0,0|$P21G|0|
#  ==================================================================================================== 
# R code to find  out what is the situation R-wise

library('flowCore')
x2 = read.FCS(filename = 'CG FL2 highest peak partial save.fcs')
# Error in readFCSdata(con, offsets, txt, transformation, which.lines, scale,  : 
#   $PnR is larger than the integer limit: 4294967296
# In addition: Warning message:
# In readFCSdata(con, offsets, txt, transformation, which.lines, scale,  :
#   NAs introduced by coercion to integer range

as.integer(.Machine$integer.max)
# [1] 2147483647
mode(as.integer(.Machine$integer.max))
# [1] "numeric"
2^31
# [1] 2147483648
# Now shift bits left 1
as.integer(.Machine$integer.max)*2
# [1] 4294967294
2^32
# [1] 4294967296

#============================================================================================================
#  From Spidlen et al. Cytometry A. 2010 Jan;77(1):97-100. doi: 10.1002/cyto.a.20825.
#  see more directly: http://isac-net.org/PDFS/90/9090600d-19be-460d-83fc-f8a8b004e0f9.pdf
# Specification Excerpts:
# "All keyword values are encoded in UTF-8."
#
# $PnR: "Range for parameter number n."  p 10/34 of .pdf
#
# "$DATATYPE/I/ means that the events are written as unsigned binary integers."  p 14/34 of .pdf
#                                                     ======                  ========
#===========================================================================================================
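
For anyone who wants to check their own files, here is a rough base-R sketch of how the $PnR values could be pulled out of a TEXT segment and compared against R's integer limit. The string is a shortened copy of the excerpt above. The sketch assumes '|' is the delimiter and never appears inside a value (the spec allows escaped delimiters, which this ignores), and all variable names are made up for illustration.

```r
# Shortened copy of the keyword/value pairs shown in the header excerpt
text_segment <- paste0(
  "$P1N|TIME_MSW|$P1S|TIME-|$P1B|32|$P1R|4294967296|$P1E|0,0|$P1G|0|",
  "$P21N|SORT|$P21B|32|$P21R|4294967296|"
)

# Split on the delimiter; fields alternate keyword, value, keyword, value, ...
fields <- strsplit(text_segment, "|", fixed = TRUE)[[1]]
keywords <- setNames(fields[c(FALSE, TRUE)], fields[c(TRUE, FALSE)])

# Keep only the $PnR keywords and convert their values to numeric (double)
range_keys <- grep("^\\$P[0-9]+R$", names(keywords), value = TRUE)
ranges <- setNames(as.numeric(keywords[range_keys]), range_keys)

# Flag every parameter whose declared range exceeds R's signed integer limit
too_big <- ranges > .Machine$integer.max
print(ranges)
print(too_big)
```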

 

 

Jiang, Mike ★ 1.3k
@jiang-mike-4886

First of all, the maximum unsigned integer for 32 bits is 2^32-1. Secondly, R doesn't support unsigned integers, so the maximum value has to be <= 2^31-1.  Given that 'PnR' is already out of range, it is right for the API to be concerned about the data values stored in the FCS. (In fact, we could read data values incorrectly, as negative values, if some of them are indeed out of range.)

So if the data values are guaranteed to be within the range c(0, 2^32-1), ProSort should be instructed to fill 'PnR' with the correct range instead of the current '2^32'.
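
The negative-value problem can be shown with base R's readBin(). This is only a sketch of the general issue and of one common workaround (reading each 32-bit value as two unsigned 16-bit halves and recombining them as doubles); it is not necessarily how flowCore handles it. The bytes below are made-up little-endian test values, not data from the ProSort file.

```r
# Two little-endian uint32 values: 4294967295 (0xFFFFFFFF) and 3000000000 (0xB2D05E00)
raw_bytes <- as.raw(c(0xff, 0xff, 0xff, 0xff,
                      0x00, 0x5e, 0xd0, 0xb2))

# Reading them as 4-byte integers interprets the top bit as a sign bit,
# so both values come back negative
as_signed <- readBin(raw_bytes, what = "integer", n = 2, size = 4,
                     endian = "little")

# Workaround: readBin() supports signed = FALSE for 2-byte integers, so read
# four unsigned 16-bit halves and combine each low/high pair as a double
halves <- readBin(raw_bytes, what = "integer", n = 4, size = 2,
                  signed = FALSE, endian = "little")
low  <- halves[c(TRUE, FALSE)]
high <- halves[c(FALSE, TRUE)]
as_unsigned <- high * 65536 + low

print(as_signed)
print(format(as_unsigned, scientific = FALSE))
```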

 

Robert Baer ▴ 70
@robert-baer-4660
United States

Thanks again.  I have contacted ProSort support to see what I can get accomplished.  Your patience with my questions is greatly appreciated.  I have referenced this discussion for their edification.

 

 

Robert Baer ▴ 70
@robert-baer-4660
United States

Beautiful!

I've now installed flowCore version 1.35.12, and the original BioRad S3 cell sorter file appears to read seamlessly.   Thanks for making your package so accessible.  We are grateful for your contributions.

Biorad tech support reports to me that they too are now able to read files with flowCore, and that they have found a "work-around for this edge condition".  I have not had an opportunity to try their update yet.

To be clear, the file I verified as readable with flowCore 1.35.12 was created previously (before any software changes by Biorad), meaning your fix alone helped me.  Thanks for providing the robust fix.
