Mark Cowley
Dear list,
I'd like to be able to parse Illumina gene expression IDAT files, and
I've been playing with the crlmm:::readIDAT function, which is
designed to read Illumina Infinium IDAT files. The function dies at
around the ninth line because 'nFields' is a very large negative
number (see below). I'm trying to read a
MouseRef-8_V2_0_R1_11278551_A.bgx.xml type of array, but I would like
to be able to read all types of gene expression arrays.
Here is the output that I get:
library(ff)
library(crlmm)
f <- "4687778079_A_Grn.idat"
debug(crlmm:::readIDAT)
crlmm:::readIDAT(f)
#<snip>
Browse[2]>
debug: fileSize <- file.info(idatFile)$size
Browse[2]>
debug: tempCon <- file(idatFile, "rb")
Browse[2]>
debug: prefixCheck <- readChar(tempCon, 4)
Browse[2]>
debug: if (prefixCheck != "IDAT") {
}
Browse[2]> prefixCheck
[1] "IDAT"
Browse[2]>
debug: NULL
Browse[2]>
debug: versionNumber <- readBin(tempCon, "integer", n = 1, size = 8, endian = "little", signed = FALSE)
Browse[2]>
debug: nFields <- readBin(tempCon, "integer", n = 1, size = 4, endian = "little", signed = FALSE)
Browse[2]> versionNumber
[1] 1
Browse[2]>
debug: fields <- matrix(0, nFields, 3)
Browse[2]> nFields
[1] -1398219826
Browse[2]>
Error in matrix(0, nFields, 3) : invalid 'nrow' value (< 0)
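One thing worth noting about the negative value: the readBin help page says 'signed' is only used for integers of sizes 1 and 2, so a 4-byte read comes back as a signed 32-bit integer even with signed = FALSE. The sketch below shows how the same four bytes look when read as unsigned. read_uint32 is a hypothetical helper, not part of crlmm, and the demonstration uses in-memory bytes rather than a real IDAT file.

```r
# Hypothetical helper: read a little-endian unsigned 32-bit value.
# readBin() honours signed = FALSE only for sizes 1 and 2, so read two
# unsigned 16-bit halves and combine them into a double.
read_uint32 <- function(con) {
  halves <- readBin(con, "integer", n = 2, size = 2,
                    endian = "little", signed = FALSE)
  halves[1] + halves[2] * 65536
}

# Recreate the four bytes behind the nFields value from the transcript.
bytes <- writeBin(-1398219826L, raw(), size = 4, endian = "little")

# Reading them back as a 4-byte integer gives the signed interpretation.
signed_value <- readBin(bytes, "integer", n = 1, size = 4, endian = "little")

# Reading them as two unsigned halves gives the unsigned interpretation.
unsigned_value <- read_uint32(bytes)

print(signed_value)
print(format(unsigned_value, scientific = FALSE))
```

Either way the value is far too large to be a plausible field count, which suggests that the bytes at this offset are not a field count in this file.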
I've also come across the illumina.py file within the glu-genetics
project on Google Code, which as far as I can tell is Python code for
parsing Illumina arrays, based on this crlmm code. Between crlmm's
code and the glu-genetics code, I gather that the readIDAT function only
reads IDAT version 3 files, whereas I'm pretty sure mine are IDAT
version 1 (as indicated by the versionNumber value above).
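A minimal sketch of checking the version before attempting a full parse is given here. idat_version is a hypothetical helper, not a crlmm function, and it assumes only the layout implied by the debug output above: a 4-byte "IDAT" prefix followed by an 8-byte little-endian version number. The check runs on a synthetic header so that it is self-contained.

```r
# Hypothetical helper: report the version stored in an IDAT header.
# Assumes a 4-byte "IDAT" prefix followed by an 8-byte little-endian
# version number, as in the readBin() call shown in the debug output.
idat_version <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  prefix <- readChar(con, 4)
  if (!identical(prefix, "IDAT")) {
    stop("not an IDAT file: ", path)
  }
  readBin(con, "integer", n = 1, size = 8, endian = "little")
}

# Build a synthetic header (not a real IDAT file) with version 1.
tmp <- tempfile(fileext = ".idat")
out <- file(tmp, "wb")
writeBin(charToRaw("IDAT"), out)
writeBin(as.raw(c(1, 0, 0, 0, 0, 0, 0, 0)), out)
close(out)

version <- idat_version(tmp)
if (version != 3) {
  message("IDAT version ", version,
          "; crlmm:::readIDAT appears to expect version 3")
}
unlink(tmp)
```

On a real file the call would be idat_version("4687778079_A_Grn.idat"); stopping early on an unexpected version gives a clearer message than the matrix() error.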
I don't know whether Infinium IDATs are indeed a different version from
gene expression IDATs, but I was hoping someone could point me in the
right direction. Does anyone have a parser for generic IDAT files, or
does anyone know how to reverse engineer binary files?
cheers,
Mark
----------------------------------------------------------------------
Mark Cowley, PhD
Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research
----------------------------------------------------------------------
sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-apple-darwin9.8.0
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] crlmm_1.6.2 oligoClasses_1.10.0 Biobase_2.8.0 ff_2.1-2 bit_1.1-4
loaded via a namespace (and not attached):
[1] affyio_1.16.0 annotate_1.26.0 AnnotationDbi_1.10.1 Biostrings_2.16.2
[5] DBI_0.2-5 ellipse_0.3-5 genefilter_1.30.0 IRanges_1.6.4
[9] mvtnorm_0.9-9 preprocessCore_1.10.0 RSQLite_0.9-0 splines_2.11.0
[13] survival_2.35-8 xtable_1.5-6