Bug in limma / illuminaio? Otherwise a small feature request...
Guido Hooiveld ★ 3.9k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …

I seem to have found a small bug in limma or illuminaio. If the behavior shown below is intended, please allow me to put forward a (small) feature request:

the ability to read gzip-compressed IDAT files directly, just as is already possible for compressed BGX files.

The raw data files made available at the Gene Expression Omnibus (GEO) are always gzip-compressed. Being able to read these compressed files directly with limma (through illuminaio?) would make life slightly more comfortable. Compressed BGX files are already handled, but this does not seem to be the case for IDAT files; hence my question.

 

Thanks for considering.

Guido

 

# Example (from: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80081)
# I manually downloaded the raw data file (GSE80081_RAW.tar), and extracted it.
# The result is a set of compressed files.

> library(limma)
>  dir()
[1] "GPL6887_MouseWG-6_V2_0_R0_11278593_A.bgx.gz"
[2] "GPL6887_MouseWG-6_V2_0_R3_11278593_A.txt.gz"
[3] "GSM2112545_9482914014_A_Grn.idat.gz"     
[4] "GSM2112546_9482914014_B_Grn.idat.gz"     
[5] "GSM2112547_9482914014_C_Grn.idat.gz"     
[6] "GSM2112548_9482914014_D_Grn.idat.gz"     
[7] "GSM2112549_9482914014_E_Grn.idat.gz"     
[8] "GSM2112550_9482914014_F_Grn.idat.gz"
> bgxfile = dir(pattern="bgx")
> idatfiles = dir(pattern="idat")
>
> x <- read.idat(idatfiles, bgxfile)
Reading manifest file GPL6887_MouseWG-6_V2_0_R0_11278593_A.bgx.gz ... Done

         GSM2112545_9482914014_A_Grn.idat.gz ... Error in dataChunks[[i]] : subscript out of bounds

> traceback()
4: strsplit(dataChunks[[i]], "\\\"")
3: readIDAT_enc(file)
2: illuminaio::readIDAT(idatfiles[j])
1: read.idat(idatfiles, bgxfile)
>
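A quick way to see up front which files are still gzip-compressed, regardless of their extension, is to look for the two-byte gzip magic number (0x1f 0x8b) at the start of each file. The sketch below uses base R only; `is_gzipped` is a hypothetical helper name and not part of limma or illuminaio.

```r
# Hypothetical helper (not part of limma or illuminaio):
# TRUE if the file starts with the gzip magic number 0x1f 0x8b.
is_gzipped <- function(path) {
  # raw = TRUE asks R not to attempt any transparent decompression
  con <- file(path, open = "rb", raw = TRUE)
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

idatfiles  <- dir(pattern = "idat")
compressed <- vapply(idatfiles, is_gzipped, logical(1))
idatfiles[compressed]   # files that would still need decompressing
```

In the directory listed above, this would be expected to flag all six `.idat.gz` files.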

 

# After manually decompressing the IDAT files (only; the BGX file stays compressed), it works fine.

> dir()
[1] "GPL6887_MouseWG-6_V2_0_R0_11278593_A.bgx.gz"
[2] "GSM2112545_9482914014_A_Grn.idat"          
[3] "GSM2112546_9482914014_B_Grn.idat"          
[4] "GSM2112547_9482914014_C_Grn.idat"          
[5] "GSM2112548_9482914014_D_Grn.idat"          
[6] "GSM2112549_9482914014_E_Grn.idat"          
[7] "GSM2112550_9482914014_F_Grn.idat"          
> bgxfile = dir(pattern="bgx")
> idatfiles = dir(pattern="idat")
>
> x <- read.idat(idatfiles, bgxfile)
Reading manifest file GPL6887_MouseWG-6_V2_0_R0_11278593_A.bgx.gz ... Done

         GSM2112545_9482914014_A_Grn.idat ... Done
         GSM2112546_9482914014_B_Grn.idat ... Done
         GSM2112547_9482914014_C_Grn.idat ... Done
         GSM2112548_9482914014_D_Grn.idat ... Done
         GSM2112549_9482914014_E_Grn.idat ... Done
         GSM2112550_9482914014_F_Grn.idat ... Done
Finished reading data.

>
> x.norm <- neqc(x)
>
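Until compressed IDAT files are supported directly, the manual extraction step can also be scripted from within R. The following is only a minimal sketch using base R connections; `gunzip_to` is a hypothetical helper name, and writing the decompressed copy next to the original file is an assumption made to mimic the manual extraction above. If the R.utils package is installed, its `gunzip()` function should serve the same purpose.

```r
# Hypothetical helper: decompress one .gz file with base R only.
# By default the output is written next to the input, minus the ".gz" suffix.
gunzip_to <- function(gzpath, outdir = dirname(gzpath)) {
  outpath <- file.path(outdir, sub("\\.gz$", "", basename(gzpath)))
  input <- gzfile(gzpath, open = "rb")
  on.exit(close(input), add = TRUE)
  output <- file(outpath, open = "wb")
  on.exit(close(output), add = TRUE)
  repeat {
    # copy in chunks of up to 1e6 bytes so large files are not held in memory at once
    chunk <- readBin(input, what = "raw", n = 1e6)
    if (length(chunk) == 0L) break
    writeBin(chunk, output)
  }
  outpath
}

# Decompress every *.idat.gz in the working directory, then read as before.
gzfiles   <- dir(pattern = "\\.idat\\.gz$")
idatfiles <- vapply(gzfiles, gunzip_to, character(1), USE.NAMES = FALSE)
# bgxfile <- dir(pattern = "bgx")
# x <- limma::read.idat(idatfiles, bgxfile)   # same call as in the session above
```

The anchored pattern `"\\.idat\\.gz$"` is used because `dir(pattern = "idat")` would match both the compressed and the decompressed copies once they sit in the same directory.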

 

> sessionInfo()
R version 3.4.0 Patched (2017-05-10 r72670)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252  
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                         
[5] LC_TIME=English_United States.1252   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] limma_3.32.2

loaded via a namespace (and not attached):
[1] compiler_3.4.0    base64_2.0        illuminaio_0.18.0 openssl_0.9.6   
>

 

Matthew Ritchie ▴ 1000
@matthew-ritchie-650
Australia

Thanks for the suggestion, Guido. I will look into this.
