Limma: extend reading of raw Illumina files to Gene Expression Omnibus formats
@5de73a99


# get raw data using GEOquery package
GEOquery::getGEOSuppFiles("GSE22247", makeDirectory=TRUE)
untar(tarfile = "./GSE22247/GSE22247_RAW.tar", exdir = "./GSE22247")

# 2 files gathered: RAW.tar containing raw Illumina expression for all samples
# and a .bgx.gz file containing probe information.

illumina_GSE22247 <- limma::read.ilmn(files = "GSE22247_non-normalized_data.txt.gz", path = "./GSE22247") # doesn't work
Error in readGenericHeader(fname, columns = expr, sep = sep) : 
  Specified column headings not found in file
bgx_GSE22247 <- illuminaio::readBGX("./GSE22247/GPL6947_HumanHT-12_V3_0_R1_11283641_A.bgx.gz")  # does work

illumina_GSE22247 <- limma::read.idat(idatfiles = "./GSE22247/GSE22247_non-normalized_data.txt.gz",
                                      bgxfile = "./GSE22247/GPL6947_HumanHT-12_V3_0_R1_11283641_A.bgx.gz") # doesn't work, as expected: the file is not an IDAT file
Reading manifest file ./GSE22247/GPL6947_HumanHT-12_V3_0_R1_11283641_A.bgx.gz ... Done
     ./GSE22247/GSE22247_non-normalized_data.txt.gz ... Error in illuminaio::readIDAT(idatfiles[j]) : 
  Cannot read IDAT file. File format error. Unknown magic: # Th
sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /softhpc/R/4.0.2/lib64/R/lib/libRblas.so
LAPACK: /softhpc/R/4.0.2/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

It would be great to add a feature to limma::read.ilmn (or perhaps a separate function) to handle this layout. Projects in which the regular probe files are separate from the control probe files aren't that common, whereas projects such as GSE22247, available through GEOquery, with a single raw expression file for all samples and a separate bgx (probe annotation) file, are quite common.

beadarray GSE22247 limma GEOquery IlluminaChip • 936 views
@gordon-smyth
WEHI, Melbourne, Australia

It is straightforward to read the expression values using limma, but you have to specify some arguments. limma can't entirely anticipate what will be in Illumina files on GEO, because there is no standardization of what people put in the data files they upload to GEO. You could read this particular data file by

x <- read.ilmn("GSE22247_non-normalized_data.txt.gz",probeid="ID_REF",expr="SAMPLE ",other=NULL)
Reading file GSE22247_non-normalized_data.txt.gz ... ...

or alternatively you could just use the standard R read function:

> x <- read.delim("GSE22247_non-normalized_data.txt.gz",skip=4,row.names=1)
> head(x)
              SAMPLE.1  SAMPLE.2  SAMPLE.3  SAMPLE.4  SAMPLE.5  SAMPLE.6
ILMN_1802380 777.23058 926.97896 840.97853 876.01468 734.75489 760.25903
ILMN_1893287  92.23149  98.52583  91.20370  90.50574  98.44884  95.50648
ILMN_1736104  96.12420  98.84145  99.58869  81.52061  90.06423  89.32391
ILMN_1792389 100.70701  93.31579  83.88396  95.93778 100.07544  91.27975
ILMN_1854015 118.74662 135.72634 144.12319 132.67514 124.95101 116.79923
ILMN_1904757 118.53504 103.33229 113.08138 101.46785  99.92867 107.17438
              SAMPLE.7  SAMPLE.8  SAMPLE.9 SAMPLE.10 SAMPLE.11 SAMPLE.12
ILMN_1802380 828.14566 887.05053 837.30327 765.20943 795.32554 758.55978
ILMN_1893287  88.17477  83.90320  96.40731  78.44412  90.09104  78.96416
ILMN_1736104  99.04910  98.37767  91.54610  91.32053  75.04575  81.30322
ILMN_1792389  93.73408 102.13628  99.34333 103.33994 101.22303  89.63655
ILMN_1854015 107.82091 125.98380 127.82920 142.06466 114.14095 101.28099
ILMN_1904757  96.30861 104.44577  97.53158  93.76759 101.18880  96.99833
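
Since GEO files differ from one submission to the next, it can help to look at the top of a file before choosing probeid, expr or skip. The following is only a minimal sketch of that idea in base R. It runs on a small made-up file written to a temporary location (the probe names and values are invented for illustration and are not the GSE22247 data), and it assumes that the column header row starts with "ID_REF", which is common on GEO but not guaranteed.

```r
# Toy stand-in for a GEO "non-normalized" file: a few comment lines followed
# by a tab-delimited table. All contents here are invented for illustration.
toy <- tempfile(fileext = ".txt.gz")
con <- gzfile(toy, "w")
writeLines(c(
  "# Toy comment line 1",
  "# Toy comment line 2",
  "ID_REF\tSAMPLE 1\tSAMPLE 2",
  "probe_A\t10.5\t11.5",
  "probe_B\t20.0\t21.0"
), con)
close(con)

# readLines() reads gzip-compressed files transparently
top <- readLines(toy, n = 20)

# Locate the column header row (assumed here to begin with "ID_REF")
header_line <- grep("^ID_REF\t", top)[1]

# The column names suggest what to pass as probeid and expr to read.ilmn
column_names <- strsplit(top[header_line], "\t", fixed = TRUE)[[1]]
print(column_names)

# Skip everything above the header row instead of hard-coding the number
x <- read.delim(toy, skip = header_line - 1, row.names = 1)

# Numeric matrix of expression values with probe IDs as row names
E <- as.matrix(x)
print(E)
```

For a real file you would replace toy with the path to the downloaded file. If the header row turns out not to start with "ID_REF", the printed lines in top should show what to search for instead.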