Error while reading ArrayExpress dataset


I am trying to access a dataset from ArrayExpress with the ID "E-GEOD-33675", but I keep getting an error right after it tries to read pheno data from the sdrf file. Is this because I am doing something wrong (I am new to Bioconductor) or is it an issue with the dataset itself (and if so, how can I fix this)? The error message is "Error in .subset2(x, i, exact = exact) : subscript out of bounds", shown below in context. I seem to be able to fetch a few other datasets that I tried without any problem. Any help/suggestions/comments much appreciated. Thanks for your time.

> AEdata <- ArrayExpress("E-GEOD-33675")
trying URL ''
Content type 'text/plain' length 24270 bytes (23 KB)
downloaded 23 KB

trying URL ''
Content type 'text/plain' length 20615 bytes (20 KB)
downloaded 20 KB

trying URL ''
Content type 'text/plain' length 6945 bytes
downloaded 6945 bytes

Copying raw data files

trying URL ''
Content type 'application/zip' length 2769721 bytes (2.6 MB)
downloaded 2.6 MB

Unpacking data files
ArrayExpress: Reading pheno data from SDRF
Error in .subset2(x, i, exact = exact) : subscript out of bounds


Srinivasa Rao


The error occurs in readPhenoData(), which doesn't seem to account for the fact that these are two-color arrays when it assigns row.names to the phenoData object.
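One way to check that diagnosis is to look at the SDRF directly. The sketch below is only an illustration: it assumes that getAE() returns a list whose `path` and `sdrf` elements locate the downloaded SDRF file (this may differ between ArrayExpress versions), and that the SDRF has an "Array Data File" column, which read.delim() renames to `Array.Data.File`. The helper name `rows_per_file()` is made up for this example.

```r
# Count how many SDRF rows refer to each raw data file.
# For two-color arrays the SDRF typically has one row per channel, so each
# file name appears twice, and duplicated names cannot serve as row.names.
rows_per_file <- function(sdrf, column = "Array.Data.File") {
  if (!column %in% names(sdrf)) {
    stop("column '", column, "' not found in the SDRF")
  }
  table(sdrf[[column]])
}

if (interactive()) {
  # Download the files only; no Bioconductor object is built here.
  files <- ArrayExpress::getAE("E-GEOD-33675", type = "raw")
  # Assumption: files$path and files$sdrf point at the SDRF on disk.
  sdrf <- read.delim(file.path(files$path, files$sdrf),
                     stringsAsFactors = FALSE)
  counts <- rows_per_file(sdrf)
  # A count of 2 for every file would be consistent with one row per channel.
  print(table(counts))
}
```

If every file shows up twice, that would fit the explanation above, though it does not by itself prove where readPhenoData() goes wrong.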

Anyway, the same dataset is on GEO as GSE33675, and getGEO() reads it without a problem:

> library(GEOquery)
Setting options('download.file.method.GEOquery'='curl')
> dat <- getGEO("GSE33675")
Found 1 file(s)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 34040  100 34040    0     0  23156      0  0:00:01  0:00:01 --:--:-- 23156
File stored at:
> dat[[1]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 905 features, 28 samples
  element names: exprs
protocolData: none
  sampleNames: GSM832644 GSM832645 ... GSM832671 (28 total)
  varLabels: title geo_accession ... data_row_count (30 total)
  varMetadata: labelDescription
  featureNames: a-PUC2MM2d a-PUC2PM ... PUC2PM-20B (905 total)
  fvarLabels: ID miRNA_ID SPOT_ID
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL14799

If you would rather process the data yourself, you can download the raw files with either getGEOSuppFiles() from GEOquery or getAE() from ArrayExpress, and then read and process them by hand with, e.g., limma.
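A rough sketch of the GEOquery route follows. It only downloads and unpacks the supplementary files; the function names `fetch_geo_raw()` and `list_raw_files()` are made up for this example, and the assumption that GEO ships a .tar archive for this series has not been checked. The limma call is left as a comment because the right `source` value depends on the scanner software that produced the files, which is not stated in this thread.

```r
# Unpack any .tar archives found in `dir` and return the remaining files.
list_raw_files <- function(dir) {
  tars <- list.files(dir, pattern = "\\.tar$", full.names = TRUE)
  for (tarball in tars) utils::untar(tarball, exdir = dir)
  files <- list.files(dir, full.names = TRUE)
  files[!grepl("\\.tar$", files)]
}

# Download the supplementary files for a GEO series and list them.
fetch_geo_raw <- function(accession, dest = tempdir()) {
  if (!requireNamespace("GEOquery", quietly = TRUE)) {
    stop("GEOquery is not installed")
  }
  # getGEOSuppFiles() creates <dest>/<accession>/ and saves the files there.
  GEOquery::getGEOSuppFiles(accession, baseDir = dest)
  list_raw_files(file.path(dest, accession))
}

if (interactive()) {
  raw_files <- fetch_geo_raw("GSE33675")
  print(basename(raw_files))
  # Assumption: the file format below is a guess; check the files first.
  # RG <- limma::read.maimages(raw_files, source = "genepix")
}
```

Once you know the file format, read.maimages() gives you an RGList that you can background-correct and normalize with the usual limma functions.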




