GEOquery: Diagnosing SOFT file error

I've been successfully using GEOquery to download GEO datasets. I came across one (GSE56046) whose SOFT file seemed too small for the number of samples, and I get a lot of downstream errors (see below). 


Before contacting the submitter, can anyone spot what the issue is?  


This command downloads the data for 1203 samples surprisingly fast and produces a 61 MB dataset, which is suspiciously small for that many samples.

library(GEOquery)
gse = getGEO("GSE56046", GSEMatrix = FALSE)


This also results in warning messages:

Warning messages:
1: In readLines(con, n = chunksize) :
  seek on a gzfile connection returned an internal error


Indexing the returned S4 object also fails:


Error in gse[[1]] : this S4 class is not subsettable




The error comes from specifying GSEMatrix=F. With that setting, getGEO returns a single GSE object rather than a list, so gse[[1]] fails; its contents are reached through accessors such as GSMList(gse) and Meta(gse). If you use the default (GSEMatrix=TRUE) instead, you will get a list of ExpressionSets. Unfortunately, in this case that will not give you the actual expression values either: this GSE has only metadata attached to the GSE records. To get the actual data, you'll need to download the supplemental files and then parse them manually.
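A minimal sketch of both points, assuming a recent GEOquery release (the `fetch_files` argument of `getGEOSuppFiles()` is not available in older versions). The helper names `summarize_gse` and `list_supp_files` are made up for illustration and are not part of GEOquery.

```r
library(GEOquery)

# Summarize a GSE object returned by getGEO(..., GSEMatrix = FALSE).
# Such an object is a single S4 instance, so gse[[1]] fails; the
# accessors GSMList(), GPLList() and Meta() are the way in.
# (summarize_gse is a hypothetical helper, not a GEOquery function.)
summarize_gse <- function(gse) {
  list(
    class       = class(gse)[1],
    n_samples   = length(GSMList(gse)),
    n_platforms = length(GPLList(gse)),
    meta_fields = names(Meta(gse))
  )
}

# List the supplementary files of a series without downloading them.
# (list_supp_files is a hypothetical wrapper; fetch_files = FALSE
# assumes a recent GEOquery version.)
list_supp_files <- function(accession) {
  getGEOSuppFiles(accession, fetch_files = FALSE)
}
```

For example, `summarize_gse(getGEO("GSE56046", GSEMatrix = FALSE))` reports how many GSM records were parsed, and `list_supp_files("GSE56046")` shows what is available before committing to a download. Calling `getGEOSuppFiles("GSE56046")` with the defaults then downloads the files into a directory named after the accession.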


If you combine the parsed supplemental data with the GSE metadata, which you can get using:

gse = getGEO("GSE56046")

you may be able to construct an ExpressionSet or SummarizedExperiment. Unfortunately, when GEO records like this one supply their data only as supplemental files rather than as part of the record, GEOquery does not attempt to "guess" what the submitters had in mind.
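The assembly step could look roughly like the sketch below. It uses toy data in place of the real files, and `build_eset` is a hypothetical helper, not a GEOquery or Biobase function. It assumes the parsed expression matrix has one column per sample and that the phenotype table's row names use the same sample identifiers.

```r
library(Biobase)

# Build an ExpressionSet from a parsed expression matrix and a phenotype
# table, keeping only the samples present in both and putting them in
# the same order. (build_eset is a hypothetical helper for illustration.)
build_eset <- function(expr, pheno) {
  shared <- intersect(colnames(expr), rownames(pheno))
  if (length(shared) == 0) stop("no sample identifiers in common")
  ExpressionSet(
    assayData = expr[, shared, drop = FALSE],
    phenoData = AnnotatedDataFrame(pheno[shared, , drop = FALSE])
  )
}

# Toy data standing in for a parsed supplemental file and the metadata.
expr <- matrix(as.numeric(1:6), nrow = 2,
               dimnames = list(c("probe1", "probe2"), c("S1", "S2", "S3")))
pheno <- data.frame(tissue = c("a", "b"), row.names = c("S2", "S1"))

eset <- build_eset(expr, pheno)
sampleNames(eset)  # samples shared by both inputs, in matrix column order
```

With the real series, `pheno` would be something like `pData(gse[[1]])`, and `expr` would come from whatever the supplemental files contain. Their column names may not be GSM accessions, in which case a mapping step built from the sample metadata is needed before the two can be aligned.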

