GEOquery: Diagnosing SOFT file error
@supermanrocketboy-11690

I've been successfully using GEOquery to download GEO datasets. I came across one (GSE56046) whose SOFT file seemed too small for the number of samples, and I get a lot of downstream errors (see below).

Before contacting the submitter, can anyone spot what the issue is?

This command downloads data for 1203 samples surprisingly quickly and yields a 61 Mb dataset, which seems too small.

library(GEOquery)
gse <- getGEO("GSE56046", GSEMatrix = FALSE)

This also results in warning messages:

Warning messages:
1: In readLines(con, n = chunksize) :
  seek on a gzfile connection returned an internal error

There is also something fishy about the S4 object:

dim(pData(gse[[1]]))

Error in gse[[1]] : this S4 class is not subsettable

@sean-davis-490

The "not subsettable" error you are getting comes from specifying GSEMatrix=F. With that setting, getGEO returns a single GSE object rather than a list, so gse[[1]] is not valid; its contents are reached through accessors such as Meta(gse), GSMList(gse), and GPLList(gse). If you use the default instead, you will get a list of ExpressionSets. Unfortunately, in this case that will not help much with getting the actual expression values: this GSE has only the metadata attached to the GSE files. To get the actual data, you'll need to download the supplemental files and then parse them manually.

getGEOSuppFiles("GSE56046")


Combined with getting the GSE metadata using:

gse = getGEO("GSE56046")


you may be able to construct an ExpressionSet or SummarizedExperiment. Unfortunately, when GEO has records like this that supply data only as supplemental files rather than as part of the record, GEOquery does not attempt to "guess" what the submitters had in mind.
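As a rough illustration of the manual step, here is a minimal sketch of combining a parsed supplemental matrix with the series metadata. The helper name build_eset is hypothetical, and the code assumes that the supplemental file parses to a features-by-samples matrix whose column names match the row names of the phenotype table. That will often not hold as downloaded, so expect to map column names to GSM IDs first.

```r
library(GEOquery)
library(Biobase)

# Hypothetical helper (not part of GEOquery): align an expression matrix
# (features x samples) with a phenotype table and wrap both in an ExpressionSet.
# Assumes colnames(expr) and rownames(pheno) use the same sample identifiers.
build_eset <- function(expr, pheno) {
  shared <- intersect(colnames(expr), rownames(pheno))
  stopifnot(length(shared) > 0)  # fail early if no identifiers match
  ExpressionSet(
    assayData = expr[, shared, drop = FALSE],
    phenoData = AnnotatedDataFrame(pheno[shared, , drop = FALSE])
  )
}

# Usage sketch (needs network access; the file layout is an assumption):
# supp  <- getGEOSuppFiles("GSE56046")     # data.frame; row names are file paths
# rownames(supp)                           # inspect what was downloaded
# pheno <- pData(getGEO("GSE56046")[[1]])  # sample metadata from the series matrix
# expr  <- as.matrix(read.delim(rownames(supp)[1], row.names = 1))
# eset  <- build_eset(expr, pheno)
```

Samples present in only one of the two inputs are dropped silently, so it is worth comparing ncol(expr) and nrow(pheno) against the result before relying on it.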