Dear Community,
Because the raw data for a specific Affymetrix HTA 2.0 dataset in GEO (GSE88884) are very large, I used the following small code chunk to download the processed data instead:
library(GEOquery)
gseList = getGEO("GSE88884")
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE88nnn/GSE88884/matrix/
OK
Found 1 file(s)
GSE88884_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE88nnn/GSE88884/matrix/GSE88884_series_matrix.txt.gz'
Content type 'application/x-gzip' length 103966 bytes (101 KB)
downloaded 101 KB
File stored at:
C:\Users\EFSTAT~1\AppData\Local\Temp\RtmpEPUvTf/GPL17586.soft
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
not all columns named in 'colClasses' exist
gse = gseList[[1]]
gse
ExpressionSet (storageMode: lockedEnvironment)
assayData: 0 features, 1820 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM2350873 GSM2350874 ... GSM2352692 (1820 total)
  varLabels: title geo_accession ... relation (48 total)
  varMetadata: labelDescription
featureData
  featureNames:
  fvarLabels: ID probeset_id ... SPOT_ID (15 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL17586
head(exprs(gse))
GSM2350873 GSM2350874 GSM2350875 GSM2350876 GSM2350877 GSM2350878 GSM2350879 GSM2350880
GSM2350881 GSM2350882 GSM2350883 GSM2350884 GSM2350885 GSM2350886 GSM2350887 GSM2350888
GSM2350889 GSM2350890 GSM2350891 GSM2350892 GSM2350893 GSM2350894 GSM2350895 GSM2350896
GSM2350897 GSM2350898 GSM2350899 GSM2350900 GSM2350901 GSM2350902.....
sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Greek_Greece.1253 LC_CTYPE=Greek_Greece.1253 LC_MONETARY=Greek_Greece.1253
[4] LC_NUMERIC=C LC_TIME=Greek_Greece.1253
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GEOquery_2.40.0 Biobase_2.34.0 BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] httr_1.2.1 R6_2.2.0 tools_3.3.1 RCurl_1.95-4.8 knitr_1.15.1 bitops_1.0-6
[7] XML_3.98-1.5
So what is causing this strange result, in which the object contains no genes/probesets and no expression values?
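One way to check whether the series matrix file itself carries any expression values is to count the rows between its `!series_matrix_table_begin` and `!series_matrix_table_end` markers. The following is only a minimal sketch in base R: the helper name `count_matrix_rows` is hypothetical, and the small demo file is made up purely for illustration. For the real data, the path would be wherever getGEO() stored GSE88884_series_matrix.txt.gz (by default, as far as I know, the session's tempdir()).

```r
# Hypothetical helper: count the data rows of a GEO series matrix file.
# readLines() can read gzip-compressed files directly, so a .txt.gz path works too.
count_matrix_rows <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  begin <- which(lines == "!series_matrix_table_begin")
  end   <- which(lines == "!series_matrix_table_end")
  stopifnot(length(begin) == 1, length(end) == 1, end > begin)
  # Between the markers there is one header line (ID_REF plus sample IDs)
  # followed by the data rows, so subtract the header and both markers.
  max(end - begin - 2L, 0L)
}

# Made-up demo file: metadata plus a header line, but no probeset rows.
demo <- tempfile(fileext = ".txt")
writeLines(c(
  "!Series_geo_accession\t\"GSE88884\"",
  "!series_matrix_table_begin",
  "\"ID_REF\"\t\"GSM2350873\"\t\"GSM2350874\"",
  "!series_matrix_table_end"
), demo)

count_matrix_rows(demo)  # 0 data rows for this demo file
```

If the real file also gives 0, the empty ExpressionSet simply reflects what is in the series matrix, and the expression values have to come from somewhere else.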
Dear Sean,
Thank you for your comprehensive answer! I usually download the raw files, but this time, because of the very large size of the raw data, I "naively" noticed the file GSE88884_ILLUMINATE1and2_SLEbaselineVsHealthy_preprocessed.txt.gz listed at this link (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE88884)
and I thought that getGEO() would download the processed data, which led me to the above problem.
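For reference, here is a minimal sketch of how that preprocessed file could be fetched directly instead of through getGEO(). It assumes that the supplementary files of a series sit in a `suppl/` directory next to the `matrix/` directory shown in the download log above, and that the file is tab-delimited; I have not verified either for this series. The helper name `geo_suppl_url` is hypothetical.

```r
# Hypothetical helper: build the URL of a file in a series' GEO "suppl" directory.
# Assumption: the layout mirrors the matrix URL printed by getGEO().
geo_suppl_url <- function(accession, file_name = "") {
  stopifnot(grepl("^GSE[0-9]+$", accession))
  # GEO groups series into folders by replacing the last three digits with "nnn".
  stub <- sub("[0-9]{1,3}$", "nnn", accession)
  sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/suppl/%s",
          stub, accession, file_name)
}

url <- geo_suppl_url(
  "GSE88884",
  "GSE88884_ILLUMINATE1and2_SLEbaselineVsHealthy_preprocessed.txt.gz"
)

# Only download in an interactive session; the file may be large.
if (interactive()) {
  dest <- file.path(tempdir(), basename(url))
  download.file(url, dest, mode = "wb")
  # Assumption: tab-delimited text. Read a few rows first to inspect the layout.
  preview <- read.delim(dest, nrows = 5, check.names = FALSE)
  str(preview)
}
```

GEOquery also provides getGEOSuppFiles("GSE88884"), but as far as I know it fetches everything in the supplementary directory by default, which would include the large raw archive I was trying to avoid.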