Incomplete dataset download from GEO
pathreskoo • 0
@pathreskoo-7097
Last seen 9.4 years ago
Taiwan

I downloaded a dataset with:

geoq <- getGEO("GSE9514")

At the end, this warning message is shown:

Warning message:
In readLines(fname) :
  incomplete final line found on '/var/folders/k2/kdrnsbws5gz8vrt83yjmlbdm0000gn/T//Rtmp6T1Fwv/GSE9514_series_matrix.txt.gz'

The dataset I downloaded is 9.2 MB, but the teaching video (edX PH525x, Data Analysis for Genomics) shows it as 9.9 MB.

Also, when I inspect the dataset with dim(e) (after e <- geoq[[1]]), it has only 4370 features, whereas the video shows 9335. pData(e)$data_row_count also reports 9335 rows for every sample, so the dataset I downloaded appears to be truncated.

How can I solve this problem?


> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GEOquery_2.28.0      GenomicRanges_1.14.4 XVector_0.2.0       
[4] IRanges_1.20.7       Biobase_2.22.0       BiocGenerics_0.8.0  

loaded via a namespace (and not attached):
[1] RCurl_1.95-4.3 stats4_3.0.2   tools_3.0.2    XML_3.95-0.2  

geoquery • 1.2k views
@sean-davis-490
Last seen 3 months ago
United States

This works for me. I'd suggest retrying the download. Note that GEOquery automatically reuses the locally cached file after the first download unless you restart R, remove the cached file, or specify a new destdir.
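A minimal sketch of forcing a fresh download, assuming the truncated file is still sitting in the download directory. `refetch_gse` and the `geo_fresh` directory name are made up for illustration; `getGEO()` and its `destdir` argument come from GEOquery.

```r
# Hypothetical helper: download a GSE into a new, empty directory so that
# no previously cached (possibly truncated) file can be reused.
refetch_gse <- function(accession, dir = file.path(tempdir(), "geo_fresh")) {
  unlink(dir, recursive = TRUE)               # drop any earlier copy
  dir.create(dir, recursive = TRUE)
  GEOquery::getGEO(accession, destdir = dir)  # list of ExpressionSets for a GSE
}

# Usage (needs network access and the GEOquery package):
# e <- refetch_gse("GSE9514")[[1]]
# dim(e)
```

Pointing `destdir` at an empty directory has the same effect as deleting the cached file by hand, without having to track down the temporary path from the warning message.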

> geoq <- getGEO("GSE9514")[[1]]
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE9nnn/GSE9514/matrix/
Found 1 file(s)
GSE9514_series_matrix.txt.gz
trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE9nnn/GSE9514/matrix/GSE9514_series_matrix.txt.gz'
ftp data connection made, file length 387103 bytes
opened URL
==================================================
downloaded 378 KB

File stored at:
/var/folders/21/b_rp6qyj1_b1j5cp8qby0tnr0000gn/T//RtmpS1kI6A/GPL90.soft

> geoq
ExpressionSet (storageMode: lockedEnvironment)
assayData: 9335 features, 8 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM241146 GSM241147 ... GSM241153 (8 total)
  varLabels: title geo_accession ... data_row_count (31 total)
  varMetadata: labelDescription
featureData
  featureNames: 10000_at 10001_at ... AFFX-YFL039CM_at (9335 total)
  fvarLabels: ID ORF ... Gene Ontology Molecular Function (17 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL90 
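
To check a downloaded file directly rather than the parsed object, one option is to count the rows between the `!series_matrix_table_begin` and `!series_matrix_table_end` markers that GEO series matrix files contain; a truncated file will be missing the end marker. A rough sketch, where `count_matrix_rows` is a hypothetical helper and not part of GEOquery:

```r
# Hypothetical helper: count data rows in a GEO series matrix file and
# report whether the closing table marker is present.
count_matrix_rows <- function(path) {
  con <- gzfile(path, "r")   # gzfile() also reads uncompressed files
  on.exit(close(con))
  lines <- suppressWarnings(readLines(con))
  begin <- grep("^!series_matrix_table_begin", lines)
  end   <- grep("^!series_matrix_table_end", lines)
  if (length(begin) != 1) stop("table begin marker not found")
  complete <- length(end) == 1
  last <- if (complete) end - 1 else length(lines)
  # Rows after the begin marker, minus one for the column header line.
  list(rows = max(last - begin - 1, 0), complete = complete)
}

# Usage, assuming the file is in the default download directory:
# count_matrix_rows(file.path(tempdir(), "GSE9514_series_matrix.txt.gz"))
```

If `complete` is `FALSE`, or `rows` is below the `data_row_count` reported in `pData()` (9335 in this thread), the file on disk is incomplete and should be downloaded again.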