jianwang2010
Dear GEOquery users/developers:
I encountered an error while using the GEOquery package to download a GSE series from the GEO database. I searched online and found similar error messages posted by others, but no suitable solution seems to be available.
Any help is greatly appreciated.
This is the R code:
===================
library(Biobase)
library(GEOquery)
gse <- "GSE49279"
gset <- try(getGEO(gse, GSEMatrix = TRUE, getGPL = FALSE))
The error message:
=================
Setting options('download.file.method.GEOquery'='curl')
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE9nnn/GSE9105/matrix/
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 6 elements
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE49nnn/GSE49279/matrix/
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 6 elements
Here is my sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US
[4] LC_COLLATE=en_US LC_MONETARY=en_US LC_MESSAGES=en_US
[7] LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GEOquery_2.32.0 RCurl_1.95-4.3 Biobase_2.26.0
[4] BiocGenerics_0.12.0 nza_3.0.1 rpart_4.1-8
[7] arules_1.1-5 Matrix_1.1-4 XML_3.98-1.1
[10] ca_0.55 MASS_7.3-35 e1071_1.6-4
[13] tree_1.0-35 nzr_3.0.1 bitops_1.0-6
[16] RODBC_1.3-10

This GSE Series Matrix file has no data in it, so calling getGEO on the GSEMatrix file returns only the annotation.
This may be because the series is an RNA-Seq analysis. There doesn't appear to be any data in the matrix.txt.gz file; it only has pointers to the SRA.
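The check described above can be sketched roughly as follows. This is only an illustration, assuming GEOquery and Biobase are installed and the NCBI server is reachable; `has_expression_data` and `fetch_series` are hypothetical helper names, not part of GEOquery. Older GEOquery versions may raise the error before the check is ever reached.

```r
# Hypothetical helper (not part of GEOquery): TRUE when an expression
# matrix has at least one row of values, FALSE when it is empty.
has_expression_data <- function(m) {
  nrow(m) > 0
}

# Sketch of a fallback. It is defined as a function so that nothing is
# downloaded until it is called, e.g. fetch_series("GSE49279").
fetch_series <- function(gse) {
  gset <- GEOquery::getGEO(gse, GSEMatrix = TRUE, getGPL = FALSE)
  eset <- gset[[1]]
  if (has_expression_data(Biobase::exprs(eset))) {
    return(eset)
  }
  # The series matrix holds annotation only, so fetch the supplementary
  # files, which are saved in a directory named after the accession.
  GEOquery::getGEOSuppFiles(gse)
}
```

Keeping the emptiness test in its own small function makes it easy to reuse on any matrix returned by `exprs()`.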
Recent versions of GEOquery handle this case and shouldn't raise an error, though I need to check when I made that change. In any case, updating to the newest version of R is likely a good idea.
> z <- getGEO("GSE49279")
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE49nnn/GSE49279/matrix/
Found 1 file(s)
GSE49279_series_matrix.txt.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7275  100  7275    0     0   4959      0  0:00:01  0:00:01 --:--:--  4962
File stored at:
/data3/tmp/RtmpoaoxiB/GPL11154.soft
> head(exprs(z[[1]]))
     GSM1196555 GSM1196556 GSM1196557 GSM1196558 GSM1196559 GSM1196560
     GSM1196561 GSM1196562 GSM1196563 GSM1196564 GSM1196565 GSM1196566
     GSM1196567 GSM1196568 GSM1196569 GSM1196570 GSM1196571 GSM1196572
     GSM1196573 GSM1196574 GSM1196575 GSM1196576 GSM1196577 GSM1196578
     GSM1196579 GSM1196580 GSM1196581 GSM1196582 GSM1196583 GSM1196584
     GSM1196585 GSM1196586 GSM1196587 GSM1196588 GSM1196589 GSM1196590
     GSM1196591 GSM1196592 GSM1196593 GSM1196594 GSM1196595 GSM1196596
     GSM1196597 GSM1196598 GSM1196599 GSM1196600 GSM1196601 GSM1196602
     GSM1196603 GSM1196604 GSM1196605 GSM1196606 GSM1196607 GSM1196608
     GSM1196609 GSM1196610 GSM1196611 GSM1196612 GSM1196613 GSM1196614
     GSM1196615 GSM1196616 GSM1196617 GSM1196618 GSM1196619 GSM1196620
     GSM1196621 GSM1196622 GSM1196623 GSM1196624 GSM1196625 GSM1196626
     GSM1196627 GSM1196628 GSM1196629 GSM1196630 GSM1196631 GSM1196632
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GEOquery_2.34.0 Biobase_2.28.0 BiocGenerics_0.14.0
loaded via a namespace (and not attached):
[1] tools_3.2.0 RCurl_1.95-4.5 bitops_1.0-6 XML_3.98-1.1
> z2 <- getGEOSuppFiles("GSE49279")
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE49nnn/GSE49279/suppl/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  117k  100  117k    0     0  62190      0  0:00:01  0:00:01 --:--:-- 62190
> zz <- read.delim("GSE49279/GSE49279_miRNAseq-processed_data.txt.gz")
> head(zz)
            miRNA   ACC1   ACC2  ACC4   ACC5   ACC6   ACC8  ACC9  ACC10 ACC11
1 hsa-let-7a-2-3p      0      4     0      6      3      8     0      0     2
2   hsa-let-7a-3p    409    211    42    709    276    235   236    154   141
3   hsa-let-7a-5p 319828 296528 81039 347538 154576 161409 79395 120927 97828
4   hsa-let-7b-3p   1095    496   164    482    635    573   198    184   200
5   hsa-let-7b-5p  37840  46670  9304  78156  16937  14733  7407   5467 10759
6      hsa-let-7c   4326  17294  8150  16537   7958   6738  1377   4102  1409
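Once the processed counts are read in, it is often convenient to hold them as a numeric matrix with the miRNA names as row names. The following is a minimal sketch in base R; `counts_to_matrix` is a hypothetical helper, and the small example data frame just copies two rows and two columns from the head(zz) output above rather than reading the real file.

```r
# Hypothetical helper: turn a count table whose id column holds feature
# names into a numeric matrix with those names as row names.
counts_to_matrix <- function(df, id_col = "miRNA") {
  m <- as.matrix(df[, setdiff(names(df), id_col), drop = FALSE])
  storage.mode(m) <- "numeric"
  rownames(m) <- as.character(df[[id_col]])
  m
}

# Two rows and two sample columns taken from the head(zz) output,
# used here only to show the shape of the result.
zz_example <- data.frame(
  miRNA = c("hsa-let-7a-3p", "hsa-let-7c"),
  ACC1 = c(409, 4326),
  ACC2 = c(211, 17294),
  stringsAsFactors = FALSE
)
counts <- counts_to_matrix(zz_example)
dim(counts)
```

With the real data this would be called as counts_to_matrix(zz), assuming the first column is named miRNA as in the output shown.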