GEOquery Error
ying chen ▴ 340
@ying-chen-5085
Last seen 10.2 years ago
Hi,

I want to use the GEOquery package to get the raw files of a lot of GEO datasets at once:

> files <- sapply(mydata$GSE_ID, getGEOSuppFiles)

but I got the following error message when I did a simple test run. Any suggestions?

Thanks a lot!

Ying

> library(GEOquery)
Loading required package: Biobase
Welcome to Bioconductor
  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.
Setting options('download.file.method.GEOquery'='curl')
> files <- getGEOSuppFiles("GSE23720")
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE23720/"
Error in function (type, msg, asError = TRUE) :
  Server denied you to change to the given directory
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GEOquery_2.20.8 Biobase_2.14.0

loaded via a namespace (and not attached):
[1] RCurl_1.9-5 XML_3.9-4
GEOquery GEOquery • 3.3k views
@sean-davis-490
Last seen 3 months ago
United States
On Thu, Feb 2, 2012 at 12:38 PM, ying chen wrote:
> Hi,
>
> I want to use GEOquery package to get the raw files of a lot GEO datasets at once ( > files <- sapply(mydata$GSE_ID, getGEOSuppFiles) ), but I got the following error message when I did a simple test run. Any suggestion?

Hi, Ying.

This is not a GEOquery issue. The directory housing the data is not on the FTP site. NCBI GEO periodically rebuilds stuff on the site. That might be occurring now. I'd suggest emailing NCBI GEO directly if you are in a hurry. Alternatively, wait an hour or two to see if the problem is resolved.

Sean

> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
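The advice to wait and try again can be automated. The following is only a minimal sketch, not part of GEOquery: `with_retry` is a hypothetical helper that calls a function and, if it throws an error, sleeps and tries again up to a fixed number of attempts. The attempt count and wait time are arbitrary assumptions.

```r
# Hypothetical helper (not part of GEOquery): call f(), retrying on error.
with_retry <- function(f, attempts = 3, wait_seconds = 60) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    # Capture an error as a value instead of letting it abort the loop.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) Sys.sleep(wait_seconds)
  }
  # All attempts failed: re-signal the last error.
  stop(last_error)
}

# Usage (requires GEOquery and network access, so it is left commented out):
# files <- with_retry(function() GEOquery::getGEOSuppFiles("GSE23720"))
```

Wrapping the call in an anonymous function keeps the helper independent of any particular download function.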
Hi Sean,

Thanks a lot for the suggestion. I just tried a simple test (> files <- getGEOSuppFiles("GSE23720")) and the problem is gone.

But when I tried to get a lot of files at once, I got the following error message:

> gseids
  [1] GSE17907 GSE30010 GSE12790 GSE20711 GSE28821 GSE18864 GSE9195  GSE29431
  [9] GSE14020 GSE7904  GSE18728 GSE15181 GSE16391 GSE12777 GSE23593 GSE22035
 [17] GSE19383 GSE10281 GSE21217 GSE29672 GSE14986 GSE15026 GSE12763 GSE11001
 [25] GSE14017 GSE22513 GSE7515  GSE28796 GSE26910 GSE23994 GSE19639 GSE19697
 [33] GSE15477 GSE10270 GSE3893  GSE13787 GSE11078 GSE8977  GSE21834 GSE6885
 [41] GSE24468 GSE20266 GSE21422 GSE3156  GSE22250 GSE18571 GSE11352 GSE7382
 [49] GSE13806 GSE8565  GSE15619 GSE8597  GSE29832 GSE11791 GSE5102  GSE28645
 [57] GSE32160 GSE28789 GSE18331 GSE23640 GSE23399 GSE9086  GSE22865 GSE26298
 [65] GSE15893 GSE20086 GSE11324 GSE5116  GSE10879 GSE25407 GSE7700  GSE18912
 [73] GSE15043 GSE27515 GSE19777 GSE21832 GSE18070 GSE11506 GSE23921 GSE23905
 [81] GSE32158 GSE28305 GSE25162 GSE28415 GSE9015  GSE6800  GSE6548  GSE32161
 [89] GSE24249 GSE30775 GSE26884 GSE24473 GSE20719 GSE17636 GSE18773 GSE18931
 [97] GSE18146 GSE16070 GSE16080 GSE11683 GSE10046 GSE9747  GSE15749 GSE22664
[105] GSE21066 GSE9586  GSE17832 GSE11330 GSE17889 GSE12199 GSE28089 GSE31448
[113] GSE10810 GSE9196  GSE22840 GSE33658 GSE25487 GSE22544 GSE27220 GSE11581
120 Levels: GSE10046 GSE10270 GSE10281 GSE10810 GSE10879 GSE11001 ... GSE9747
> files <- sapply(gseids,getGEOSuppFiles,makeDirectory = TRUE, baseDir = getwd()
+ )
Error in dir.create(GEO) : invalid 'path' argument
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE17907/"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to create the file
Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE17907/GSE17907_RA
Warning: W.tar: No such file or directory
  0  328M    0  2896    0     0   3027      0 31:34:35 --:--:-- 31:34:35  3415
curl: (23) Failed writing body (0 != 2896)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to create the file
Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE17907/filelist.tx
Warning: t: No such file or directory
 24  5979   24  1448    0     0   2495      0  0:00:02 --:--:--  0:00:02  3061
curl: (23) Failed writing body (0 != 1448)
Error in dir.create(GEO) : invalid 'path' argument
In addition: Warning messages:
1: In download.file(file.path(url, i), destfile = file.path(storedir,  :
  download had nonzero exit status
2: In download.file(file.path(url, i), destfile = file.path(storedir,  :
  download had nonzero exit status
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010/"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to create the file
Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE30010/GSE30010_RA
Warning: W.tar: No such file or directory
  0  576M    0  2896    0     0   5191      0 32:22:29 --:--:-- 32:22:29  6464
curl: (23) Failed writing body (0 != 2896)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to create the file
Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE30010/GSE30010_di
Warning: scovery_clinical_info.txt.gz: No such file or directory
 81  1785   81  1448    0     0   3009      0 --:--:-- --:--:-- --:--:--  3506
 81  1785   81  1448    0     0   1978      0 --:--:-- --:--:-- --:--:--  1978
curl: (23) Failed writing body (0 != 1448)

After I killed this job and tried:

> file <- getGEOSuppFiles("GSE17907")

I had no problem at all.

I really do not know what's wrong with the sapply() setting.

Any suggestions?

Thanks a lot for the help!

Ying
On Thu, Feb 2, 2012 at 11:37 PM, ying chen wrote:
> But when I tried to get a lot files at once, I got the following error
> message:

It is hard to tell for sure, but I think you might be out of disk space locally. When you get the error, check to see if you have space left on the device to which you are saving. GEOquery should work fine in a loop like this.

Sean
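A loop of the kind described here might be sketched as follows. This is an illustration under assumptions rather than a confirmed fix: `fetch_all` is a hypothetical wrapper, not a GEOquery function. It converts the IDs to a plain character vector as a precaution (the `gseids` printed in the thread is a factor) and catches errors per series, so that one failed download does not abort the whole batch.

```r
# Hypothetical wrapper: download supplementary files for several series,
# recording failures instead of stopping at the first one.
# `fetch` defaults to GEOquery::getGEOSuppFiles; the default is only
# evaluated if no other function is supplied.
fetch_all <- function(ids, fetch = GEOquery::getGEOSuppFiles, ...) {
  ids <- as.character(ids)  # plain strings, even if `ids` is a factor
  results <- lapply(ids, function(id) {
    # Return the error object for this ID rather than aborting the loop.
    tryCatch(fetch(id, ...), error = function(e) e)
  })
  names(results) <- ids
  results
}

# Usage (needs GEOquery and network access, so it is left commented out):
# res <- fetch_all(gseids, baseDir = getwd())
# failed <- names(res)[vapply(res, inherits, logical(1), what = "error")]
```

Keeping the error objects in the result makes it possible to retry only the series that failed.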
Hi Sean,

Thanks a lot for the help. I checked my computer and I still have 253 GB of space left on my hard drive. I tried to retrieve the data over the weekend, but always had the same problem. I just ran it again as a test on 10 GSE IDs. At first it gave some error message, but it finished the first dataset. Then the program complained about the failure to open the destfile, which seems odd to me, as this is the file the program is supposed to download. It now seems that I can download datasets one by one using getGEOSuppFiles, but it always fails when I use sapply with getGEOSuppFiles to download a list of datasets.

Any suggestions?

Thanks a lot for the help!

Ying

> files <- sapply(gseids[1:10],getGEOSuppFiles)
Error in dir.create(GEO) : invalid 'path' argument
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010/"
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_RAW.tar'
ftp data connection made, file length 605009920 bytes
opened URL
downloaded 577.0 Mb

trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_discovery_clinical_info.txt.gz'
ftp data connection made, file length 1785 bytes
opened URL
downloaded 1785 bytes

trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_validation_clinical_info.txt.gz'
ftp data connection made, file length 1681 bytes
opened URL
downloaded 1681 bytes

trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//filelist.txt'
ftp data connection made, file length 5871 bytes
opened URL
downloaded 5871 bytes

Error in dir.create(GEO) : invalid 'path' argument
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE12790/"
Error in download.file(file.path(url, i), destfile = file.path(storedir,  :
  cannot open destfile 'H:/My_DataSets/BreastCancerDataSet/GSE12790/GSE12790_RAW.tar', reason 'No such file or directory'
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GEOquery_2.20.8     Biobase_2.14.0      BiocInstaller_1.2.1

loaded via a namespace (and not attached):
[1] RCurl_1.9-5.1 tools_2.14.0  XML_3.9-4.1
> files <- sapply(gseids[4:10],getGEOSuppFiles)
Error in dir.create(GEO) : invalid 'path' argument
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
Error in function (type, msg, asError = TRUE) :
  Server denied you to change to the given directory
> files <- getGEOSuppFiles('GSE9195')
[1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_RAW.tar'
ftp data connection made, file length 658708480 bytes
opened URL
downloaded 628.2 Mb

trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION.RData'
ftp data connection made, file length 59288200 bytes
opened URL
downloaded 56.5 Mb

trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
Error in download.file(file.path(url, i), destfile = file.path(storedir,  :
  cannot open URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
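With disk space ruled out, one possible explanation, not confirmed anywhere in this thread, is the type of `gseids`. It prints with a `Levels:` line, so it is a factor, and `sapply` over a factor hands each element to the function as a one-element factor rather than a character string. Base R's `dir.create` rejects a non-character path with the same message seen above. The sketch below shows this with base R only. The two IDs are taken from the thread, and the assumption that `getGEOSuppFiles` passes its argument straight to `dir.create` is inferred solely from the error text `dir.create(GEO)`.

```r
ids <- factor(c("GSE17907", "GSE30010"))

# sapply/lapply split a factor into one-element factors, not strings.
element_classes <- sapply(ids, class)
# every entry of element_classes is "factor"

# dir.create() needs a character path; a factor triggers an error
# before anything is created on disk.
msg <- tryCatch(
  dir.create(ids[1]),
  error = function(e) conditionMessage(e)
)
# in an English locale, msg is "invalid 'path' argument"

# Converting first gives ordinary strings that are safe to use as paths.
char_ids <- as.character(ids)
```

If this is what happened, it would also be consistent with the later "No such file or directory" messages, since the per-series directory would never have been created. Passing `as.character(gseids)` to `sapply` would be the corresponding thing to try.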
Hi, I tried to retrieve GEO dataset with the GEOquery package as following: file <- getGEOSuppFiles('GSE10046') But it seems that every raw data file I got by this method is corrupted. For example, when I tried to extract the GSE10046_RAW.tar, I got the following error message: Can not open file "H:\...\GSE10046_RAW.tar" as archive. The GSE10046_RAW.tar I got through GEOquery is 27,433 KB. The same dataset I retrieved from GEO website is 27,350KB and I can extract it with no problem. I had retrieved more than 70 dataset raw files by GEOquery and all have the same problem. Anyone has any suggestion what went wrong? Thanks a lot for the help! Ying > sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-pc-mingw32/x64 (64-bit)locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] GEOquery_2.20.8 Biobase_2.14.0 loaded via a namespace (and not attached): [1] RCurl_1.9-5.1 XML_3.9-4.1 > > From: ying_chen@live.com > To: sdavis2@mail.nih.gov > Date: Mon, 6 Feb 2012 11:35:07 -0500 > CC: bioconductor@r-project.org > Subject: Re: [BioC] GEOquery Error > > > Hi Sean, Thanks a lot for the help. I checked my computer and I still have 253GB space left on my hard drive. I tried to retrieve the data over the weekend, but always had the same problem. I just tried to run it again to test on 10 gse ids. At first it gave some error message, but finished the first dataset. Then the program complained about the failure to open the destfile, which seems odd to me as this is the file the program is supposed to download. Now it seems to me that I can download dataset one by one using getGEOSuppFiles, but it always failed if I tried to use sapply with GetGEOSuppFiles to set up to download a list of datasets. Any suggestion? Thanks a lot for the help! 
> Ying
>
> > files <- sapply(gseids[1:10],getGEOSuppFiles)
> Error in dir.create(GEO) : invalid 'path' argument
> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010/"
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_RAW.tar'
> ftp data connection made, file length 605009920 bytes
> opened URL
> downloaded 577.0 Mb
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_discovery_clinical_info.txt.gz'
> ftp data connection made, file length 1785 bytes
> opened URL
> downloaded 1785 bytes
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_validation_clinical_info.txt.gz'
> ftp data connection made, file length 1681 bytes
> opened URL
> downloaded 1681 bytes
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//filelist.txt'
> ftp data connection made, file length 5871 bytes
> opened URL
> downloaded 5871 bytes
> Error in dir.create(GEO) : invalid 'path' argument
> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE12790/"
> Error in download.file(file.path(url, i), destfile = file.path(storedir, :
>   cannot open destfile 'H:/My_DataSets/BreastCancerDataSet/GSE12790/GSE12790_RAW.tar', reason 'No such file or directory'
> > sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] GEOquery_2.20.8 Biobase_2.14.0 BiocInstaller_1.2.1
>
> loaded via a namespace (and not attached):
> [1] RCurl_1.9-5.1 tools_2.14.0 XML_3.9-4.1
>
> > files <- sapply(gseids[4:10],getGEOSuppFiles)
> Error in dir.create(GEO) : invalid 'path' argument
> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
> Error in function (type, msg, asError = TRUE) :
>   Server denied you to change to the given directory
> > files <- getGEOSuppFiles('GSE9195')
> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_RAW.tar'
> ftp data connection made, file length 658708480 bytes
> opened URL
> downloaded 628.2 Mb
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION.RData'
> ftp data connection made, file length 59288200 bytes
> opened URL
> downloaded 56.5 Mb
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
> Error in download.file(file.path(url, i), destfile = file.path(storedir, :
>   cannot open URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
>
> > Date: Thu, 2 Feb 2012 23:46:59 -0500
> > Subject: Re: [BioC] GEOquery Error
> > From: sdavis2@mail.nih.gov
> > To: ying_chen@live.com
> > CC: bioconductor@r-project.org
> >
> > On Thu, Feb 2, 2012 at 11:37 PM, ying chen <ying_chen@live.com> wrote:
> > > Hi Sean,
> > >
> > > Thanks a lot for the suggestion. I just tried simple test (> files <-
> > > getGEOSuppFiles("GSE23720")) and the problem is gone.
> > >
> > > But when I tried to get a lot files at once, I got the following error
> > > message:
> > >
> > >> gseids
> > >   [1] GSE17907 GSE30010 GSE12790 GSE20711 GSE28821 GSE18864 GSE9195  GSE29431
> > >   [9] GSE14020 GSE7904  GSE18728 GSE15181 GSE16391 GSE12777 GSE23593 GSE22035
> > >  [17] GSE19383 GSE10281 GSE21217 GSE29672 GSE14986 GSE15026 GSE12763 GSE11001
> > >  [25] GSE14017 GSE22513 GSE7515  GSE28796 GSE26910 GSE23994 GSE19639 GSE19697
> > >  [33] GSE15477 GSE10270 GSE3893  GSE13787 GSE11078 GSE8977  GSE21834 GSE6885
> > >  [41] GSE24468 GSE20266 GSE21422 GSE3156  GSE22250 GSE18571 GSE11352 GSE7382
> > >  [49] GSE13806 GSE8565  GSE15619 GSE8597  GSE29832 GSE11791 GSE5102  GSE28645
> > >  [57] GSE32160 GSE28789 GSE18331 GSE23640 GSE23399 GSE9086  GSE22865 GSE26298
> > >  [65] GSE15893 GSE20086 GSE11324 GSE5116  GSE10879 GSE25407 GSE7700  GSE18912
> > >  [73] GSE15043 GSE27515 GSE19777 GSE21832 GSE18070 GSE11506 GSE23921 GSE23905
> > >  [81] GSE32158 GSE28305 GSE25162 GSE28415 GSE9015  GSE6800  GSE6548  GSE32161
> > >  [89] GSE24249 GSE30775 GSE26884 GSE24473 GSE20719 GSE17636 GSE18773 GSE18931
> > >  [97] GSE18146 GSE16070 GSE16080 GSE11683 GSE10046 GSE9747  GSE15749 GSE22664
> > > [105] GSE21066 GSE9586  GSE17832 GSE11330 GSE17889 GSE12199 GSE28089 GSE31448
> > > [113] GSE10810 GSE9196  GSE22840 GSE33658 GSE25487 GSE22544 GSE27220 GSE11581
> > > 120 Levels: GSE10046 GSE10270 GSE10281 GSE10810 GSE10879 GSE11001 ... GSE9747
> > >> files <- sapply(gseids,getGEOSuppFiles,makeDirectory = TRUE, baseDir = getwd()
> > > + )
> > > Error in dir.create(GEO) : invalid 'path' argument
> > > [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE17907/"
> > >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> > >                                  Dload  Upload   Total   Spent    Left  Speed
> > >   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> > > Warning: Failed to create the file
> > > Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE17907/GSE17907_RA
> > > Warning: W.tar: No such file or directory
> > >   0  328M    0  2896    0     0   3027      0 31:34:35 --:--:-- 31:34:35  3415
> > > curl: (23) Failed writing body (0 != 2896)
> > >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> > >                                  Dload  Upload   Total   Spent    Left  Speed
> > >   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> > > Warning: Failed to create the file
> > > Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE17907/filelist.tx
> > > Warning: t: No such file or directory
> > >  24  5979   24  1448    0     0   2495      0  0:00:02 --:--:--  0:00:02  3061
> > > curl: (23) Failed writing body (0 != 1448)
> > > Error in dir.create(GEO) : invalid 'path' argument
> > > In addition: Warning messages:
> > > 1: In download.file(file.path(url, i), destfile = file.path(storedir, :
> > >   download had nonzero exit status
> > > 2: In download.file(file.path(url, i), destfile = file.path(storedir, :
> > >   download had nonzero exit status
> > > [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010/"
> > >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> > >                                  Dload  Upload   Total   Spent    Left  Speed
> > >   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> > > Warning: Failed to create the file
> > > Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE30010/GSE30010_RA
> > > Warning: W.tar: No such file or directory
> > >   0  576M    0  2896    0     0   5191      0 32:22:29 --:--:-- 32:22:29  6464
> > > curl: (23) Failed writing body (0 != 2896)
> > >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> > >                                  Dload  Upload   Total   Spent    Left  Speed
> > >   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> > > Warning: Failed to create the file
> > > Warning: /media/Passport01/My_DataSets/BreastCancerDataSet/GSE30010/GSE30010_di
> > > Warning: scovery_clinical_info.txt.gz: No such file or directory
> > >  81  1785   81  1448    0     0   3009      0 --:--:-- --:--:-- --:--:--  3506
> > >  81  1785   81  1448    0     0   1978      0 --:--:-- --:--:-- --:--:--  1978
> > > curl: (23) Failed writing body (0 != 1448)
> >
> > It is hard to tell for sure, but I think you might be out of disk
> > space locally. When you get the error, check to see if you have space
> > left on the device to which you are saving. GEOquery should work fine
> > in a loop like this.
> >
> > Sean
> >
> > > After I killed this job and tried:
> > >
> > >> file <- getGEOSuppFiles("GSE17907")
> > >
> > > I had no problem at all.
> > >
> > > I really do not know what's wrong with the sapply() setting.
> > >
> > > Any suggestion?
> > >
> > > Thanks a lot for the help!
> > >
> > > Ying
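A note on the `sapply()` failures quoted above: the `gseids` printout ends with `120 Levels: ...`, which means `gseids` is a factor rather than a character vector. `dir.create()` rejects a factor with exactly the `invalid 'path' argument` message seen in the log, and a directory that was never created would also account for the later `No such file or directory` failures. Nobody in the thread confirms this diagnosis, so the following is only a sketch under that assumption. `fetch_supp_files` and its `fetch` argument are hypothetical names introduced here; the `getGEOSuppFiles(GEO, makeDirectory, baseDir)` call is the one used in the thread.

```r
# Sketch: download supplementary files for many GEO series, one at a time.
# `fetch_supp_files` is a hypothetical helper, not part of GEOquery.
# `fetch` defaults to GEOquery::getGEOSuppFiles and is only looked up when used,
# so a stand-in function can be supplied for a dry run.
fetch_supp_files <- function(ids, base_dir = getwd(),
                             fetch = GEOquery::getGEOSuppFiles) {
  # Assumption: the IDs may arrive as a factor (e.g. a data frame column).
  # Convert them to plain character strings before building any paths.
  ids <- unique(as.character(ids))

  results <- lapply(ids, function(id) {
    # Catch per-series errors so one failed download does not stop the loop.
    tryCatch(
      fetch(id, makeDirectory = TRUE, baseDir = base_dir),
      error = function(e) {
        message("Skipping ", id, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- ids
  results
}
```

With GEOquery installed, `files <- fetch_supp_files(gseids)` would replace the `sapply()` call; series that fail are left as `NULL` in the result and can be retried later.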
On Tue, Feb 7, 2012 at 11:42 AM, ying chen <ying_chen@live.com> wrote:
>
> Hi, I tried to retrieve GEO dataset with the GEOquery package as following:
> file <- getGEOSuppFiles('GSE10046')
> But it seems that every raw data file I got by this method is corrupted. For example, when I tried to extract the GSE10046_RAW.tar, I got the following error message: Can not open file "H:\...\GSE10046_RAW.tar" as archive. The GSE10046_RAW.tar I got through GEOquery is 27,433 KB. The same dataset I retrieved from GEO website is 27,350KB and I can extract it with no problem. I had retrieved more than 70 dataset raw files by GEOquery and all have the same problem. Anyone has any suggestion what went wrong? Thanks a lot for the help! Ying
>

Hi, Ying.

I am not able to reproduce your error on either Mac or two flavors of linux. I don't have access to a Windows version of R, but I'll see if I can get access in the next few days to check.

Sorry I can't be more helpful right now.

Sean
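One possible explanation for the Windows-only corruption, offered as an assumption rather than something established in this thread: R's `download.file()` documentation notes that on Windows a text-mode transfer (the default `mode = "w"`) rewrites `\n` as `\r\n`, which makes a binary file slightly larger and unreadable. That would fit the reported sizes (27,433 KB from GEOquery against 27,350 KB from the website). The sketch below shows a binary-mode download and a cheap integrity check; `download_binary` and `is_readable_tar` are hypothetical helper names, and whether GEOquery 2.20.8 passes `mode` itself is not known from the thread.

```r
# Sketch under an unconfirmed assumption: the archive was written in text mode.
# Both helpers are hypothetical names, not part of GEOquery.

download_binary <- function(url, destfile) {
  # mode = "wb" requests a byte-for-byte copy of the remote file.
  status <- utils::download.file(url, destfile, mode = "wb")
  if (status != 0) stop("download failed for ", url)
  invisible(destfile)
}

is_readable_tar <- function(path) {
  # Listing the archive contents is a cheap way to detect a damaged file.
  # tar = "internal" uses R's own reader instead of an external tar program.
  tryCatch(
    length(utils::untar(path, list = TRUE, tar = "internal")) > 0,
    error = function(e) FALSE
  )
}
```

Running `is_readable_tar()` on each downloaded `*_RAW.tar` would flag damaged archives without unpacking them; comparing `file.size()` against the "file length" reported in the download log is another quick check.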
Hi Sean,

Thanks a lot for the help. I switched to ubuntu on virtualbox and now have no problem with raw data retrieved through GEOquery. But I just repeated in Windows 7 with R2.14, and my problem is still there. But now at least I can stick with ubuntu.

Thanks,

Ying

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GEOquery_2.20.8 Biobase_2.14.0

loaded via a namespace (and not attached):
[1] RCurl_1.9-5 XML_3.9-4

> Date: Tue, 7 Feb 2012 14:00:10 -0500
> Subject: Re: [BioC] GEOquery Error : Retrieved files corrupted?
> From: sdavis2@mail.nih.gov
> To: ying_chen@live.com
> CC: bioconductor@r-project.org
>
> Hi, Ying.
>
> I am not able to reproduce your error on either Mac or two flavors of
> linux. I don't have access to a Windows version of R, but I'll see if
> I can get access in the next few days to check.
>
> Sorry I can't be more helpful right now.
>
> Sean
> >> > >
> >> > > Ying
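A possible reading of the repeated `invalid 'path' argument` error, not confirmed anywhere in the thread: the `120 Levels:` line in the quoted output shows that `gseids` is a factor, and `sapply()` hands each element of a factor to the function as a one-element factor rather than a character string, while `dir.create()` accepts only a single character string. The sketch below illustrates this offline. `fetch_one()` is a hypothetical stand-in for `getGEOSuppFiles()` (it only creates a directory), and the three IDs are samples taken from the quoted list.

```r
# Sample IDs stored as a factor, as the "Levels:" line in the thread suggests.
gseids <- factor(c("GSE17907", "GSE30010", "GSE12790"))

# sapply() passes each element of a factor along as a length-one factor.
elem_class <- sapply(gseids, class)

# dir.create() needs one character string; capture the error a factor triggers.
err <- tryCatch(dir.create(gseids[1]), error = function(e) conditionMessage(e))

# Hypothetical offline stand-in for getGEOSuppFiles(): creates <baseDir>/<GEO>.
fetch_one <- function(GEO, baseDir = tempdir()) {
  storedir <- file.path(baseDir, GEO)
  dir.create(storedir, showWarnings = FALSE)
  storedir
}

# Convert to character first, and trap errors so one bad ID does not end the batch.
ids <- as.character(gseids)
results <- lapply(ids, function(id) {
  tryCatch(fetch_one(id), error = function(e) conditionMessage(e))
})
names(results) <- ids
```

With GEOquery loaded, the same pattern would presumably read `lapply(as.character(gseids), getGEOSuppFiles)`. Whether that resolves the failures reported here is an assumption, since nobody in the thread tested it.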
Dear Ying and Sean,

a wild guess based on the problem description that sounds too familiar: corruption of binary files is likely to occur if they are transferred via ftp text mode instead of binary mode from Linux/UNIX to Windows.

Hmmm, but then getGEOSuppFiles() would never have worked on Windows... maybe something has changed recently in GEOquery or the underlying code for file transfer?

Cheers,

- axel

Axel Klenk
Research Informatician
Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil / Switzerland

From: ying chen <ying_chen at live.com>
To: <sdavis2 at mail.nih.gov>
Cc: bioconductor at r-project.org
Date: 07.02.2012 20:18
Subject: Re: [BioC] GEOquery Error : Retrieved files corrupted?
Sent by: bioconductor-bounces at r-project.org

Hi Sean,

Thanks a lot for the help. I switched to ubuntu on virtualbox and now have no problem with raw data retrieved through GEOquery. But I just repeated in Windows 7 with R2.14, and my problem is still there. But now at least I can stick with ubuntu.

Thanks,

Ying

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GEOquery_2.20.8 Biobase_2.14.0
loaded via a namespace (and not attached):
[1] RCurl_1.9-5 XML_3.9-4

> Date: Tue, 7 Feb 2012 14:00:10 -0500
> Subject: Re: [BioC] GEOquery Error : Retrieved files corrupted?
> From: sdavis2 at mail.nih.gov
> To: ying_chen at live.com
> CC: bioconductor at r-project.org
>
> On Tue, Feb 7, 2012 at 11:42 AM, ying chen <ying_chen at live.com> wrote:
> >
> > Hi, I tried to retrieve GEO dataset with the GEOquery package as following:
> > file <- getGEOSuppFiles('GSE10046')
> > But it seems that every raw data file I got by this method is corrupted.
> > For example, when I tried to extract the GSE10046_RAW.tar, I got the
> > following error message: Can not open file "H:\...\GSE10046_RAW.tar" as archive.
> > The GSE10046_RAW.tar I got through GEOquery is 27,433 KB. The same dataset
> > I retrieved from GEO website is 27,350KB and I can extract it with no problem.
> > I had retrieved more than 70 dataset raw files by GEOquery and all have the
> > same problem. Anyone has any suggestion what went wrong?
> > Thanks a lot for the help!
> > Ying
>
> Hi, Ying.
>
> I am not able to reproduce your error on either Mac or two flavors of
> linux. I don't have access to a Windows version of R, but I'll see if
> I can get access in the next few days to check.
>
> Sorry I can't be more helpful right now.
>
> Sean
>
> > > sessionInfo()
> > R version 2.14.0 (2011-10-31)
> > Platform: x86_64-pc-mingw32/x64 (64-bit)
> > locale:
> > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
> > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> > [5] LC_TIME=English_United States.1252
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> > other attached packages:
> > [1] GEOquery_2.20.8 Biobase_2.14.0
> > loaded via a namespace (and not attached):
> > [1] RCurl_1.9-5.1 XML_3.9-4.1
> >>
> >> From: ying_chen at live.com
> >> To: sdavis2 at mail.nih.gov
> >> Date: Mon, 6 Feb 2012 11:35:07 -0500
> >> CC: bioconductor at r-project.org
> >> Subject: Re: [BioC] GEOquery Error
> >>
> >> Hi Sean, Thanks a lot for the help. I checked my computer and I still have
> >> 253GB space left on my hard drive. I tried to retrieve the data over the
> >> weekend, but always had the same problem. I just tried to run it again to
> >> test on 10 gse ids. At first it gave some error message, but finished the
> >> first dataset. Then the program complained about the failure to open the
> >> destfile, which seems odd to me as this is the file the program is supposed
> >> to download. Now it seems to me that I can download dataset one by one using
> >> getGEOSuppFiles, but it always failed if I tried to use sapply with
> >> getGEOSuppFiles to set up to download a list of datasets. Any suggestion?
> >> Thanks a lot for the help! Ying
> >> > files <- sapply(gseids[1:10],getGEOSuppFiles)
> >> Error in dir.create(GEO) : invalid 'path' argument
> >> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010/"
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_RAW.tar'
> >> ftp data connection made, file length 605009920 bytes
> >> opened URL
> >> downloaded 577.0 Mb
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_discovery_clinical_info.txt.gz'
> >> ftp data connection made, file length 1785 bytes
> >> opened URL
> >> downloaded 1785 bytes
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//GSE30010_validation_clinical_info.txt.gz'
> >> ftp data connection made, file length 1681 bytes
> >> opened URL
> >> downloaded 1681 bytes
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE30010//filelist.txt'
> >> ftp data connection made, file length 5871 bytes
> >> opened URL
> >> downloaded 5871 bytes
> >> Error in dir.create(GEO) : invalid 'path' argument
> >> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE12790/"
> >> Error in download.file(file.path(url, i), destfile = file.path(storedir, :
> >>   cannot open destfile 'H:/My_DataSets/BreastCancerDataSet/GSE12790/GSE12790_RAW.tar', reason 'No such file or directory'
> >> > sessionInfo()
> >> R version 2.14.0 (2011-10-31)
> >> Platform: x86_64-pc-mingw32/x64 (64-bit)
> >> locale:
> >> [1] LC_COLLATE=English_United States.1252
> >> [2] LC_CTYPE=English_United States.1252
> >> [3] LC_MONETARY=English_United States.1252
> >> [4] LC_NUMERIC=C
> >> [5] LC_TIME=English_United States.1252
> >> attached base packages:
> >> [1] stats graphics grDevices utils datasets methods base
> >> other attached packages:
> >> [1] GEOquery_2.20.8 Biobase_2.14.0 BiocInstaller_1.2.1
> >> loaded via a namespace (and not attached):
> >> [1] RCurl_1.9-5.1 tools_2.14.0 XML_3.9-4.1
> >> > files <- sapply(gseids[4:10],getGEOSuppFiles)
> >> Error in dir.create(GEO) : invalid 'path' argument
> >> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
> >> Error in function (type, msg, asError = TRUE) :
> >>   Server denied you to change to the given directory
> >> > files <- getGEOSuppFiles('GSE9195')
> >> [1] "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195/"
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_RAW.tar'
> >> ftp data connection made, file length 658708480 bytes
> >> opened URL
> >> downloaded 628.2 Mb
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION.RData'
> >> ftp data connection made, file length 59288200 bytes
> >> opened URL
> >> downloaded 56.5 Mb
> >> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
> >> Error in download.file(file.path(url, i), destfile = file.path(storedir, :
> >>   cannot open URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE9195//GSE9195_TAMVALIDATION_README.txt'
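Axel's text-versus-binary guess can be explored without GEOquery. Base R's `download.file()` has a `mode` argument, and its documentation notes that binary mode (`mode = "wb"`) matters on Windows, where a text-mode transfer rewrites `\n` as `\r\n`; the argument is documented as not used for the "wget" and "curl" methods. Whether GEOquery 2.20.8 passes `mode` is not stated in the thread, so the following is only a sketch of the check, run offline against a local `file://` URL. The byte values and temporary file names are made up for illustration.

```r
# Stand-in for a supplementary archive: a few bytes including LF, CR and Ctrl-Z,
# the kind of content a text-mode write could alter.
src <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x1f, 0x8b, 0x0a, 0x0d, 0x0a, 0x1a, 0x00, 0xff)), src)

# Local file:// URL so the sketch runs offline; a real call would use the GEO URL.
src_url <- paste0("file:///", sub("^/", "", normalizePath(src, winslash = "/")))

# Request a binary write explicitly.
dest <- tempfile(fileext = ".bin")
status <- download.file(src_url, destfile = dest, mode = "wb", quiet = TRUE)

# A faithful transfer keeps both the size and the exact bytes.
same_size  <- file.size(dest) == file.size(src)
same_bytes <- identical(readBin(dest, "raw", n = 100), readBin(src, "raw", n = 100))
```

Applied to the report above, the equivalent check would compare the size of the `GSE10046_RAW.tar` fetched through R with the size of the copy downloaded from the GEO website.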