Tutorial:Patch for GEOquery package error: getGEO Error in download.file, cannot open destfile
@martinguerrerog89-14839
Last seen 5.7 years ago

Recently the getGEO function of the package GEOquery suddenly started throwing the following error:

 

    gse=getGEO("GSE106977",GSEMatrix=TRUE)

    #https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/
    #OK
    #Found 2 file(s)
    #/geo/series/GSE106nnn/GSE106977/
    #Error in download.file(sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/%s",  :
    #cannot open destfile 'C:\Users\Marti\AppData\Local\Temp\Rtmp8cEqMH//geo/series/GSE106nnn/GSE106977', reason 'No such file or directory'

After some debugging I found that the error comes from the hidden (non-exported) function getAndParseGSEMatrices.

To work around this, I wrote a patch that fixes the issue.
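To illustrate what seems to go wrong, here is a minimal sketch in base R (no network access, GEOquery not needed). It assumes, based only on the error output above, that the directory listing returns two entries and that the first one is a directory link rather than a matrix file. The `listing` vector is a hypothetical stand-in for the real listing; the `stub` line is the same regular expression the function uses.

```r
# Build the series directory name the same way the function does:
# the last 1-3 digits of the accession are replaced by "nnn".
GEO  <- toupper("gse106977")
stub <- gsub("\\d{1,3}$", "nnn", GEO, perl = TRUE)

# Hypothetical listing, reconstructed from the messages in the error output:
# entry 1 is a directory link, entry 2 is the actual matrix file.
listing <- c("/geo/series/GSE106nnn/GSE106977/",
             "GSE106977_series_matrix.txt.gz")

destdir <- tempdir()

# Using entry 1 as a file name yields a path inside folders that do not exist,
# which matches the "cannot open destfile" error.
bad  <- file.path(destdir, listing[1])
# Dropping entry 1 leaves only the real file, written directly into destdir.
good <- file.path(destdir, listing[-1])

bad_parent_exists  <- dir.exists(dirname(bad))
good_parent_exists <- dir.exists(dirname(good))
print(c(bad = bad_parent_exists, good = good_parent_exists))
```

If that assumption holds, skipping the first listing entry is enough for the download to succeed, and that is the only behavioural change in the patch.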

First, copy this code into RStudio or WordPad:

    getAndParseGSEMatrices=function (GEO, destdir, AnnotGPL, getGPL = TRUE) 
    {
      GEO <- toupper(GEO)
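      # Replace the last 1-3 digits with "nnn", e.g. GSE106977 -> GSE106nnn
      stub = gsub("\\d{1,3}$", "nnn", GEO, perl = TRUE)
      gdsurl <- "https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/"
      b = getDirListing(sprintf(gdsurl, stub, GEO))
      # The patch: drop the first listing entry, which is a directory link, not a matrix file
      b=b[-1]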
      message(sprintf("Found %d file(s)", length(b)))
      ret <- list()
      for (i in seq_along(b)) {
        message(b[i])
        destfile = file.path(destdir, b[i])
        if (file.exists(destfile)) {
          message(sprintf("Using locally cached version: %s", destfile))
        }
        else {
          download.file(sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/%s",
                                stub, GEO, b[i]), destfile = destfile, mode = "wb", 
                        method = getOption("download.file.method.GEOquery"))
        }
        ret[[b[i]]] <- parseGSEMatrix(destfile, destdir = destdir, 
                                      AnnotGPL = AnnotGPL, getGPL = getGPL)$eset
      }
      return(ret)
    }
    environment(getAndParseGSEMatrices)<-asNamespace("GEOquery")
    assignInNamespace("getAndParseGSEMatrices", getAndParseGSEMatrices, ns="GEOquery")

Save the file as GEOpatch.R in your working directory. Then, whenever you load the GEOquery library, source the saved file:

    library(GEOquery)
    source("GEOpatch.R")

Now getGEO should work again:

    gse=getGEO("GSE106977",GSEMatrix=TRUE)
    #https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/
    #OK
    #Found 1 file(s)
    #GSE106977_series_matrix.txt.gz
    #trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/GSE106977_series_matrix.txt.gz'
    #Content type 'application/x-gzip' length 32707196 bytes (31.2 MB)

 

    sessionInfo()
    R version 3.3.2 (2016-10-31)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 16299)

    locale:
    [1] LC_COLLATE=English_United States.1252 
    [2] LC_CTYPE=English_United States.1252   
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C                          
    [5] LC_TIME=English_United States.1252    

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     

    other attached packages:
    [1] GEOquery_2.40.0      Biobase_2.34.0       BiocGenerics_0.20.0 
    [4] BiocInstaller_1.24.0

    loaded via a namespace (and not attached):
    [1] httr_1.3.1      R6_2.2.2        tools_3.3.2     RCurl_1.95-4.10
    [5] bitops_1.0-6    XML_3.98-1.9   

Can you post the output of `sessionInfo()`? I suspect you are using an outdated version of GEOquery. 


While it can be tempting to suggest that the community cut and paste a "patch" into a seemingly problematic package, please refrain from doing so: it negates the benefits of version control, including reproducibility in the software workflow, and sidesteps the Bioconductor testing and checking mechanisms. In this case, the best approach is to note the bug either on this site or, better, for GEOquery, to file an issue on GitHub. If you have code to contribute, "pull requests" are more than welcome.


You are right, @Sean. I did this in the meantime to help some co-workers who had the issue too, even after a fresh install of the package. I will report the issue in the GitHub repository and send a pull request. I have also updated the sessionInfo() output, my bad!

Great package by the way!!

 


Thanks for the kind words. If, after you have updated to R-3.4.3, you still see the error, definitely file an issue. I suspect that upgrading R and then re-installing the current release version of GEOquery will fix the issue. In general, we do not fix outdated versions of Bioconductor but, instead, recommend that folks upgrade.
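Before deciding whether to upgrade or to file an issue, it can help to check which versions are actually installed. The following is a small sketch using only base R functions; the 3.4.3 threshold is taken from the reply above, and the variable names are just illustrative.

```r
# Compare the running R version against the 3.4.3 mentioned in the reply above.
r_ok <- getRversion() >= "3.4.3"
message("R ", as.character(getRversion()),
        if (r_ok) " (3.4.3 or newer)" else " (older than 3.4.3)")

# Look up the installed GEOquery version, if any, without attaching the package.
geoquery_version <- if (requireNamespace("GEOquery", quietly = TRUE)) {
  as.character(utils::packageVersion("GEOquery"))
} else {
  NA_character_  # GEOquery is not installed in this library
}
message("GEOquery: ", geoquery_version)
```

If the reported versions are older than the current Bioconductor release, upgrading R and re-installing GEOquery is the first thing to try, as suggested above.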

@juanmafernandezm86-14840
Last seen 2.6 years ago
Argentina

Very useful, it worked!
