AnnotationForge: makeOrgPackageFromNCBI Error in function (type, msg, asError = TRUE) : FTP response timeout
Imad • 0

I ran makeOrgPackageFromNCBI to create an annotation package. The following files were downloaded: [1] gene2pubmed.gz [2] gene2accession.gz [3] gene2refseq.gz [4] gene_info.gz [5] gene2go.gz

makeOrgPackageFromNCBI(version = "0.1",
                       author = "Some One <>",
                       maintainer = "Some One <>",
                       outputDir = ".",
                       tax_id = "7137",
                       genus = "Galleria",
                       species = "Galleria mellonella",
                       rebuildCache = TRUE)
If files are not cached locally this may take awhile to assemble a 12 GB cache databse in the NCBIFilesDir directory. Subsequent calls to this function should be faster (seconds). The cache will try to rebuild once per day.
preparing data from NCBI ...
starting download for 
[1] gene2pubmed.gz
[2] gene2accession.gz
[3] gene2refseq.gz
[4] gene_info.gz
[5] gene2go.gz
getting data for gene2pubmed.gz
rebuilding the cache
extracting data for our organism from : gene2pubmed
getting data for gene2accession.gz
rebuilding the cache
extracting data for our organism from : gene2accession
getting data for gene2refseq.gz
rebuilding the cache
extracting data for our organism from : gene2refseq
getting data for gene_info.gz
rebuilding the cache
extracting data for our organism from : gene_info
getting data for gene2go.gz
rebuilding the cache
extracting data for our organism from : gene2go
processing gene2pubmed
processing gene_info: chromosomes
processing gene_info: description
processing alias data
processing refseq data
processing accession data
processing GO data
Error in function (type, msg, asError = TRUE)  : FTP response timeout
In addition: Warning messages:
1: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
2: call dbDisconnect() when finished working with a connection 
3: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
4: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
5: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
6: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
[1] AnnotationForge_1.32.0 AnnotationDbi_1.52.0   IRanges_2.24.1        
[4] S4Vectors_0.28.1       Biobase_2.50.0         BiocGenerics_0.36.0   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6          XML_3.99-0.5        bitops_1.0-6       
 [4] DBI_1.1.1           RSQLite_2.2.3       cachem_1.0.4       
 [7] rlang_0.4.10        blob_1.2.1          vctrs_0.3.6        
[10] tools_4.0.2         bit64_4.0.5         RCurl_1.98-1.2     
[13] bit_4.0.4           fastmap_1.1.0       yaml_2.2.1         
[16] compiler_4.0.2      pkgconfig_2.0.3     BiocManager_1.30.10
[19] memoise_2.0.0      
Mine is still running after ~12 hours but is stalled on the 'processing GO data' step. An NCBI.sqlite file of ~32 GB has been prepared, along with all of the other typical files (gene2accession.gz, gene2go.gz, et cetera). I'll let you know if it ever finishes or returns a timeout error.

Okay, in my case, I ran out of memory, but I never received any FTP timeout error. So, it should finish eventually.


The comments from .downloadAndPopulateAltGOData might be instructive here.

.downloadAndPopulateAltGOData <-
    function(NCBIcon, NCBIFilesDir, rebuildCache)
{
    dest <- file.path(NCBIFilesDir, "")
    if (rebuildCache) {
        #  This url has been flaky in the past
        #  See
        #  Troubleshooting in the past involved temporarily changing this url
        #     to use the https protocol url:
        url <- ""
        f <- RCurl::CFILE(dest, mode="wb")
        RCurl::curlPerform(url = url, writedata = f@ref)

So you may be hanging on the FTP download. You could try one of two things. First, download that file yourself into NCBIFilesDir and then call the function with rebuildCache = FALSE so you don't re-download everything else. Second, get the sources for AnnotationForge, change the URL to the expasy site, reinstall, and try again. I'll leave it to the reader to decide which sounds like less work. ;-D
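
For the first option, here is a rough sketch of fetching a single file with a longer timeout and a few retries, using only base R's download.file(). The helper name fetch_with_retry, its arguments, and the placeholder URL and file name are my own assumptions and are not part of AnnotationForge. Take the real URL and destination file name from the package source of .downloadAndPopulateAltGOData.

```r
# Hypothetical helper (not part of AnnotationForge): download one file,
# raising R's download timeout and retrying a few times on failure.
fetch_with_retry <- function(url, dest, tries = 3, timeout_sec = 3600) {
    # download.file() honours the "timeout" option (in seconds)
    old <- options(timeout = timeout_sec)
    on.exit(options(old), add = TRUE)

    for (i in seq_len(tries)) {
        status <- tryCatch(
            utils::download.file(url, destfile = dest, mode = "wb", quiet = TRUE),
            error = function(e) {
                message("attempt ", i, " failed: ", conditionMessage(e))
                1L  # treat an error like a non-zero status
            }
        )
        # download.file() returns 0 on success
        if (isTRUE(status == 0) && file.exists(dest)) {
            return(invisible(dest))
        }
    }
    stop("could not download ", url, " after ", tries, " attempts")
}

# Usage sketch (commented out; the URL and file name are placeholders):
# files_dir <- "."
# fetch_with_retry("<URL from the package source>",
#                  file.path(files_dir, "<file name from the package source>"))
# makeOrgPackageFromNCBI(version = "0.1",
#                        author = "Some One <>",
#                        maintainer = "Some One <>",
#                        outputDir = ".",
#                        tax_id = "7137",
#                        genus = "Galleria",
#                        species = "Galleria mellonella",
#                        NCBIFilesDir = files_dir,
#                        rebuildCache = FALSE)
```

Whether rebuildCache = FALSE then picks up the pre-downloaded file depends on the file landing at exactly the path the function builds in dest, so check that against your installed version before starting a long run.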


