AnnotationForge: makeOrgPackageFromNCBI Error in function (type, msg, asError = TRUE) : FTP response timeout
Imad • 0
@imad-24957

I ran makeOrgPackageFromNCBI to create an annotation package. The following files were downloaded: [1] gene2pubmed.gz [2] gene2accession.gz [3] gene2refseq.gz [4] gene_info.gz [5] gene2go.gz


makeOrgPackageFromNCBI(version = "0.1",
                       author = "Some One <so@someplace.org>",
                       maintainer = "Some One <so@someplace.org>",
                       outputDir = ".",
                       tax_id = "7137",
                       genus = "Galleria",
                       species = "Galleria mellonella",
                       rebuildCache = TRUE)
Output:
If files are not cached locally this may take awhile to assemble a 12 GB cache databse in the NCBIFilesDir directory. Subsequent calls to this function should be faster (seconds). The cache will try to rebuild once per day.
preparing data from NCBI ...
starting download for 
[1] gene2pubmed.gz
[2] gene2accession.gz
[3] gene2refseq.gz
[4] gene_info.gz
[5] gene2go.gz
getting data for gene2pubmed.gz
rebuilding the cache
extracting data for our organism from : gene2pubmed
getting data for gene2accession.gz
rebuilding the cache
extracting data for our organism from : gene2accession
getting data for gene2refseq.gz
rebuilding the cache
extracting data for our organism from : gene2refseq
getting data for gene_info.gz
rebuilding the cache
extracting data for our organism from : gene_info
getting data for gene2go.gz
rebuilding the cache
extracting data for our organism from : gene2go
processing gene2pubmed
processing gene_info: chromosomes
processing gene_info: description
processing alias data
processing refseq data
processing accession data
processing GO data
Error in function (type, msg, asError = TRUE)  : FTP response timeout
In addition: Warning messages:
1: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
2: call dbDisconnect() when finished working with a connection 
3: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
4: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
5: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
6: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
[1] AnnotationForge_1.32.0 AnnotationDbi_1.52.0   IRanges_2.24.1        
[4] S4Vectors_0.28.1       Biobase_2.50.0         BiocGenerics_0.36.0   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6          XML_3.99-0.5        bitops_1.0-6       
 [4] DBI_1.1.1           RSQLite_2.2.3       cachem_1.0.4       
 [7] rlang_0.4.10        blob_1.2.1          vctrs_0.3.6        
[10] tools_4.0.2         bit64_4.0.5         RCurl_1.98-1.2     
[13] bit_4.0.4           fastmap_1.1.0       yaml_2.2.1         
[16] compiler_4.0.2      pkgconfig_2.0.3     BiocManager_1.30.10
[19] memoise_2.0.0      
>

Mine is still running after ~12 hours but is stalled on the 'processing GO data' step. An NCBI.sqlite file of ~32 GB has been prepared, along with all of the other typical files (gene2accession.gz, gene2go.gz, etc.). I'll let you know if it ever finishes or returns a time-out error.


Okay, in my case, I ran out of memory, but I never received any FTP timeout error. So, it should finish eventually.

@james-w-macdonald-5106

The comments from .downloadAndPopulateAltGOData might be instructive here.

.downloadAndPopulateAltGOData <-
    function(NCBIcon, NCBIFilesDir, rebuildCache)
{
    dest <- file.path(NCBIFilesDir, "idmapping_selected.tab.gz")
    if (rebuildCache) {
        #  This url has been flaky in the past
        #  See https://www.uniprot.org/downloads
        #  Troublshooting in the past involved temporarily changing this url
        #     to use the https protcol url:
        #     https://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/idmapping/idmapping_selected.tab.gz
        url <- "ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping_selected.tab.gz"
        loadNamespace("RCurl")
        f <- RCurl::CFILE(dest, mode="wb")
        RCurl::curlPerform(url = url, writedata = f@ref)

So you may be hanging on the FTP download. You could try one of two things. First, download that idmapping_selected.tab.gz file yourself, put it in your NCBIFilesDir, and then use rebuildCache = FALSE so you don't re-download all that stuff. Second, you could get the sources for AnnotationForge, change the URL to the https expasy one, install, and try again. I'll leave it to the reader to decide which sounds like less work. ;-D
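
For the first option, something like the following might do as a rough sketch. The helper name fetch_idmapping and its arguments are made up for illustration and are not part of AnnotationForge; the URL is the https one from the source comment above; the long timeout is a guess, and whether options(timeout = ...) is honoured depends on the download method your R uses.

```r
# Hypothetical helper (not part of AnnotationForge): put idmapping_selected.tab.gz
# into the directory you pass to makeOrgPackageFromNCBI() as NCBIFilesDir.
fetch_idmapping <- function(files_dir = getwd(),
                            base_url = "https://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/idmapping",
                            timeout_sec = 6 * 60 * 60) {
    fname <- "idmapping_selected.tab.gz"
    dest <- file.path(files_dir, fname)

    # Nothing to do if the file is already there
    if (file.exists(dest)) return(dest)

    # The file is large; raise the download timeout for this call only
    old <- options(timeout = timeout_sec)
    on.exit(options(old), add = TRUE)

    # Download to a temporary name so an interrupted transfer is not
    # mistaken for a complete file on the next attempt
    part <- paste0(dest, ".part")
    status <- download.file(paste(base_url, fname, sep = "/"),
                            destfile = part, mode = "wb")
    if (status != 0) stop("download failed with status ", status)
    if (!file.rename(part, dest)) stop("could not move ", part, " to ", dest)
    dest
}
```

After fetch_idmapping() returns, rerun the original makeOrgPackageFromNCBI() call from the same directory with rebuildCache = FALSE.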
