Problem fetching resources from AnnotationHub, error 403
Johannes Rainer ★ 2.0k
@johannes-rainer-6987
Last seen 6 weeks ago
Italy

Dear all,

I'm currently struggling to fetch data from AnnotationHub:

> library(AnnotationHub)
> ah <- AnnotationHub()
snapshotDate(): 2015-08-26
> query(ah, c("Homo sapiens", "release-81"))
AnnotationHub with 7 records
# snapshotDate(): 2015-08-26
# $dataprovider: Ensembl
# $species: Homo sapiens
# $rdataclass: FaFile, GRanges
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,
#   sourcetype
# retrieve records with, e.g., 'object[["AH47963"]]'

            title                                 
  AH47963 | Homo_sapiens.GRCh38.81.gtf            
  AH49183 | Homo_sapiens.GRCh38.cdna.all.fa       
  AH49184 | Homo_sapiens.GRCh38.dna_rm.toplevel.fa
  AH49185 | Homo_sapiens.GRCh38.dna_sm.toplevel.fa
  AH49186 | Homo_sapiens.GRCh38.dna.toplevel.fa   
  AH49187 | Homo_sapiens.GRCh38.ncrna.fa          
  AH49188 | Homo_sapiens.GRCh38.pep.all.fa        

> Dna <- ah[["AH49186"]]
downloading from ‘https://annotationhub.bioconductor.org/fetch/55651’
    ‘https://annotationhub.bioconductor.org/fetch/55652’
retrieving 2 resources
Downloading: 240 B     
Downloading: 240 B     
Error: failed to load 'AnnotationHub' resource
  name: AH49186
  title: Homo_sapiens.GRCh38.dna.toplevel.fa
  reason: 2 resources failed to download
In addition: There were 50 or more warnings (use warnings() to see the first 50)


and some lines from the warnings:

31: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
32: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
33: download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/55651’
  cache path: ‘/Users/jo/~/.AnnotationHub/55651’
  reason: client error: (403) Forbidden
34: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean

My R:

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin15.0.0/x86_64 (64-bit)
Running under: OS X 10.11.2 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] Rsamtools_1.22.0     Biostrings_2.38.0    XVector_0.10.0      
[4] GenomicRanges_1.22.0 GenomeInfoDb_1.6.1   IRanges_2.4.1       
[7] S4Vectors_0.8.0      AnnotationHub_2.2.1  BiocGenerics_0.16.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.1                  AnnotationDbi_1.32.0        
 [3] magrittr_1.5                 zlibbioc_1.16.0             
 [5] BiocParallel_1.4.0           xtable_1.8-0                
 [7] R6_2.1.1                     stringr_1.0.0               
 [9] httr_1.0.0                   tools_3.2.2                 
[11] Biobase_2.30.0               DBI_0.3.1                   
[13] lambda.r_1.1.7               futile.logger_1.4.1         
[15] htmltools_0.2.6              digest_0.6.8                
[17] interactiveDisplayBase_1.8.0 shiny_0.12.2                
[19] futile.options_1.0.0         bitops_1.0-6                
[21] curl_0.9.3                   RSQLite_1.0.0               
[23] mime_0.4                     stringi_1.0-1               
[25] BiocInstaller_1.20.0         httpuv_1.3.3    

Could it be that the servers are down or not accessible?

Thanks, jo


This seems to be specific to the genome FASTA files of this Ensembl release: I can fetch the GTF, and I can also fetch the dna.toplevel.fa file for Ensembl 80. Could the genome FASTA files for Ensembl 81 be corrupt?
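One way to narrow down which records are affected is to attempt each download inside tryCatch() and collect the failures. This is only a minimal sketch, not an AnnotationHub feature: try_fetch_all() and its fetch argument are hypothetical names introduced here, and fake_fetch() stands in for the real hub so the example runs offline. It simply mimics the behaviour reported in this thread (the GTF record loads, the dna.toplevel.fa record fails with a 403).

```r
# Hypothetical helper (not part of AnnotationHub): try each ID and
# record whether the fetch succeeded, plus the error message if not.
try_fetch_all <- function(ids, fetch) {
    results <- lapply(ids, function(id) {
        tryCatch({
            fetch(id)
            data.frame(id = id, ok = TRUE, reason = NA_character_,
                       stringsAsFactors = FALSE)
        }, error = function(e) {
            data.frame(id = id, ok = FALSE, reason = conditionMessage(e),
                       stringsAsFactors = FALSE)
        })
    })
    do.call(rbind, results)
}

# Offline stand-in for the hub, mirroring what was observed above.
fake_fetch <- function(id) {
    if (id == "AH49186") stop("client error: (403) Forbidden")
    invisible(id)
}

status <- try_fetch_all(c("AH47963", "AH49186"), fake_fetch)
print(status)
```

Against a real hub one would pass something like function(id) ah[[id]] as the fetch argument. Keep in mind that this downloads every record that does work, and the genome FASTA files are large.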
 


Yes, some of the Ensembl 81 FASTA files were not uploaded successfully; this will be addressed. I wonder if these are actually different from the Ensembl 80 files?

We've also been working on representing these differently, as 2bit files for more robust, compressed manipulation. Any thoughts?


Great, thanks!

Indeed, they might be the same as the ones in Ensembl 80... I haven't checked.


By the way, are there plans to include more recent Ensembl releases too?


Yes, we had hoped to stay current more-or-less immediately, but the hiccup above and other issues have distracted us.


Hi Martin,

My understanding is that, for a given organism, the FASTA file changes only when the reference genome build changes. So at each new Ensembl release, the FASTA files change only for those organisms for which Ensembl has adopted a new genome build. Some organisms have a very stable reference genome (a new build only every 4 or 5 years), but others don't (a new build every year or more often).

H.
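The point above suggests a quick check before assuming that a FASTA file differs between releases: compare the genome builds. A minimal sketch, assuming the build name is the second dot-separated field of the file title, as in the listing at the top of this thread. build_from_title() is a hypothetical helper, and the assumption breaks for build names that themselves contain a dot.

```r
# Hypothetical helper: take the second dot-separated field of an
# Ensembl-style file title as the genome build name.
# Returns NA when the title has fewer than two fields.
build_from_title <- function(title) {
    fields <- strsplit(title, ".", fixed = TRUE)
    vapply(fields,
           function(x) if (length(x) >= 2) x[2] else NA_character_,
           character(1))
}

# Titles copied from the query output earlier in the thread.
titles <- c("Homo_sapiens.GRCh38.81.gtf",
            "Homo_sapiens.GRCh38.dna.toplevel.fa")
builds <- build_from_title(titles)

# TRUE when all titles refer to the same build.
same_build <- length(unique(builds)) == 1
print(builds)
print(same_build)
```

To compare two releases, one would apply the same function to the titles of the corresponding records from each release. The genome column in the hub metadata is probably a more reliable source for the build than parsing titles.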

@valerie-obenchain-4275
Last seen 2.2 years ago
United States

Hi,

Thanks for reporting this. The problem was that metadata for all records were inserted into the database, but not all data files were pushed to their final (S3 bucket) location. This has been fixed, and all Ensembl 81 FASTA files should now be available in release and devel.

Valerie
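The failure mode described here, metadata present without the backing file, is the kind of thing a consistency check can catch. A minimal sketch with partly made-up inputs: find_missing_files(), the metadata data frame and the uploaded vector are hypothetical and do not reflect how the AnnotationHub server is actually implemented. The fetch IDs 55651 and 55652 come from the error output above; the "AH_example" row is invented purely for contrast.

```r
# Hypothetical check: return the metadata rows whose fetch ID has no
# corresponding uploaded file.
find_missing_files <- function(metadata, uploaded) {
    metadata[!metadata$fetch_id %in% uploaded, , drop = FALSE]
}

# Illustrative inputs; "AH_example" / "00000" is a made-up record.
metadata <- data.frame(
    ah_id    = c("AH49186", "AH49186", "AH_example"),
    fetch_id = c("55651", "55652", "00000"),
    stringsAsFactors = FALSE
)
uploaded <- c("00000")  # assumed set of files present in storage

missing <- find_missing_files(metadata, uploaded)
print(missing)
```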


that rocks! thanks!

cheers, jo
