Error downloading AnnotationHub item AH134
@robert-k-bradley-5997

Hello,

Downloading the AnnotationHub object AH134 (the hg19 genome) currently fails, even though the file at that object's source URL exists and can be downloaded manually. Here is a reproducible example:

> library (AnnotationHub)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply,
    parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
    lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which, which.max, which.min

> hub = AnnotationHub()
snapshotDate(): 2016-10-11

> hub[["AH134"]]
require(“Rsamtools”)
downloading from ‘https://annotationhub.bioconductor.org/fetch/134’
    ‘https://annotationhub.bioconductor.org/fetch/14095’
retrieving 2 resources
Downloading: 240 B    
Downloading: 240 B    
Error: failed to load resource
  name: AH134
  title: Homo_sapiens.GRCh37.69.dna.toplevel.fa
  reason: 2 resources failed to download
In addition: Warning messages:
1: download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/134’
  cache path: ‘/Users/rbradley//.AnnotationHub/134’
  reason: Forbidden (HTTP 403).
2: download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/14095’
  cache path: ‘/Users/rbradley//.AnnotationHub/14095’
  reason: Forbidden (HTTP 403).

> hub["AH134"]$sourceurl
[1] "ftp://ftp.ensembl.org/pub/release-69/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.69.dna.toplevel.fa.gz"

I confirmed that Bioconductor itself and all packages were updated. Here is my session info:

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  grDevices utils     datasets  stats     graphics  methods   base     

other attached packages:
[1] Rsamtools_1.26.0     Biostrings_2.42.0    XVector_0.14.0       GenomicRanges_1.26.0 GenomeInfoDb_1.10.0  IRanges_2.8.0        S4Vectors_0.12.0     AnnotationHub_2.6.0
[9] BiocGenerics_0.20.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7                   AnnotationDbi_1.36.0          zlibbioc_1.20.0               BiocParallel_1.8.0            xtable_1.8-2                 
 [6] R6_2.2.0                      httr_1.2.1                    Biobase_2.34.0                DBI_0.5-1                     htmltools_0.3.5              
[11] digest_0.6.10                 interactiveDisplayBase_1.12.0 shiny_0.14.1                  bitops_1.0-6                  curl_2.1                     
[16] RSQLite_1.0.0                 mime_0.5                      BiocInstaller_1.24.0          httpuv_1.3.3                 

 

@valerie-obenchain-4275

Hi Robert,

Dan helped me sort this one out. It's due to an error I made a few months ago, when I accidentally deleted a few Ensembl 69 FASTA files from the hub.

Take one of the problem URLs:

https://annotationhub.bioconductor.org/fetch/134

Running

curl -I https://annotationhub.bioconductor.org/fetch/134

shows that it redirects to:

http://s3.amazonaws.com/annotationhub/ensembl/release-69/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.69.dna.toplevel.fa.rz

However, that folder in S3 contains no file with an .rz extension, so the redirect points at an object that does not exist, which would explain the HTTP 403 errors. Processing these FASTA files involves downloading the .gz, unzipping, indexing, and recompressing as .rz. In our haste to replace the deleted files, we uploaded the .gz files as replacements instead of the .rz files.
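
The redirect check above can also be scripted. The following is a minimal sketch, not part of the hub's own tooling: `redirect_target` is a hypothetical helper name, and it assumes raw HTTP response headers (such as the output of curl -I) arrive on standard input.

```shell
#!/bin/sh
# redirect_target: hypothetical helper (not part of curl or AnnotationHub).
# Reads raw HTTP response headers on stdin and prints the value of the
# first Location header. Prints nothing if there is no Location header.
redirect_target() {
    awk '
        # Header names are case-insensitive, so test a lowercased copy.
        tolower($0) ~ /^location:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the header name and leading blanks
            sub(/\r$/, "")             # drop the trailing carriage return
            print
            exit
        }
    '
}
```

It could be used as `curl -sI https://annotationhub.bioconductor.org/fetch/134 | redirect_target`, after which the printed URL can be compared against the actual contents of the bucket folder.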

I will rebuild the .rz files for Ensembl 69, but it probably won't happen until next week. Do you need version 69? None of the versions >= 70 have this problem. GRCh37 dna toplevel is available in several other versions ...

Valerie


Hi Valerie,

Thank you for looking into this so quickly. I'd like to use the version 69 file if possible in order to ensure continuity with previous analyses that we've conducted using that version. (I know that the assembly should presumably be stable w.r.t. the version, but just to be sure I'll stick with version 69.) I have a backup copy of the downloaded AH134 file that I can use until the fix is available.

Thank you again for the quick help, and for creating such a fantastic resource for the community.

Best,

Rob


I've regenerated the release 69 fastas and "AH134" is available again. Let me know if you run into other problems.

Thanks for your patience!

Valerie


Wonderful. Thank you!
