failed to load 'AnnotationHub' resource
elhananby ▴ 10
@elhananby-9276

Hey,

Every time I try to load a resource from AnnotationHub (following this guide), I get this error:

downloading from ‘https://annotationhub.bioconductor.org/fetch/52446’
    ‘https://annotationhub.bioconductor.org/fetch/52447’
retrieving 2 resources
Error: failed to load 'AnnotationHub' resource
  name: AH47004
  title: clinvar.vcf.gz from dbSNP, GRCh37 assembly
  reason: 2 resources failed to download
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean

The same warning is repeated 50 times. Any idea what the problem could be?

Thanks

annotationhub annotation error • 3.9k views
@martin-morgan-1513

The first part of the answer is that the warnings come from a bug in the currently released version of the httr package. Update to the version available on GitHub:

biocLite("hadley/httr")

(this might require installing 'devtools' first: biocLite("devtools")).

The attempt to access the resource will then fail with a more informative message:

> hub[['AH47004']]
require("VariantAnnotation")
downloading from 'https://annotationhub.bioconductor.org/fetch/52446'
    'https://annotationhub.bioconductor.org/fetch/52447'
retrieving 2 resources
Error: failed to load 'AnnotationHub' resource
  name: AH47004
  title: clinvar.vcf.gz from dbSNP, GRCh37 assembly
  reason: 2 resources failed to download
In addition: Warning messages:
1: download failed
  hub path: 'https://annotationhub.bioconductor.org/fetch/52446'
  cache path: '/home/mtmorgan/.AnnotationHub/52446'
  reason: Access denied to remote resource 
2: download failed
  hub path: 'https://annotationhub.bioconductor.org/fetch/52447'
  cache path: '/home/mtmorgan/.AnnotationHub/52447'
  reason: Access denied to remote resource 

AnnotationHub forwards the request to the original source

> hub['AH47004']$sourceurl
[1] "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/clinical_vcf_set/clinvar.vcf.gz"

and visiting that URL manually, e.g., in a web browser, shows that the resource is no longer available.

The workflow could be updated with other AnnotationHub resources. For instance

> mcols(query(hub, "clinvar.vcf"))[,"sourceurl", drop=FALSE]
DataFrame with 3 rows and 1 column
                                                                                                 sourceurl
                                                                                               <character>
AH47004                ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/clinical_vcf_set/clinvar.vcf.gz
AH47022 ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh37p13/VCF/clinical_vcf_set/clinvar.vcf.gz
AH47030    ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh38/VCF/clinical_vcf_set/clinvar.vcf.gz

The latter two resources are still available. We will investigate whether the first is permanently no longer available, and if so update the hub to mark these records as no longer available, and will update the workflow to use alternative resources. Thanks for the heads up; we would like this process of identifying old resources to be more automatic.
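Until the hub records are cleaned up, a script can fall back across candidate records itself. The following is a minimal sketch, not part of AnnotationHub: `fetch_first_available` and `fake_fetch` are hypothetical names introduced here for illustration. In a live session the `fetch` argument would presumably be something like `function(id) hub[[id]]`; a stand-in is used so the example runs offline.

```r
# Hypothetical helper (not an AnnotationHub function): try each candidate
# ID in turn and return the first one that `fetch` retrieves without error.
fetch_first_available <- function(ids, fetch) {
    for (id in ids) {
        result <- tryCatch(
            list(id = id, value = fetch(id)),
            error = function(e) {
                # Report the failure and signal "try the next ID" with NULL
                message("skipping ", id, ": ", conditionMessage(e))
                NULL
            }
        )
        if (!is.null(result))
            return(result)
    }
    stop("none of the candidate resources could be retrieved")
}

# Stand-in for `function(id) hub[[id]]` (an assumption for illustration):
# it pretends AH47004 is gone and every other record downloads.
fake_fetch <- function(id) {
    if (id == "AH47004")
        stop("2 resources failed to download")
    paste("contents of", id)
}

res <- fetch_first_available(c("AH47004", "AH47022", "AH47030"), fake_fetch)
res$id
```

The helper stops at the first record that loads, so the order of `ids` expresses preference.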

 

elhananby ▴ 10
@elhananby-9276

Hey Martin, thanks for the answer!

But now I have a different problem - when running:

AnnotationHub()[['AH47030']]

I get this error:

snapshotDate(): 2015-11-19
downloading from ‘https://annotationhub.bioconductor.org/fetch/52498’
    ‘https://annotationhub.bioconductor.org/fetch/52499’
retrieving 2 resources
Error: failed to load 'AnnotationHub' resource
  name: AH47030
  title: clinvar.vcf.gz from dbSNP, GRCh38_b142 assembly
  reason: 2 resources failed to download
In addition: There were 15 warnings (use warnings() to see them)

Followed by:

Warning messages:
1: 'AnnotationHub' database may not be current
  database: ‘C:/Users/ElhananBY/Documents/AppData/.AnnotationHub/annotationhub.sqlite3’
  reason: Problem with the SSL CA cert (path? access rights?)
2: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
...
7: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
8: download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/52498’
  cache path: ‘C:/Users/ElhananBY/Documents/AppData/.AnnotationHub/52498’
  reason: Problem with the SSL CA cert (path? access rights?)
9: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
...
14: In curl::curl_fetch_disk(url, x$path, handle = handle) :
  progress callback must return boolean
15: download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/52499’
  cache path: ‘C:/Users/ElhananBY/Documents/AppData/.AnnotationHub/52499’
  reason: Problem with the SSL CA cert (path? access rights?)

Also, when assigning AnnotationHub() to a variable:

> hub = AnnotationHub()
Warning message:
'AnnotationHub' database may not be current
  database: ‘C:/Users/ElhananBY/Documents/AppData/.AnnotationHub/annotationhub.sqlite3’
  reason: Problem with the SSL CA cert (path? access rights?) 

Further, the last two sources don't work for me:

AH47022 ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh37p13/VCF/clinical_vcf_set/clinvar.vcf.gz
AH47030    ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh38/VCF/clinical_vcf_set/clinvar.vcf.gz

The address seems to be wrong - it shouldn't contain "clinical_vcf_set", but rather just be:

ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh38/VCF/clinvar-latest.vcf.gz

So for now I manually downloaded the vcf.gz file and read it in with readVcf, which seems to work.
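For anyone taking the same route, here is a rough sketch of the manual workaround. The local file name is an assumption (use whatever path you saved the download to), and the genome label is taken from the GRCh38 URL above. The guard lets the snippet run harmlessly when the package or the file is missing.

```r
# Sketch only: the file name is an assumption; point it at your own download.
vcf_path <- "clinvar-latest.vcf.gz"

if (requireNamespace("VariantAnnotation", quietly = TRUE) &&
    file.exists(vcf_path)) {
    # readVcf() takes the file path and a genome label for the result
    vcf <- VariantAnnotation::readVcf(vcf_path, genome = "GRCh38")
    print(vcf)
} else {
    message("VariantAnnotation or ", vcf_path, " not available; nothing read")
}
```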

 

Thanks

 

 


Just a follow-up to let you know this is fixed in AnnotationHub 2.3.16 in devel and 2.2.5 in release. Both should be available via biocLite() tomorrow around noon PST, or immediately from svn:

svn co https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/AnnotationHub

svn co https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_2/madman/Rpacks/AnnotationHub

Records for the old files (moved, changed, no longer available, etc.) have been removed and replacements for GRCh37 and GRCh38 have been added. The snapshot date you want is 2016-03-09.

> hub <- AnnotationHub()
snapshotDate(): 2016-03-09
> dbSNP <- query(hub, c("dbSNP", "VCF"))
> dbSNP
AnnotationHub with 8 records
# snapshotDate(): 2016-03-09 
# $dataprovider: dbSNP
# $species: Homo sapiens
# $rdataclass: VcfFile
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,
#   sourcetype 
# retrieve records with, e.g., 'object[["AH50420"]]' 

            title                                         
  AH50420 | clinvar_20160203.vcf.gz                       
  AH50421 | clinvar_20160203_papu.vcf.gz                  
  AH50422 | common_and_clinical_20160203.vcf.gz           
  AH50423 | common_no_known_medical_impact_20160203.vcf.gz
  AH50424 | clinvar_20160203.vcf.gz                       
  AH50425 | clinvar_20160203_papu.vcf.gz                  
  AH50426 | common_and_clinical_20160203.vcf.gz           
  AH50427 | common_no_known_medical_impact_20160203.vcf.gz

> vcf <- dbSNP[[2]]
downloading from ‘https://annotationhub.bioconductor.org/fetch/57152’
    ‘https://annotationhub.bioconductor.org/fetch/57153’
retrieving 2 resources
  |======================================================================| 100%
  |======================================================================| 100%

> vcf
class: VcfFile 
path: /home/vobencha/.AnnotationHub/57152
index: /home/vobencha/.AnnotationHub/57153
isOpen: FALSE 
yieldSize: NA 

This is what you'll see when trying to access a resource that's no longer available:

> hub[["AH47030"]]
Error in .AnnotationHub_get1(x[idx]) : 
  no records found for the given index
> hub["AH47030"]
AnnotationHub with 0 records
# snapshotDate(): 2016-03-09 

Maybe we could make that first message even more informative by adding the date removed (if any).
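In the meantime, a script can check up front whether the IDs it depends on are still in the snapshot. This is only a sketch: `missing_ids` is a hypothetical helper, not an AnnotationHub function, and the `available` vector is hard-coded so the example runs offline. In a live session it would presumably be `names(hub)`.

```r
# Hypothetical helper: report which requested IDs are absent from a
# character vector of IDs known to the current snapshot.
missing_ids <- function(wanted, available) {
    setdiff(wanted, available)
}

# Fixed values stand in for names(hub) (an assumption for illustration).
available <- c("AH50420", "AH50421", "AH50422", "AH50423")

gone <- missing_ids(c("AH47030", "AH50421"), available)
if (length(gone) > 0)
    message("not in this snapshot: ", paste(gone, collapse = ", "))
```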

Valerie

 


Still a problem for me. 

hub = AnnotationHub()
snapshotDate(): 2016-03-09

> gse = query(hub, "GSE62944")[[1]]

Error: failed to load 'AnnotationHub' resource
  name: AH28855
  title: RNA-Sequencing and clinical data for 7706 tumor samples from The Cancer Genome Atlas
  reason: 1 resources failed to download
In addition: Warning message:
download failed
  hub path: ‘https://annotationhub.bioconductor.org/fetch/34295’
  cache path: ‘C:/LispHome/AppData/.AnnotationHub/34295’
  reason: Forbidden (HTTP 403). 

Any tips?


Thanks for reporting this. Just to clarify, your original question was about accessing clinvar files; these have been re-generated and are now available. Now you're asking about accessing the GSE62944 TCGA data, which is a different resource unrelated to the original problem. It's helpful (for me) to keep problems with different resources in separate posts.

The GSE62944 data is in an S3 bucket. For some reason access was restricted (not public). I have fixed that and the resource is now available:

hub = AnnotationHub()
snapshotDate(): 2016-03-09
> gse = query(hub, "GSE62944")

> gse[[1]]
downloading from ‘https://annotationhub.bioconductor.org/fetch/34295’
retrieving 1 resource
  |======================================================================| 100%
ExpressionSet (storageMode: lockedEnvironment)
assayData: 23368 features, 7706 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: TCGA-02-0047-01A-01R-1849-01
    TCGA-02-0055-01A-01R-1849-01 ... TCGA-ZG-A8QZ-01A-11R-A37L-07 (7706
    total)
  varLabels: bcr_patient_barcode bcr_patient_uuid ... CancerType (421
    total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  

 

 

In the future please open a new issue if the data are not the same type as referenced in the original.

Valerie

 


Thank you so much. Works now
