Question: Annotation Hub object no longer present
11 months ago, anamabrantesc wrote:

I used AnnotationHub on August 13, 2016, with a snapshot that was recent at the time. I used the object AH48041, an OrgDb for Ricinus communis from InParanoid. Today I'm writing the Materials and Methods for a paper on that analysis, but AnnotationHub has been updated and this object no longer exists. I'm a newcomer to Bioconductor; I didn't download the object and I don't know how to recover it. All of my analysis was based on it. I still have the results, but if the original object is no longer available the analysis is not reproducible. How can I use an old version of AnnotationHub? How can I obtain this object? Please help, it is really important. Ana Coelho, anamabrantesc@gmail.com

 

11 months ago, Valerie Obenchain (United States) wrote:

Hi Ana,

The OrgDb packages should be re-built before each release (every 6 months) and represent the most current data for that period. Unfortunately, the re-build did not happen for the OrgDb packages in AnnotationHub for the Fall 2015 and Spring 2016 releases which means the packages grew stale. For the October 2016 release, the old OrgDb packages were retired with the past Bioconductor versions and new OrgDb packages were built (not all organisms but many).

To show you what I mean, we'll look at the metadata for AnnotationHub. When you invoke AnnotationHub(), the metadata is downloaded to your local system as an SQLite database. All records are present in the database, but only the ones appropriate for your version of Bioconductor are exposed in the 'hub' object (dates and other factors are used as filters).

Here we look at the metadata for record AH48041 and see it was added in 2015.

> library(AnnotationHub)
> library(DBI)
> hub <- AnnotationHub()
snapshotDate(): 2016-11-15
> con <- dbconn(hub)
> dbGetQuery(con, "select ah_id,title,rdatadateadded from resources where ah_id='AH48041'")
    ah_id                          title rdatadateadded
1 AH48041 org.Ricinus_communis.eg.sqlite     2015-07-27

The fact that people using Bioconductor 3.4/3.5 can no longer get an OrgDb package built in 2015 is a good thing: we don't want current users picking up stale annotations. If you need an older version of an OrgDb, you need to use the corresponding R/Bioconductor version. If you did your analysis in August 2016 you were using Bioconductor 3.3 (possibly 3.2 if you were a release behind). To reproduce results for your paper you want not only the same AnnotationHub you originally used but the other R/Bioconductor packages as well; to get these you need to use the same R/Bioconductor version you used at the time of your analysis.

All Bioconductor releases are listed here:
http://www.bioconductor.org/about/release-announcements/

You could use an AMI for version 3.3 if you don't want to install R/Bioconductor yourself:
http://www.bioconductor.org/help/bioconductor-cloud-ami/#ami_ids

The biocLite() available with Bioconductor 3.3 will install the appropriate version of AnnotationHub, which will include the old OrgDb package, so you can get the OrgDb you need for your paper. If you have problems installing Bioconductor 3.3, feel free to contact me offline at valerie.obenchain@roswellpark.org and I can help you through it.

Valerie

11 months ago, anamabrantesc wrote:

Dear Valerie, thank you SO much for giving me hope that I will recover this object. But I still couldn't figure out how to do it. I'm still on R 3.2.

#today, Dec 09 2016
> biocValid()

* sessionInfo()

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] BiocInstaller_1.20.3 AnnotationHub_2.2.5  BSgenome_1.38.0     
 [4] rtracklayer_1.30.4   Biostrings_2.38.4    XVector_0.10.0      
 [7] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3   IRanges_2.4.8       
[10] S4Vectors_0.8.11     BiocGenerics_0.16.1  RSQLite_1.0.0       
[13] DBI_0.5-1           

update with biocLite()

Error: 20 package(s) out of date

I tried to do the following :

> hub = AnnotationHub()
snapshotDate(): 2016-12-08

> possibleDates(hub)  

# I got a list of 63 dates

> snapshotDate(hub) = "2016-07-20"

# I tried a few other dates, but I think that should be the right one

> query(hub, "communis")
AnnotationHub with 1 record
# snapshotDate(): 2016-07-20
# names(): AH10581
# $dataprovider: Inparanoid8
# $species: Ricinus communis

# That's the only other record available for Ricinus communis, and it's not the one I want (AH48041 is the right one). There were 2 records originally; the other one was the OrgDb.

So I could not find the object by setting AnnotationHub to a different date, and I still have the old versions of R/Bioconductor that I hoped would do the trick. Please help me!!!

A comment: if you work on human data that gets daily updates, that is all very well, but this object is the only resource I have found anywhere for this species that maps ENTREZID, Symbol and GO. It is not nice when it becomes unavailable... Thank you in advance, Ana.


Just fyi, when you respond to an answer or comment use 'ADD COMMENT' instead of posting a new answer.

I am able to get this resource with R 3.2 and a snapshot date of "2016-07-20".

> library(AnnotationHub)
> hub <- AnnotationHub()
snapshotDate(): 2016-12-08

Change the date:
> snapshotDate(hub) <- "2016-07-20"
> hub
AnnotationHub with 36164 records
# snapshotDate(): 2016-07-20 
# $dataprovider: BroadInstitute, UCSC, Ensembl, ftp://ftp.ncbi.nlm.nih.gov/g...
# $species: Homo sapiens, Mus musculus, Bos taurus, Pan troglodytes, Danio r...
# $rdataclass: GRanges, BigWigFile, FaFile, ChainFile, OrgDb, TwoBitFile, In...
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,

...

Extract AH48041: 

> hub["AH48041"]
AnnotationHub with 1 record
# snapshotDate(): 2016-07-20 
# names(): AH48041
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Ricinus communis
# $rdataclass: OrgDb
# $title: org.Ricinus_communis.eg.sqlite
# $description: NCBI gene ID based annotations about Ricinus_communis
# $taxonomyid: 3988
# $genome: NCBI genomes
# $sourcetype: NCBI/UniProt
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.uniprot.org/p...
# $sourcelastmodifieddate: NA
# $sourcesize: NA
# $tags: NCBI, Gene, Annotation 
# retrieve record with 'object[["AH48041"]]' 
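Once the record resolves like this, one way to avoid losing it again is to keep a local copy alongside the analysis. The following is only a minimal sketch, not part of the original exchange: `fetch_pinned_record` and the output file names are hypothetical, and it assumes that `AnnotationDbi::saveDb()` accepts the OrgDb returned by the hub and that the record is still exposed to the R/Bioconductor version in use.

```r
# Sketch only: pin the snapshot date, fetch one record, keep a local copy.
# `fetch_pinned_record` and the default file name are hypothetical.
fetch_pinned_record <- function(id, snapshot, file = paste0(id, ".sqlite")) {
    hub <- AnnotationHub::AnnotationHub()
    # Namespace-qualified form of: snapshotDate(hub) <- snapshot
    hub <- AnnotationHub::`snapshotDate<-`(hub, value = snapshot)
    # Fail early if the record is filtered out for this snapshot/version
    if (!id %in% names(hub))
        stop(id, " is not visible at snapshot ", snapshot)
    obj <- hub[[id]]
    # Assumption: saveDb() can write this OrgDb to an SQLite file
    AnnotationDbi::saveDb(obj, file = file)
    invisible(obj)
}

# Example call (needs network access and a matching Bioconductor version):
# orgdb <- fetch_pinned_record("AH48041", "2016-07-20")
# writeLines(capture.output(sessionInfo()), "sessionInfo.txt")
```

A file saved this way could later be reopened with `AnnotationDbi::loadDb()`, and the recorded `sessionInfo()` output documents the package versions for the methods section.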

> sessionInfo()
R version 3.2.5 Patched (2016-05-05 r71773)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora 24 (Workstation Edition)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] AnnotationHub_2.2.5 BiocGenerics_0.16.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8                  IRanges_2.4.8               
 [3] digest_0.6.10                mime_0.5                    
 [5] R6_2.2.0                     xtable_1.8-2                
 [7] DBI_0.5-1                    stats4_3.2.5                
 [9] RSQLite_1.1-1                BiocInstaller_1.20.3        
[11] httr_1.2.1                   curl_2.3                    
[13] S4Vectors_0.8.11             Biobase_2.30.0              
[15] shiny_0.14.2                 httpuv_1.3.3                
[17] AnnotationDbi_1.32.3         memoise_1.0.0               
[19] htmltools_0.3.5              interactiveDisplayBase_1.8.0

Reply written 11 months ago by Valerie Obenchain

Thank you so much Valerie, I managed to find the object after changing the snapshot date:

> library(AnnotationHub)
> ah = AnnotationHub()
> snapshotDate(ah) = "2016-07-20"
> ah["AH48041"]
> obj = ah[["AH48041"]]

OK, it worked, problem solved, accolades to you!   Ana

Reply written 11 months ago by anamabrantesc