Installing an annotation package locally
KABILAN • 0
@e750450e
India

I have an annotation package that I am not able to install locally. Since publishing the package takes a long time, I want to use it locally in the meantime. The annotation files have already been uploaded to cloud storage. I am getting the error below:

error details

and this is the R script of the package

datacache <- new.env(hash=TRUE, parent=emptyenv())

org.Hbacteriophora.eg <- function() showQCData("org.Hbacteriophora.eg", datacache)
org.Hbacteriophora.eg_dbconn <- function() dbconn(datacache)
org.Hbacteriophora.eg_dbfile <- function() dbfile(datacache)
org.Hbacteriophora.eg_dbschema <- function(file="", show.indices=FALSE) dbschema(datacache, file=file, show.indices=show.indices)
org.Hbacteriophora.eg_dbInfo <- function() dbInfo(datacache)

org.Hbacteriophora.egORGANISM <- "Heterorhabditis bacteriophora"

.onLoad <- function(libname, pkgname) {
    ## Load AnnotationHub
    hub <- AnnotationHub:::AnnotationHub()

    ## Query for the specific database (organism & sqlite)
    query_result <- AnnotationHub:::query(hub, c("Heterorhabditis", "bacteriophora", "sqlite"))

    ## Assuming the SQLite file is the first query result
    sqliteFile <- query_result[[1]]

    ## Assign the database file path and connection
    dbfile <- sqliteFile$path
    assign("dbfile", dbfile, envir=datacache)
    dbconn <- dbFileConnect(dbfile)
    assign("dbconn", dbconn, envir=datacache)

    ## Create the OrgDb object from the AnnotationHub resource
    sPkgname <- sub(".db$", "", pkgname)
    db <- loadDb(dbfile, packageName = pkgname)
    dbNewname <- AnnotationDbi:::dbObjectName(pkgname, "OrgDb")
    ns <- asNamespace(pkgname)
    assign(dbNewname, db, envir=ns)
    namespaceExport(ns, dbNewname)

    packageStartupMessage(AnnotationDbi:::annoStartupMessages("org.Hbacteriophora.eg.db"))
}

.onUnload <- function(libpath) {
    dbFileDisconnect(org.Hbacteriophora.eg_dbconn())
}

Please help me to solve this error.

AnnotationDbi AnnotationHub Annotation AnnotationForge • 715 views
@james-w-macdonald-5106
United States

That's not how to save an OrgDb you got from AnnotationHub. The usual way of doing that is to download it once, after which it is cached and you can just reload it later. Something like

library(AnnotationHub)
hub <- AnnotationHub()
z <- hub[["AH119196"]]

And then later doing that again will just load the cached version. But if you really think you need to save it, you can just use saveDb

saveDb(z, "afilename.sqlite")

And then later you can do

library(AnnotationDbi)
z <- loadDb("afilename.sqlite")

And now you can use it as usual.


Thank you for your suggestion. But I am getting the same error message while running this script.

[Screenshot of the error message]


OK, that appears to be a busted resource. Lori will have to look into it.


If that's the case, the resource came from @KABILAN, so they'll have to regenerate it.


Does that mean I have to upload all the files again?


Yes, it appears to be an issue with the file.


No, you have to fix the package first. You are apparently trying to load a SQLite DB as if it were a serialized R object. That's not how OrgDb packages work. They are essentially a SQLite DB and some wrapper functions that perform SQL queries in order to extract data. Your code is trying to use load to bring the SQLite DB into memory, which isn't a thing. In other words, consider this:

> load("c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
Error in load("c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942") : 
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file '28a862ef49b5_125942' has magic number 'SQLit'
  Use of save versions prior to 2 is deprecated

That file (28a862ef49b5_125942) is an SQLite DB in my hubCache, and is meant to be queried using SQL queries:

> library(RSQLite)
> con2 <- dbConnect(SQLite(), "c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
> dbListTables(con2)
 [1] "accessions"   "alias"        "chromosomes"  "entrez_genes" "gene_info"    "genes"        "go"          
 [8] "go_all"       "go_bp"        "go_bp_all"    "go_cc"        "go_cc_all"    "go_mf"        "go_mf_all"   
[15] "map_counts"   "map_metadata" "metadata"     "refseq"

And if I do the same with an existing OrgDb, you can see it's very similar:

> library(org.Hs.eg.db)
> con <- org.Hs.eg_dbconn()
> dbListTables(con)
 [1] "accessions"            "alias"                 "chrlengths"            "chromosome_locations" 
 [5] "chromosomes"           "cytogenetic_locations" "ec"                    "ensembl"              
 [9] "ensembl2ncbi"          "ensembl_prot"          "ensembl_trans"         "gene_info"            
[13] "genes"                 "genetype"              "go"                    "go_all"               
[17] "go_bp"                 "go_bp_all"             "go_cc"                 "go_cc_all"            
[21] "go_mf"                 "go_mf_all"             "kegg"                  "map_counts"           
[25] "map_metadata"          "metadata"              "ncbi2ensembl"          "omim"                 
[29] "pfam"                  "prosite"               "pubmed"                "refseq"               
[33] "sqlite_stat1"          "sqlite_stat4"          "ucsc"                  "uniprot"
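The "magic number" point above can be checked without any extra packages. The following is only a minimal sketch in base R: `is_sqlite_file` is a hypothetical helper (not part of AnnotationHub or AnnotationDbi), and the two temporary files are toy stand-ins rather than real annotation data. It relies on the fact that an SQLite 3 file begins with the 16 bytes `SQLite format 3` followed by a NUL, which is why `load()` reported the magic number 'SQLit'.

```r
# Hypothetical helper: TRUE if the file starts with the SQLite 3 header.
is_sqlite_file <- function(path) {
  header <- readBin(path, what = "raw", n = 16L)
  expected <- c(charToRaw("SQLite format 3"), as.raw(0L))
  length(header) == 16L && identical(header, expected)
}

# Toy stand-in for a cached SQLite file: just the header plus padding bytes.
fake_sqlite <- tempfile(fileext = ".sqlite")
writeBin(c(charToRaw("SQLite format 3"), as.raw(0L), as.raw(rep(0L, 84))),
         fake_sqlite)

# A genuine serialized R object, the kind of file load() expects.
rdata_file <- tempfile(fileext = ".RData")
x <- 1:3
save(x, file = rdata_file)

is_sqlite_file(fake_sqlite)  # TRUE
is_sqlite_file(rdata_file)   # FALSE
```

A file that passes this check should be opened with dbConnect() or loadDb(), not load().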

I tried to read the vignette for HubPub, but it is rather impenetrable IMO. The gist appears to be that you generate a regular package, put the SQLite DB file in the cloud or another publicly available place, and add a bit of extra stuff to the package to say where the DB is. Ideally you would be able to get another OrgDb from AnnotationHub and just emulate what's in it, but I don't know how that is done.

KABILAN • 0
@e750450e
India

Thank you for your suggestions. I tried what @James W. MacDonald suggested and got the same result. However, I couldn't get the complete details present in the annotation package; I am getting results for only a few genes. Maybe the problem is with my package. I will upload the files to cloud storage again and retry.

Once again thank you for all your suggestions.


I assume you are generating this package using makeOrgPackageFromNCBI? If so, there are only 24 genes at NCBI for that species, so it's not likely worth the effort. But maybe you have a different source. You could hypothetically use the GI number instead, but in that case you would have to use makeOrgPackage with a set of data.frames that link the GI to other IDs.

There may be ways to download the data you need directly, but I would probably use EUtils directly, although that isn't a simple thing either.
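To make "a set of data.frames that link the GI to other IDs" concrete, here is a rough base R sketch of what such tables might look like. Every identifier and value below is a made-up placeholder, and `check_org_table` is a hypothetical helper. The constraints it checks (first column named GID, no duplicated rows) are my understanding of what makeOrgPackage expects, so verify them against the AnnotationForge documentation before relying on this.

```r
# Made-up placeholder data keyed on a GI-like identifier in the GID column.
gene_info <- data.frame(
  GID      = c("1001", "1002", "1003"),
  SYMBOL   = c("geneA", "geneB", "geneC"),
  GENENAME = c("placeholder A", "placeholder B", "placeholder C"),
  stringsAsFactors = FALSE
)
refseq <- data.frame(
  GID    = c("1001", "1001", "1002"),
  REFSEQ = c("XX_0001", "XX_0002", "XX_0003"),
  stringsAsFactors = FALSE
)

# Hypothetical sanity check (assumed constraints, see lead-in):
# first column is GID, no missing keys, no duplicated rows.
check_org_table <- function(df) {
  identical(names(df)[1], "GID") &&
    !anyNA(df$GID) &&
    anyDuplicated(df) == 0L
}

tables <- list(gene_info = gene_info, refseq = refseq)
ok <- vapply(tables, check_org_table, logical(1))
ok

# Tables that pass could then be supplied as named arguments to
# AnnotationForge::makeOrgPackage(), along with the package metadata.
```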


Oh, right:

> library(RSQLite)
> con2 <- dbConnect(SQLite(), "c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
> dbListTables(con2)
 [1] "accessions"   "alias"       
 [3] "chromosomes"  "entrez_genes"
 [5] "gene_info"    "genes"       
 [7] "go"           "go_all"      
 [9] "go_bp"        "go_bp_all"   
[11] "go_cc"        "go_cc_all"   
[13] "go_mf"        "go_mf_all"   
[15] "map_counts"   "map_metadata"
[17] "metadata"     "refseq"      
> dbGetQuery(con2, "select count(*) from genes;")
  count(*)
1       12

Even worse than I thought. Annotating 12 genes isn't going to be useful. You might try the nuccore table and use GI instead, but it's up to you if you want to spend the time.
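The same kind of check can be run over every table at once to see how much data a DB actually holds before building a package around it. This is only a sketch: it assumes the DBI and RSQLite packages are installed, `count_rows` is a hypothetical helper, and the in-memory database with toy tables stands in for the real cached file.

```r
library(DBI)
library(RSQLite)

# In-memory toy database standing in for the real OrgDb SQLite file.
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "genes",
             data.frame(id = 1:3, gene_id = c("101", "102", "103")))
dbWriteTable(con, "alias",
             data.frame(id = c(1L, 1L, 2L, 3L),
                        alias_symbol = c("a1", "a2", "b1", "c1")))

# Hypothetical helper: number of rows in each table, named by table.
count_rows <- function(con) {
  tables <- dbListTables(con)
  vapply(tables, function(tbl) {
    sql <- paste0("SELECT COUNT(*) AS n FROM ", dbQuoteIdentifier(con, tbl))
    as.numeric(dbGetQuery(con, sql)$n)
  }, numeric(1))
}

counts <- count_rows(con)
counts

dbDisconnect(con)
```

For a real resource, replace ":memory:" with the path to the cached SQLite file and skip the dbWriteTable() calls.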
