Question

Installing an annotation package locally

0

KABILAN • 0

@e750450e

Last seen 22 days ago

India

I have an annotation package that I am unable to install locally. Since publishing the package takes a long time, I want to use it locally in the meantime. The annotation files have already been uploaded to cloud storage. I am getting the error below:

[screenshot: error details]

And this is the R script of the package:

datacache <- new.env(hash=TRUE, parent=emptyenv())

org.Hbacteriophora.eg <- function() showQCData("org.Hbacteriophora.eg", datacache)
org.Hbacteriophora.eg_dbconn <- function() dbconn(datacache)
org.Hbacteriophora.eg_dbfile <- function() dbfile(datacache)
org.Hbacteriophora.eg_dbschema <- function(file="", show.indices=FALSE) dbschema(datacache, file=file, show.indices=show.indices)
org.Hbacteriophora.eg_dbInfo <- function() dbInfo(datacache)

org.Hbacteriophora.egORGANISM <- "Heterorhabditis bacteriophora"

.onLoad <- function(libname, pkgname) {
    ## Load AnnotationHub
    hub <- AnnotationHub:::AnnotationHub()

    ## Query for the specific database (organism & sqlite)
    query_result <- AnnotationHub:::query(hub, c("Heterorhabditis", "bacteriophora", "sqlite"))

    ## Assuming the SQLite file is the first query result
    sqliteFile <- query_result[[1]]

    ## Assign the database file path and connection
    dbfile <- sqliteFile$path
    assign("dbfile", dbfile, envir=datacache)
    dbconn <- dbFileConnect(dbfile)
    assign("dbconn", dbconn, envir=datacache)

    ## Create the OrgDb object from the AnnotationHub resource
    sPkgname <- sub(".db$", "", pkgname)
    db <- loadDb(dbfile, packageName = pkgname)
    dbNewname <- AnnotationDbi:::dbObjectName(pkgname, "OrgDb")
    ns <- asNamespace(pkgname)
    assign(dbNewname, db, envir=ns)
    namespaceExport(ns, dbNewname)

    packageStartupMessage(AnnotationDbi:::annoStartupMessages("org.Hbacteriophora.eg.db"))
}

.onUnload <- function(libpath) {
    dbFileDisconnect(org.Hbacteriophora.eg_dbconn())
}

Please help me to solve this error.

AnnotationDbi AnnotationHub Annotation AnnotationForge • 1.6k views

ADD COMMENT • link 5 weeks ago KABILAN • 0

score 1 · Answer 1 · 2024-10-21

1

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 1 day ago

United States

That's not how to save an OrgDb you got from AnnotationHub. The usual way of doing that is to download it the one time, and then it will be cached and you can just reload it later. Something like

hub <- AnnotationHub()
z <- hub[["AH119196"]]

And then later doing that again will just load the cached version. But if you really think you need to save it, you can just use saveDb

saveDb(z, "afilename.sqlite")

And then later you can do

library(AnnotationDbi)
z <- loadDb("afilename.sqlite")

And now you can use it as usual.
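As a rough illustration of "use it as usual", here is a minimal sketch that assumes the object returned by loadDb() supports the standard AnnotationDbi accessors keytypes(), keys(), columns() and select(). The helper name peek_orgdb is made up for this example and is not part of any package.

```r
# Hypothetical helper (not part of AnnotationDbi): show a few rows of an OrgDb.
peek_orgdb <- function(db, n = 5L) {
  kt   <- AnnotationDbi::keytypes(db)[1]                 # first available key type
  ids  <- head(AnnotationDbi::keys(db, keytype = kt), n) # first n keys of that type
  cols <- head(AnnotationDbi::columns(db), 3L)           # first few available columns
  AnnotationDbi::select(db, keys = ids, columns = cols, keytype = kt)
}

# Usage, assuming "afilename.sqlite" was written by saveDb() as above:
# z <- AnnotationDbi::loadDb("afilename.sqlite")
# peek_orgdb(z)
```

If this returns only a handful of rows, the database itself is probably sparse rather than the loading step being wrong.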

ADD COMMENT • link 5 weeks ago James W. MacDonald 67k

0

Thank you for your suggestion. But I am getting the same error message while running this script.

[screenshot: error message]

ADD REPLY • link 5 weeks ago KABILAN • 0

0

OK, that appears to be a busted resource. Lori will have to look into it.

ADD REPLY • link 5 weeks ago James W. MacDonald 67k

1

If that's the case, the resource came from @KABILAN, so they'll have to regenerate it.

ADD REPLY • link 5 weeks ago shepherl 4.1k

0

Does that mean I have to upload all the files again?

ADD REPLY • link 5 weeks ago KABILAN • 0

1

Yes. It appears to be an issue with the file.

ADD REPLY • link 5 weeks ago shepherl 4.1k

1

No, you have to fix the package first. You are apparently trying to load a SQLite DB as if it were a serialized R object. That's not how OrgDb packages work. They are essentially a SQLite DB and some wrapper functions that perform SQL queries in order to extract data. Your code is trying to use load to bring the SQLite DB into memory, which isn't a thing. In other words, consider this:

> load("c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
Error in load("c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942") : 
  bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file '28a862ef49b5_125942' has magic number 'SQLit'
  Use of save versions prior to 2 is deprecated

That file (28a862ef49b5_125942) is an SQLite DB in my hubCache, and is meant to be queried using SQL queries:

> library(RSQLite)
> con2 <- dbConnect(SQLite(), "c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
> dbListTables(con2)
 [1] "accessions"   "alias"        "chromosomes"  "entrez_genes" "gene_info"    "genes"        "go"          
 [8] "go_all"       "go_bp"        "go_bp_all"    "go_cc"        "go_cc_all"    "go_mf"        "go_mf_all"   
[15] "map_counts"   "map_metadata" "metadata"     "refseq"

And if I do the same with an existing OrgDb, you can see it's very similar:

> library(org.Hs.eg.db)
> con <- org.Hs.eg_dbconn()
> dbListTables(con)
 [1] "accessions"            "alias"                 "chrlengths"            "chromosome_locations" 
 [5] "chromosomes"           "cytogenetic_locations" "ec"                    "ensembl"              
 [9] "ensembl2ncbi"          "ensembl_prot"          "ensembl_trans"         "gene_info"            
[13] "genes"                 "genetype"              "go"                    "go_all"               
[17] "go_bp"                 "go_bp_all"             "go_cc"                 "go_cc_all"            
[21] "go_mf"                 "go_mf_all"             "kegg"                  "map_counts"           
[25] "map_metadata"          "metadata"              "ncbi2ensembl"          "omim"                 
[29] "pfam"                  "prosite"               "pubmed"                "refseq"               
[33] "sqlite_stat1"          "sqlite_stat4"          "ucsc"                  "uniprot"
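A quick way to check which kind of file a cached hub resource is, before deciding between load() and an SQL connection, is to look at its first bytes: an SQLite database starts with the 16-byte header "SQLite format 3" followed by a NUL byte. The sketch below uses only base R; the function name is_sqlite_file is made up for this example.

```r
# Hypothetical helper: TRUE if the file begins with the SQLite 3 header.
is_sqlite_file <- function(path) {
  header   <- readBin(path, what = "raw", n = 16L)
  expected <- c(charToRaw("SQLite format 3"), as.raw(0L))
  identical(header, expected)
}

# Demonstration on two throwaway files (neither is a real annotation DB)
fake_db <- tempfile()
writeBin(c(charToRaw("SQLite format 3"), as.raw(0L), as.raw(1:4)), fake_db)
rds <- tempfile(fileext = ".rds")
saveRDS(1:3, rds)

is_sqlite_file(fake_db)  # file with the SQLite header
is_sqlite_file(rds)      # serialized R object, no SQLite header
```

A file that passes this check should be opened with dbConnect() or loadDb(), not load().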

I tried to read the vignette for HubPub, but it is rather impenetrable IMO. The gist appears to be that you generate a regular package, put the SQLite DB file in the cloud or another publicly available place, and add a bit of extra stuff to the package to say where the DB is? Ideally you would be able to get another OrgDb from AnnotationHub and just emulate what's in it, but I don't know how that is done?

ADD REPLY • link 5 weeks ago James W. MacDonald 67k

score 0 · Answer 2 · 2024-10-22

0

KABILAN • 0

@e750450e

Last seen 22 days ago

India

Thank you for your suggestions. I tried what @James W. MacDonald suggested and got the same result. However, I could not retrieve the complete annotation from the package; I get results for only a few genes. The problem may be with my package. I will upload the files to cloud storage again and retry.

Once again thank you for all your suggestions.

ADD COMMENT • link 5 weeks ago KABILAN • 0

0

I assume you are generating this package using makeOrgPackageFromNCBI? If so, there are only 24 genes at NCBI for that species, so it's not likely worth the effort. But maybe you have a different source. You could hypothetically use the GI number instead, but in that case you would have to use makeOrgPackage with a set of data.frames that link the GI to other IDs.

There may be ways to download the data you need directly, but I would probably use EUtils, although that isn't a simple thing either.

ADD REPLY • link 5 weeks ago James W. MacDonald 67k

0

Oh, right:

> library(RSQLite)
> con2 <- dbConnect(SQLite(), "c:/Users/jmacdon/AppData/Local/R/cache/R/AnnotationHub/28a862ef49b5_125942")
> dbListTables(con2)
 [1] "accessions"   "alias"       
 [3] "chromosomes"  "entrez_genes"
 [5] "gene_info"    "genes"       
 [7] "go"           "go_all"      
 [9] "go_bp"        "go_bp_all"   
[11] "go_cc"        "go_cc_all"   
[13] "go_mf"        "go_mf_all"   
[15] "map_counts"   "map_metadata"
[17] "metadata"     "refseq"      
> dbGetQuery(con2, "select count(*) from genes;")
  count(*)
1       12

Even worse than I thought. Annotating 12 genes isn't going to be useful. You might try the nuccore table and use GI instead, but it's up to you if you want to spend the time.

ADD REPLY • link 5 weeks ago James W. MacDonald 67k

0

Yes, you are right. I used makeOrgPackageFromNCBI to create this package. I have my own annotation file in Excel format for Heterorhabditis bacteriophora, but I don't know how to convert it into an annotation package that I can use for downstream analyses such as GO enrichment and gene set enrichment analysis with packages like clusterProfiler and gProfiler. That is why I tried makeOrgPackageFromNCBI. I now understand that it will not help with my analysis. Can you suggest how to build an annotation package from our own annotation file in Excel format?

ADD REPLY • link 5 weeks ago KABILAN • 0

1

https://bioconductor.org/packages/release/bioc/vignettes/AnnotationForge/inst/doc/MakingNewOrganismPackages.html
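For orientation, here is a minimal sketch of the makeOrgPackage route described in that vignette. Several things are assumptions rather than facts about your data: the input column names (gene_id, symbol, description, go_id), the toy gene IDs, the "IEA" evidence code, the maintainer string, and the tax_id placeholder, which must be replaced with the real NCBI taxonomy ID. In practice annot would come from your spreadsheet, for example via readxl::read_excel(). The helper names make_orgdb_tables and build_orgdb are made up for this example.

```r
# Toy stand-in for the spreadsheet contents (IDs and descriptions are invented).
annot <- data.frame(
  gene_id     = c("gene1", "gene1", "gene2"),
  symbol      = c("symA", "symA", "symB"),
  description = c("protein A", "protein A", "protein B"),
  go_id       = c("GO:0005524", "GO:0016301", "GO:0005524"),
  stringsAsFactors = FALSE
)

# Reshape into the tables makeOrgPackage expects: first column named GID,
# no duplicated rows, and a GO table with exactly GID, GO, EVIDENCE.
make_orgdb_tables <- function(annot) {
  gene_info <- unique(data.frame(
    GID = annot$gene_id, SYMBOL = annot$symbol, GENENAME = annot$description,
    stringsAsFactors = FALSE
  ))
  go <- unique(data.frame(
    GID = annot$gene_id, GO = annot$go_id,
    EVIDENCE = "IEA",  # assumed evidence code; use real codes if you have them
    stringsAsFactors = FALSE
  ))
  list(gene_info = gene_info, go = go[!is.na(go$GO), ])
}

# Builds the package source directory; requires AnnotationForge and GO.db.
build_orgdb <- function(tables, out_dir = tempdir()) {
  AnnotationForge::makeOrgPackage(
    gene_info  = tables$gene_info,
    go         = tables$go,
    version    = "0.1",
    maintainer = "Your Name <you@example.org>",  # placeholder
    author     = "Your Name <you@example.org>",  # placeholder
    outputDir  = out_dir,
    tax_id     = "0",                            # placeholder: real NCBI taxonomy ID
    genus      = "Heterorhabditis",
    species    = "bacteriophora",
    goTable    = "go"
  )
}

tables <- make_orgdb_tables(annot)
# pkg_dir <- build_orgdb(tables)
# install.packages(pkg_dir, repos = NULL, type = "source")
```

Once installed, the resulting package can be passed as the OrgDb argument to tools such as clusterProfiler, with GID as the key type.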