When you make a TxDb that way, it exists only in your R workspace and will disappear when you close R. So if you only need it once, you can make it, use the TxDb object you created to do whatever you need, and then when you close R it's gone.
If you want it to be persistent, you have to save it somehow and then load it back into your workspace when you need it again. You could hypothetically make a TxDb package, but that is likely more work than it's worth. You can instead just save the object and load it back in directly. One way to do that is what Lori Shepherd suggested.
An alternative to what Lori suggested is saveDb and loadDb from AnnotationDbi. An example, using an existing TxDb:
> library(AnnotationDbi)
> library(AnnotationHub)
> hub <- AnnotationHub()
## some random species I know will have a TxDb
> query(hub, c("mulatta","txdb"))
AnnotationHub with 15 records
# snapshotDate(): 2020-04-27
# $dataprovider: UCSC
# $species: Macaca mulatta
# $rdataclass: TxDb
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH52261"]]'
title
AH52261 | TxDb.Mmulatta.UCSC.rheMac3.refGene.sqlite
AH52262 | TxDb.Mmulatta.UCSC.rheMac8.refGene.sqlite
AH57989 | TxDb.Mmulatta.UCSC.rheMac3.refGene.sqlite
AH57990 | TxDb.Mmulatta.UCSC.rheMac8.refGene.sqlite
AH61795 | TxDb.Mmulatta.UCSC.rheMac3.refGene.sqlite
... ...
AH75759 | TxDb.Mmulatta.UCSC.rheMac3.refGene.sqlite
AH75760 | TxDb.Mmulatta.UCSC.rheMac8.refGene.sqlite
AH75761 | TxDb.Mmulatta.UCSC.rheMac10.refGene.sqlite
AH79593 | TxDb.Mmulatta.UCSC.rheMac3.refGene.sqlite
AH79594 | TxDb.Mmulatta.UCSC.rheMac8.refGene.sqlite
## get the TxDb
> z <- hub[["AH79594"]]
downloading 1 resources
retrieving 1 resource
|======================================================================| 100%
loading from cache
Loading required package: GenomicFeatures
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
> z
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: rheMac8
# Organism: Macaca mulatta
# Taxonomy ID: 9544
# UCSC Table: refGene
# UCSC Track: RefSeq Genes
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Entrez Gene ID
# Full dataset: yes
# miRBase build ID: NA
# Nb of transcripts: 6504
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2020-04-28 14:21:43 +0000 (Tue, 28 Apr 2020)
# GenomicFeatures version at creation time: 1.39.7
# RSQLite version at creation time: 2.2.0
# DBSCHEMAVERSION: 1.2
## save it
> saveDb(z, "whatevs")
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: rheMac8
# Organism: Macaca mulatta
# Taxonomy ID: 9544
# UCSC Table: refGene
# UCSC Track: RefSeq Genes
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Entrez Gene ID
# Full dataset: yes
# miRBase build ID: NA
# Nb of transcripts: 6504
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2020-04-28 14:21:43 +0000 (Tue, 28 Apr 2020)
# GenomicFeatures version at creation time: 1.39.7
# RSQLite version at creation time: 2.2.0
# DBSCHEMAVERSION: 1.2
## Get rid of it
> rm(z)
> z
Error: object 'z' not found
## and load it back in.
> z <- loadDb("whatevs")
> z
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: rheMac8
# Organism: Macaca mulatta
# Taxonomy ID: 9544
# UCSC Table: refGene
# UCSC Track: RefSeq Genes
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Entrez Gene ID
# Full dataset: yes
# miRBase build ID: NA
# Nb of transcripts: 6504
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2020-04-28 14:21:43 +0000 (Tue, 28 Apr 2020)
# GenomicFeatures version at creation time: 1.39.7
# RSQLite version at creation time: 2.2.0
# DBSCHEMAVERSION: 1.2
Your second question, about using the TxDb, is beyond the scope of a support site post. You should read the GenomicFeatures vignette, which is intended to teach you how to use TxDb objects.
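As a small starting point before working through the vignette, here is a hedged sketch of a few common accessors. It assumes the sample database hg19_knownGene_sample.sqlite that ships in the extdata directory of GenomicFeatures is present in your installation; the object names (sample_file, txdb, tx, gn, ex_by_gene) are just illustrative, and you would substitute your own TxDb.

```r
library(AnnotationDbi)
library(GenomicFeatures)

# Assumption: this sample TxDb file is bundled with your GenomicFeatures install.
sample_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                           package = "GenomicFeatures")
txdb <- loadDb(sample_file)

# Transcript ranges as a GRanges object.
tx <- transcripts(txdb)

# Gene ranges as a GRanges object (one range per gene).
gn <- genes(txdb)

# Exons grouped by gene, as a GRangesList.
ex_by_gene <- exonsBy(txdb, by = "gene")

# Inspect the first few transcripts.
head(tx)
```

The vignette covers these functions, and the many others that operate on a TxDb, in much more depth.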
After the code you show, can you see anything with txdb? It should show something similar to what you get if you use the gff_file as an example. If so, then it was generated correctly. You could save it for future use by simply using the save() function in R to write it to a location of your choice and then using load() in a future R session.

Hi,
Make sure to always use saveDb/loadDb on a TxDb object, not save/load. The latter will break the object. The former will save the object to a self-contained SQLite file that is relocatable.

H.
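To illustrate the point about the file being a self-contained, relocatable SQLite file, here is a minimal sketch. It assumes the sample database hg19_knownGene_sample.sqlite bundled with GenomicFeatures is available and that DBI and RSQLite are installed; the file names out_file and moved_file are made up for the example.

```r
library(AnnotationDbi)
library(GenomicFeatures)

# Assumption: this sample TxDb file is bundled with your GenomicFeatures install.
sample_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                           package = "GenomicFeatures")
txdb <- loadDb(sample_file)

# Save the TxDb to a file of our choosing (a temporary file here).
out_file <- tempfile(fileext = ".sqlite")
saveDb(txdb, file = out_file)

# The saved file is an ordinary SQLite database, so it can be opened directly.
con <- DBI::dbConnect(RSQLite::SQLite(), out_file)
tbls <- DBI::dbListTables(con)
DBI::dbDisconnect(con)

# Relocatable: copy the file somewhere else and load the TxDb from the copy.
moved_file <- file.path(tempdir(), "moved_txdb.sqlite")
file.copy(out_file, moved_file, overwrite = TRUE)
txdb2 <- loadDb(moved_file)

# The reloaded object should describe the same transcripts as the original.
same_n <- length(transcripts(txdb2)) == length(transcripts(txdb))
```

In a real workflow you would replace the temporary paths with a permanent location and call loadDb on that path at the start of each new R session.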