CRISPRseek TxDb issue
XIN1988 • 0
@xin1988-22335
Last seen 4.0 years ago

Hi everyone, I tried to use CRISPRseek to design gRNAs for a pig gene and got the error below. As far as I can tell, no TxDb exists for the pig genome. How can I work around this?

Error in offTargetAnalysis(inputFilePath2, findgRNAsWithREcutOnly = FALSE,  : 
  To indicate whether an offtarget is inside an exon, txdb is
            required as TxDb object!
In addition: Warning message:
In dir.create(outputDir) :
  'D:\RNA seq analysis R\CRISPRseek\Pig Chrom1 gRNA' already exists
> 
results_Pig2 <- offTargetAnalysis(inputFilePath2, findgRNAsWithREcutOnly = FALSE,
                                  REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE,
                                  BSgenomeName = BSgenomeName, chromToSearch = "all",
                                  orgAnn = orgAnn,
                                  max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)
@james-w-macdonald-5106
Last seen 4 hours ago
United States
> library(AnnotationHub)
> hub <- AnnotationHub()
> z <- query(hub, c("sus scrofa","txdb"))
> z
AnnotationHub with 9 records
# snapshotDate(): 2019-10-29 
# $dataprovider: UCSC
# $species: Sus scrofa
# $rdataclass: TxDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH52273"]]' 

            title                                    
  AH52273 | TxDb.Sscrofa.UCSC.susScr3.refGene.sqlite 
  AH61788 | TxDb.Sscrofa.UCSC.susScr11.refGene.sqlite
  AH61800 | TxDb.Sscrofa.UCSC.susScr3.refGene.sqlite 
  AH66180 | TxDb.Sscrofa.UCSC.susScr11.refGene.sqlite
  AH66181 | TxDb.Sscrofa.UCSC.susScr3.refGene.sqlite 
  AH70599 | TxDb.Sscrofa.UCSC.susScr11.refGene.sqlite
  AH70600 | TxDb.Sscrofa.UCSC.susScr3.refGene.sqlite 
  AH75767 | TxDb.Sscrofa.UCSC.susScr11.refGene.sqlite
  AH75768 | TxDb.Sscrofa.UCSC.susScr3.refGene.sqlite 
> mcols(z)$rdatadateadded
[1] "2016-12-22" "2018-04-19" "2018-04-19" "2018-10-22" "2018-10-22"
[6] "2019-05-01" "2019-05-01" "2019-10-29" "2019-10-29"
> txdb <- hub[["AH75767"]]
downloading 1 resources
retrieving 1 resource
  |======================================================================| 100%

> txdb
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: susScr11
# Organism: Sus scrofa
# Taxonomy ID: 9823
# UCSC Table: refGene
# UCSC Track: RefSeq Genes
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Entrez Gene ID
# Full dataset: yes
# miRBase build ID: NA
# transcript_nrow: 4599
# exon_nrow: 36486
# cds_nrow: 34595
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2019-10-21 20:54:08 +0000 (Mon, 21 Oct 2019)
# GenomicFeatures version at creation time: 1.37.4
# RSQLite version at creation time: 2.1.2
# DBSCHEMAVERSION: 1.2


Also worth mentioning the makeTxDbFrom*() functions from the GenomicFeatures package.

For example makeTxDbFromUCSC() lets you choose your table/track:

library(GenomicFeatures)
supportedUCSCtables("susScr11")
#     tablename         track subtrack
# 1     refGene  RefSeq Genes     <NA>
# 2 xenoRefGene  Other RefSeq     <NA>
# 3     ensGene Ensembl Genes     <NA>
# 4     genscan Genscan Genes     <NA>

txdb <- makeTxDbFromUCSC("susScr11", "ensGene")
txdb
# TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: susScr11
# Organism: Sus scrofa
# Taxonomy ID: 9823
# UCSC Table: ensGene
# UCSC Track: Ensembl Genes
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Ensembl gene ID
# Full dataset: yes
# miRBase build ID: NA
# transcript_nrow: 49448
# exon_nrow: 262635
# cds_nrow: 219938
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2019-11-13 15:03:05 -0800 (Wed, 13 Nov 2019)
# GenomicFeatures version at creation time: 1.38.0
# RSQLite version at creation time: 2.1.2
# DBSCHEMAVERSION: 1.2

See ?makeTxDbFromUCSC for more information.
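
Since makeTxDbFromUCSC() fetches the tables from UCSC every time it is called, it may be worth caching the TxDb on disk between sessions. The following is only a sketch: the file name is a made-up example, and saveDb()/loadDb() come from the AnnotationDbi package. In recent Bioconductor releases the makeTxDbFrom*() functions have moved to the txdbmaker package, so you may need library(txdbmaker) instead of, or in addition to, GenomicFeatures.

```r
library(GenomicFeatures)
library(AnnotationDbi)   # provides saveDb() and loadDb()

# Hypothetical cache file name; any writable path will do.
txdb_file <- "TxDb.susScr11.ensGene.sqlite"

if (file.exists(txdb_file)) {
    # Reuse the copy saved by an earlier session.
    txdb <- loadDb(txdb_file)
} else {
    # Needs an internet connection to reach the UCSC Genome Browser.
    txdb <- makeTxDbFromUCSC("susScr11", "ensGene")
    saveDb(txdb, file = txdb_file)
}
```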

Or use makeTxDbFromEnsembl() if you prefer to fetch the gene models directly from Ensembl:

txdb <- makeTxDbFromEnsembl("Sus scrofa")
txdb
# TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: Ensembl
# Organism: Sus scrofa
# Ensembl release: 98
# Ensembl database: sus_scrofa_core_98_111
# MySQL server: ensembldb.ensembl.org
# transcript_nrow: 63041
# exon_nrow: 319126
# cds_nrow: 550324
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2019-11-13 15:09:34 -0800 (Wed, 13 Nov 2019)
# GenomicFeatures version at creation time: 1.38.0
# RSQLite version at creation time: 2.1.2
# DBSCHEMAVERSION: 1.2

See ?makeTxDbFromEnsembl for more information.

Finally note that AnnotationHub hosts a few EnsDb objects which are functionally equivalent to TxDb objects with some additional capabilities:

library(AnnotationHub)
hub <- AnnotationHub()
z <- query(hub, c("sus scrofa", "ensdb"))
z
# AnnotationHub with 12 records
## snapshotDate(): 2019-10-29 
## $dataprovider: Ensembl
## $species: Sus scrofa
## $rdataclass: EnsDb
## additional mcols(): taxonomyid, genome, description,
##   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
##   rdatapath, sourceurl, sourcetype 
## retrieve records with, e.g., 'object[["AH53243"]]' 
#
#            title                          
#  AH53243 | Ensembl 87 EnsDb for Sus Scrofa
#  AH53747 | Ensembl 88 EnsDb for Sus Scrofa
#  AH56713 | Ensembl 89 EnsDb for Sus Scrofa
#  AH57796 | Ensembl 90 EnsDb for Sus Scrofa
#  AH60819 | Ensembl 91 EnsDb for Sus Scrofa
#  ...       ...                            
#  AH64991 | Ensembl 94 EnsDb for Sus scrofa
#  AH68020 | Ensembl 95 EnsDb for Sus scrofa
#  AH69274 | Ensembl 96 EnsDb for Sus scrofa
#  AH73969 | Ensembl 97 EnsDb for Sus scrofa
#  AH75101 | Ensembl 98 EnsDb for Sus scrofa

ensdb <- hub[["AH75101"]]
ensdb
# EnsDb for Ensembl:
# |Backend: SQLite
# |Db type: EnsDb
# |Type of Gene ID: Ensembl Gene ID
# |Supporting package: ensembldb
# |Db created by: ensembldb package from Bioconductor
# |script_version: 0.3.4
# |Creation time: Mon Sep 30 23:25:30 2019
# |ensembl_version: 98
# |ensembl_host: localhost
# |Organism: Sus scrofa
# |taxonomy_id: 9823
# |genome_build: Sscrofa11.1
# |DBSCHEMAVERSION: 2.1
# | No. of genes: 31907.
# | No. of transcripts: 63041.
# |Protein data available.

See documentation in the ensembldb package for more information.

Best,

H.


Thank you very much, James!

Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 4 months ago
United States

Hi Xin,

Please follow the script from Hervé or James to create the TxDb for the pig genome, then set the txdb and orgAnn parameters in the offTargetAnalysis call, e.g., txdb = txdb.susScr11.UCSC, orgAnn = org.Ss.egSYMBOL, after you run the following script. As for the warning message, it should go away if you set overwrite = TRUE in the offTargetAnalysis call.

library(GenomicFeatures)

txdb.susScr11.UCSC <- makeTxDbFromUCSC("susScr11", "ensGene")

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("org.Ss.eg.db")

library("org.Ss.eg.db")
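
With those objects in place, the full call might look like the sketch below. It reuses the arguments from Xin's original call and adds txdb. The input FASTA path and the output directory are placeholders, NEBenzymes.fa is the restriction enzyme pattern file shipped with CRISPRseek, and BSgenome.Sscrofa.UCSC.susScr11 is assumed to be the genome package in use (it has to be the same assembly as the TxDb).

```r
library(CRISPRseek)
library(GenomicFeatures)
library(org.Ss.eg.db)
library(BSgenome.Sscrofa.UCSC.susScr11)  # assumed: must match the TxDb assembly

# Same TxDb as in the script above (needs an internet connection).
txdb.susScr11.UCSC <- makeTxDbFromUCSC("susScr11", "ensGene")

# Placeholder paths: replace with your own input FASTA and output folder.
inputFilePath <- "pig_target.fa"
outputDir <- "pig_gRNA_results"

# Restriction enzyme cut patterns shipped with CRISPRseek.
REpatternFile <- system.file("extdata", "NEBenzymes.fa", package = "CRISPRseek")

results_pig <- offTargetAnalysis(inputFilePath,
                                 findgRNAsWithREcutOnly = FALSE,
                                 REpatternFile = REpatternFile,
                                 findPairedgRNAOnly = FALSE,
                                 BSgenomeName = BSgenome.Sscrofa.UCSC.susScr11,
                                 chromToSearch = "all",
                                 txdb = txdb.susScr11.UCSC,
                                 orgAnn = org.Ss.egSYMBOL,
                                 max.mismatch = 1,
                                 outputDir = outputDir,
                                 overwrite = TRUE)
```

One thing to watch: org.Ss.egSYMBOL is keyed by Entrez gene ID, whereas the ensGene table uses Ensembl gene IDs. If the gene symbols come back empty, a TxDb built from the refGene table (Entrez gene IDs, as in James's AnnotationHub record) may be the better match, though that is a guess rather than something confirmed in this thread.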

Please try to follow the vignettes at https://bioconductor.org/packages/release/bioc/vignettes/CRISPRseek/inst/doc/CRISPRseek.pdf.

Best regards, Julie


Hi Julie, Thank you very much for your quick response. The problem has been resolved by following the script from James and Hervé. I will try several sequences following the vignettes for the design. Thanks again, Xin


Hi Xin,

Great to hear that! Thanks for letting me know!

Best wishes, Julie
