I'm trying to design a CRISPR gRNA and search for off-targets with CRISPRseek.
My organism has a very rough draft genome organized into ~160,000 scaffolds. I was able to forge a BSgenome package by placing a single scaffold in the seqnames field of the seed file and all of the scaffolds in the mseqnames field. This works fine for creating the BSgenome package. When I run offTargetAnalysis(), though, CRISPRseek seems to search only the single sequence listed under seqnames and to ignore the scaffolds under mseqnames.
I'll post the code and the error produced by offTargetAnalysis(), along with my session info, below.
I'm wondering if there is a way to:
- get offTargetAnalysis() to not ignore these mseq sequences,
- make a BSgenome object with 160,000 seqnames in the seed file, or
- perhaps run offTargetAnalysis() on a DNAStringSet object and bypass the BSgenome step entirely?
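To make the third option concrete, here is a minimal sketch of what searching a DNAStringSet directly might look like, using Biostrings rather than CRISPRseek. The toy scaffolds and the 20-nt guide sequence are made up for illustration. The search ignores the PAM and does no off-target scoring, so it is not a substitute for offTargetAnalysis().

```r
library(Biostrings)

# Toy stand-in for the scaffolds (made-up sequences). In practice the set
# could be read from a FASTA file with readDNAStringSet().
scaffolds <- DNAStringSet(c(
  scaffold_1 = "TTGACCGATTACGGATCCAGTCAGAGGTTAACCGGTA",
  scaffold_2 = "CCGATTACGGATCCTGTCAGGTTGGAAAC"
))

# Hypothetical 20-nt guide sequence (assumption, not from my real target).
gRNA <- DNAString("CCGATTACGGATCCAGTCAG")

# Count approximate matches in every scaffold, allowing up to 2 mismatches.
# The forward strand is searched with the guide itself, the reverse strand
# with its reverse complement.
fwd_counts <- vcountPattern(gRNA, scaffolds, max.mismatch = 2)
rev_counts <- vcountPattern(reverseComplement(gRNA), scaffolds, max.mismatch = 2)

hit_summary <- data.frame(
  scaffold = names(scaffolds),
  forward  = fwd_counts,
  reverse  = rev_counts
)
print(hit_summary)

# Match positions per scaffold on the forward strand.
fwd_hits <- vmatchPattern(gRNA, scaffolds, max.mismatch = 2)
print(fwd_hits)
```

This only shows that all scaffolds in a DNAStringSet can be scanned in one call. Whether offTargetAnalysis() itself can take such an object is exactly what I am asking.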
Thanks for any help,
Alex Rajewski
> offTargetAnalysis("~/Downloads/Target.fa", findgRNAs = TRUE, findgRNAsWithREcutOnly = FALSE, enable.multicore = TRUE, BSgenomeName = BSgenome.Ntom.SolGenomics.1.0, outputDir = "~/Downloads/CRISPRseek", overwrite = TRUE, annotateExon = FALSE)
Validating input ...
Searching for gRNAs ...
>>> Finding all hits in sequence Ntom_ASAG01000001.1.fa ...
>>> DONE searching
Building feature vectors for scoring ...
Error in buildFeatureVectorForScoring(hits = hits, canonical.PAM = PAM, :
  Empty hits!
In addition: Warning message:
In searchHits(gRNAs = gRNAs, PAM = PAM.pattern, BSgenomeName = BSgenomeName, :
  No matching found, please check your input sequence, and make sure you are using the right genome. You can also alter your search criteria such as increasing max.mismatch!
> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] CRISPRseek_1.10.3                 BSgenome.Ntom.SolGenomics.1.0_1.0
 [3] BSgenome_1.38.0                   rtracklayer_1.30.4
 [5] Biostrings_2.38.4                 XVector_0.10.0
 [7] GenomicRanges_1.22.4              GenomeInfoDb_1.6.3
 [9] IRanges_2.4.8                     S4Vectors_0.8.11
[11] BiocGenerics_0.16.1

loaded via a namespace (and not attached):
 [1] zlibbioc_1.16.0            GenomicAlignments_1.6.3    BiocParallel_1.4.3
 [4] tools_3.2.4                SummarizedExperiment_1.0.2 data.table_1.9.6
 [7] Biobase_2.30.0             lambda.r_1.1.7             futile.logger_1.4.1
[10] seqinr_3.1-5               ade4_1.7-4                 futile.options_1.0.0
[13] bitops_1.0-6               RCurl_1.95-4.8             BiocInstaller_1.20.3
[16] Rsamtools_1.22.0           XML_3.98-1.4               chron_2.3-47