Hi,
I have a question about the “CFDscore” calculation used in CRISPRseek. I am testing a guide against a target DNA sequence using compare2Sequences(). I see some differences between the score calculated by compare2Sequences() and the value I look up manually in the table. To be sure how “CFDscore” is implemented, and before I go into details with my sequence, I would appreciate answers to the questions below.
Questions:
1) Is “CFDscore” in CRISPRseek calculated from the table “NatureBiot2016SuppTable19DoenchRoot” (Doench et al., 2016) and its Percent-Active column?
2) I have a single guide:DNA mismatch at position 1, rA:dG, for which the Percent-Active column gives a score of 0.857. Is this number used directly as the score, or is there another parameter that uses it to recalculate the final CFDscore?
3) If a guide:DNA mismatch doesn’t exist in the table, e.g. rU:dA, how is the score calculated?
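To make question 2 concrete, here is a minimal sketch of how a CFD-style score is commonly described: the Percent-Active values of all mismatches are multiplied together, then multiplied by a PAM weight. This is an assumption about the general method, not a confirmed description of what CRISPRseek does internally. The names `percent_active` and `cfd_sketch` are hypothetical, the value 0.857 is the one quoted above, the second table entry is a made-up placeholder, and a PAM weight of 1 for NGG is assumed.

```r
# Toy lookup keyed by "<RNA base>:<DNA base>,<position>".
# 0.857 is the value quoted in the question; 0.5 is a made-up placeholder,
# NOT a value from the Doench et al. table.
percent_active <- c("rA:dG,1" = 0.857, "rC:dT,5" = 0.5)

# Hypothetical helper: product of mismatch weights times a PAM weight.
# pam_weight = 1 is an assumption for a canonical NGG PAM.
cfd_sketch <- function(mismatches, pam_weight = 1) {
  if (length(mismatches) == 0) return(pam_weight)  # perfect match
  w <- percent_active[mismatches]
  if (anyNA(w)) stop("mismatch not present in the lookup table")
  unname(prod(w) * pam_weight)
}

single <- cfd_sketch("rA:dG,1")                # one mismatch
double <- cfd_sketch(c("rA:dG,1", "rC:dT,5"))  # two mismatches
```

Under these assumptions a single rA:dG mismatch at position 1 would score 0.857 unchanged, and a mismatch missing from the table (question 3) has no defined weight, which the sketch reports as an error.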
Thanks,
Dawid
Hi Julie,
I was testing non-targeting guides for human. I'm interested in comparing different scoring methods and how they score guides that shouldn't target anything in, e.g., the human genome; example here:
When I used max.mismatch = 3, I got the error below (it works with 4 mismatches). Wouldn't it be better to include these guides in the Summary.xls file (e.g. with NA or 0), instead of omitting them and giving an error as happens now?
Julie,
Thanks for your reply. I think that could also be applied when you have a targeting guide that doesn't have any off-targets (at least based on the scoring system).
Dawid
Hi Julie,
I used max.mismatch = 3, and the guide (>neg01) is still skipped in Summary.xls, even with CRISPRseek_1.23.1.
Dawid
Dawid,
The summary file generated from running the new version 1.23.2 (git clone git.bioconductor.org:packages/CRISPRseek) should include both types of guides mappable or not mappable to a given genome. Could you please try the following code snippets with the test input sequences? Thanks!
library(CRISPRseek)
library("BSgenome.Hsapiens.UCSC.hg19")
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
outputDir <- getwd()
inputFilePath <- "~/DropboxUmass/Bioconductor/Trunk/test.fa"
results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE,
    findPairedgRNAOnly = FALSE,
    annotatePaired = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = "chrX",
    txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
    orgAnn = org.Hs.egSYMBOL, max.mismatch = 0,
    outputDir = outputDir, overwrite = TRUE)
#### test.fa
>Hsap_GATA1_ex2
ccagtttgtggatcctgctctggtgtcctccacaccagaatcaggg
>test
GTAGCGAACGTGTCCGGCGTAGG
Thanks Julie. I tested quickly. Interestingly, with one mappable guide and one unmappable guide it seems to work (the summary file has both), but with two mappable guides and one unmappable guide the unmappable one is again skipped in the summary file. I ran the guides below:
>neg01_unmappable
GTAGCGAACGTGTCCGGCGT
>mappable_1
CAGAGTCTCCTATGCCACAC
>mappable_2
GAAGATGGGCGGGAGTCTTC
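A small sketch for reproducing this case: it writes the three guides above to a FASTA file using base R only. The file name `guides_fa` is a hypothetical temporary path, not the one used in the original run.

```r
# The three guides listed above, as FASTA header/sequence pairs.
guides <- c(
  neg01_unmappable = "GTAGCGAACGTGTCCGGCGT",
  mappable_1       = "CAGAGTCTCCTATGCCACAC",
  mappable_2       = "GAAGATGGGCGGGAGTCTTC"
)

# Hypothetical temporary output file; replace with any writable path.
guides_fa <- tempfile(fileext = ".fa")

# Interleave ">name" header lines with their sequences and write them out.
fasta_lines <- as.vector(rbind(paste0(">", names(guides)), unname(guides)))
writeLines(fasta_lines, guides_fa)
```

The resulting path could then be passed as inputFilePath to offTargetAnalysis() with findgRNAs = FALSE and max.mismatch = 3 to check whether all three guides appear in the summary file.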
Hi Julie,
I tested again. It gives me the error below when I test two mappable guides and one unmappable guide (everything works when there is one mappable guide and one unmappable guide).
# code
offTargetAnalysis(inputFilePath, REpatternFile = REpatternFile,
    scoring.method = "CFDscore", min.score = 0, format = "fasta",
    findgRNAs = FALSE, findgRNAsWithREcutOnly = FALSE,
    findPairedgRNAOnly = FALSE, gRNA.name.prefix = "g.",
    orgAnn = orgAnn, BSgenomeName = BSgenomeName, txdb = txdb,
    annotateExon = TRUE, chromToSearch = "all",
    min.gap = 0, max.gap = 20, max.mismatch = 3,
    topN = 100, topN.OfftargetTotalScore = 10,
    fetchSequence = TRUE, upstream = 250, downstream = 250,
    overlap.gRNA.positions = c(17, 18), PAM.size = 3, PAM = "NGG",
    gRNA.size = 20, outputDir = outputDir, overwrite = TRUE)

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.1

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] CRISPRseek_1.23.2 BiocInstaller_1.32.1 usethis_1.4.0 devtools_2.0.1 Hmisc_4.1-1
[6] ggplot2_3.1.0 Formula_1.2-3 survival_2.43-3 lattice_0.20-38 org.Hs.eg.db_3.7.0
[11] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 BSgenome.Hsapiens.UCSC.hg19_1.4.0 org.Mm.eg.db_3.7.0 TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.4 GenomicFeatures_1.34.1
[16] AnnotationDbi_1.44.0 Biobase_2.42.0 BSgenome.Mmusculus.UCSC.mm10_1.4.0 seqinr_3.4-5 BSgenome_1.50.0
[21] rtracklayer_1.42.1 Biostrings_2.50.1 XVector_0.22.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.1
[26] IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0