gRNAefficacy calculations in CRISPRseek
3
1
Entering edit mode
@dawid-g-nowak-6790
Last seen 20 months ago
United States

Hello,

I have a question about “gRNAefficacy” in “OfftargetAnalysis.xls” file. I used to use this code below and was getting “gRNAefficacy” column but recently I noticed this column is not there (or contains NA). Guide that I'm testing should not have a perfect match in the genome (it is a negative control). Am I assuming correctly that if your score < 1, “gRNAefficacy” will be “NA”? Previously, even if guide didn't have a perfect match in the genome “gRNAefficacy” column was still calculated. Do I need to include another argument in offTargetAnalysis()?

Thanks, Dawid

offTargetAnalysis(inputFilePath=inputFilePath,
                  gRNAoutputName=outputDir, #enter a name for the gRNA ouput file name when DNAStringSet instead of file path provided
                  outputDir = outputDir,
                  REpatternFile = REpatternFile,
                  scoring.method = "CFDscore",
                  min.score = 0.000001,  
                  format = "fasta",
                  findgRNAs = FALSE, # important for testing guides to set FALSE
                  findgRNAsWithREcutOnly = FALSE, # if FALSE not restr. enzymes
                  findPairedgRNAOnly = FALSE, # paired only turn to FALSE if only WT Cas9 not Nickase
                  gRNA.name.prefix = "g.",                 
                  orgAnn = orgAnn,
                  BSgenomeName = BSgenomeName,
                  txdb = txdb,
                  annotateExon = TRUE,
                  chromToSearch= "all", # change here for all to look at all chromosomes
                  min.gap = 0, max.gap = 20,
                  max.mismatch = 4,
                  topN = 1000,
                  topN.OfftargetTotalScore= 10, # 10 top Offtarget will be calculated 
                  fetchSequence = TRUE, upstream = 250, downstream = 250,
                  overlap.gRNA.positions = c(17, 18), 
                  PAM.size = 3,
                  PAM = "NGG",
                  gRNA.size = 20,
                  overwrite = TRUE)

> sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Catalina 10.15.6

Matrix products: default BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:  [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:  [1] RankAggreg_0.6.6                         plyr_1.8.6                                forcats_0.5.0                [4] stringr_1.4.0                             dplyr_1.0.2              purrr_0.3.4                                [7] readr_1.3.1             tidyr_1.1.1                               tibble_3.0.3                 [10] tidyverse_1.3.0                           org.Hs.eg.db_3.11.4     TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2   [13] BSgenome.Hsapiens.UCSC.hg19_1.4.3         org.Mm.eg.db_3.11.4          TxDb.Mmusculus.UCSC.mm10.knownGene_3.10.0 [16] GenomicFeatures_1.40.1  AnnotationDbi_1.50.3                      Biobase_2.48.0               [19] BSgenome.Mmusculus.UCSC.mm10_1.4.0        Hmisc_4.4-1             ggplot2_3.3.2                             [22] Formula_1.2-3           survival_3.2-3                            lattice_0.20-41              [25] readxl_1.3.1                              seqinr_3.6-1            CRISPRseek_1.28.0                         [28] BSgenome_1.56.0         rtracklayer_1.48.0                        Biostrings_2.56.0            [31] XVector_0.28.0                            GenomicRanges_1.40.0    GenomeInfoDb_1.24.2                       [34] IRanges_2.22.2          S4Vectors_0.26.1                          BiocGenerics_0.34.0
crisprseek crispr gRNAefficacy grna • 2.4k views
ADD COMMENT
1
Entering edit mode
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States

Hi Dawid,

Your assumption is correct that if your score < 1, “gRNAefficacy” will be “NA". Previously, even if guide didn't have a perfect match in the genome “gRNAefficacy” column was still calculated.

According to the feedback I received, if the score < 1, "gRNAefficacy" is not applicable. To speed up the analysis, I updated the code to set “gRNAefficacy” to “NA" if your score < 1.

Do you think it is important to add another argument in offTargetAnalysis to give users the option to have it calculated or not?

Thanks!

Best regards,

Julie

ADD COMMENT
1
Entering edit mode

Hi Julie,

Thanks for your reply. I understand speed concerns so having an option to control calculating "gRNAefficacy" would be great if you need. I hope it wouldn't take too much to implement.

Below there are points where calculating "gRNAefficacy" could be useful: 1) In my particular case I'm interested in "non-targeting" guides so if two guides would have low, similar off-targets scores then I would choose the one with less “gRNAefficacy” to limit its action. 2) On the top of checking off-target "score", I used to compare “gRNAefficacy” between guides in "OfftargetAnalysis" file even without perfect match and that helped me to avoid guides with higher “gRNAefficacy” in off-targets.

Please let me know what you think. Is there any way to calculate this in the meantime base on the "OfftargetAnalysis" file?

Thanks, Dawid

ADD REPLY
1
Entering edit mode
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States

Hi Dawid,

Thanks for sharing your use cases with gRNAefficancy for offtarget sites!

I will add a parameter calculategRNAefficacyForOfftargets, default to TRUE.

Meanwhile, you can use the function calculategRNAEfficiency to get gRNAefficacy for the offtargets. Please set upstream to 4 and downstream to 3 in offTargetAnalysis call and save the output to a variable.

Here is an example.

library(CRISPRseek) library("BSgenome.Hsapiens.UCSC.hg19") library(TxDb.Hsapiens.UCSC.hg19.knownGene) library(org.Hs.eg.db) outputDir <- getwd() inputFilePath <- system.file("extdata", "inputseq.fa", package = "CRISPRseek")

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, 
    findPairedgRNAOnly = FALSE, 
        annotatePaired = FALSE,
        BSgenomeName = Hsapiens, chromToSearch = "chrX",
        txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, 
    orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, upstream = 4, downstream = 3,
        outputDir = outputDir, overwrite = TRUE)


featureWeightMatrixFile <- system.file("extdata", "DoenchNBT2014.csv", 
    package = "CRISPRseek")
featureWeightMatrix <- read.csv(featureWeightMatrixFile, header=TRUE)
gRNAefficacy <- calculategRNAEfficiency(results$offtarget$flankSequence, baseBeforegRNA = 4, 
    featureWeightMatrix, gRNA.size = 20)
ADD COMMENT
1
Entering edit mode
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States

Hi Dawid,

I have added a parameter called calculategRNAefficacyForOfftargets in offTargetAnalysis. By default, it will output gRNAefficacy for ontargets as well as offtargets.

The updated function is available in CRISPRseek_1.29.1.

Please let me know if it works as expected.

Thanks for your feedback and sharing your use cases! To benefit others, I have added this link to the help menu of offTargetAnalysis function.

Best regards,

Julie

ADD COMMENT
0
Entering edit mode

Hi Julie,

Thank you for this addition it works nicely! I have another question related to this subject.

If I have a guide and I want to test on the sequence of my choice, I did this code:

compare2Sequences(guide, sequence,
                         outputDir = outputDir, 
                         searchDirection = "1to2", 
                         findgRNAs = c(FALSE, TRUE), 
                         overwrite = TRUE,
                         max.mismatch = 10,
                         scoring.method = "CFDscore")

1) How long has to be a flanking sequence? 2) My guide can have a perfect targeting site but can also have an off-target in the sequence but after running above code "gRNAefficacy" is always 0 with different guides. Can I change something in my code to see "gRNAefficacy"?

ADD REPLY
0
Entering edit mode

You are welcome, Dawid!

Please try to set findRNAs = c(TRUE, TRUE) and include some flanking sequences around the gRNAs. To have the efficacy calculated, minimum of 4 bp before the gRNA and 3bp after the PAM are needed.

compare2Sequences(guide, sequence,

                     outputDir = outputDir,

                     searchDirection = "1to2",

                     findgRNAs = c(TRUE, TRUE),

                     overwrite = TRUE,

                     max.mismatch = 10,

                     scoring.method = "CFDscore")

Please let me know if this fixes the issue.

BTW, I am implementing a new function to predict possible indels and their relative frequencies in the target site. The function is called predictRelativeFreqIndels. It will be available in the dev version soon. I thought that you might be interested in this new function.

Thanks!

Best,

Julie

ADD REPLY
0
Entering edit mode

Julie,

Thanks for your reply. If you have a sequence that has potentially 2x sites that can be targeted by the same guide (close to each other) but flanked by different sequences. Do you need to split these sites to test gRNA efficiency?

predictRelativeFreqIndels function sounds great and will be happy to test it!

Dawid

ADD REPLY
0
Entering edit mode

You are welcome, Dawid!

I do not think that you need to split the sequence. You can use the same sequence for the first two parameters. You are welcome to send me the sequence to test if needed.

Thanks for offering test the new function! I got sidetracked and have not finished the testing yet.

Will post here once the function is available for testing.

Best regards,

Julie

ADD REPLY
0
Entering edit mode

When I changed to findgRNAs = c(TRUE, TRUE) I am getting this error:

search for gRNAs for input file1...
Validating input ...
Searching for gRNAs ...
<simpleError in findgRNAs(inputFilePath, overlap.gRNA.positions = overlap.gRNA.positions,     baseEditing = baseEditing, targetBase = targetBase, editingWindow = editingWindow,     primeEditing = primeEditing, findPairedgRNAOnly = findPairedgRNAOnly,     annotatePaired = annotatePaired, paired.orientation = paired.orientation,     pairOutputFile = pairOutputFile, PAM = PAM, PAM.location = PAM.location,     gRNA.pattern = gRNA.pattern, PAM.size = PAM.size, gRNA.size = gRNA.size,     min.gap = min.gap, max.gap = max.gap, name.prefix = gRNA.name.prefix,     format = format, featureWeightMatrixFile = featureWeightMatrixFile,     baseBeforegRNA = baseBeforegRNA, baseAfterPAM = baseAfterPAM,     calculategRNAEfficacy = TRUE, efficacyFile = efficacyFile,     rule.set = rule.set): The input file contains no sequence! This could be caused by 
            wrong format of the file. If file is created in mac, you could 
            reformat to text by typing tr "\r" "\n" >newfile in the 
            command line>
[1] "Scoring ..."
Error in searchHits(gRNAs = gRNAs1, PAM = PAM, PAM.pattern = PAM.pattern,  : 
  object 'gRNAs1' not found
ADD REPLY
0
Entering edit mode

When I perform the code below, then I see "gRNAefficacy" always 0 in output file.

inputFile1Path=DNAStringSet("AATGATACGGCGACCACCGA")
names(inputFile1Path) <- "seq1"

inputFile2Path=DNAStringSet("atgAATGATACGGCGACCACCGAAGGgcgAATGATACGGCCTCCACCGATGGaaat")
names(inputFile2Path) <- "seq2"

test <- compare2Sequences(inputFile1Path, 
                  inputFile2Path,
                  outputDir = outputDir, 
                  searchDirection = "1to2", 
                  findgRNAs = c(FALSE, TRUE), 
                  overwrite = TRUE,
                  max.mismatch = 10,
                  scoring.method = "CFDscore")
ADD REPLY
0
Entering edit mode

Hi Dawid.

Both input sequences need to contain the PAM sequence when set findgRNAs = c(TRUE, TRUE).

Please try to add 4 bp before the gRNA and add the PAM sequence plus 3bp after the PAM sequence in inputFile1Path.

Please change all N to the correct bases in inputFile1Path.

Thanks!

Best,

Julie

inputFile1Path=DNAStringSet("NNNNAATGATACGGCGACCACCGANGGNNN")

names(inputFile1Path) <- "seq1"

inputFile2Path=DNAStringSet("atgAATGATACGGCGACCACCGAAGGgcgAATGATACGGCCTCCACCGATGGaaat")

names(inputFile2Path) <- "seq2"

test <- compare2Sequences(inputFile1Path,

              inputFile2Path,

              outputDir = outputDir,

              searchDirection = "1to2",

              findgRNAs = c(TRUE, TRUE),

              overwrite = TRUE,

              max.mismatch = 10,

              scoring.method = "CFDscore")
ADD REPLY
0
Entering edit mode

Hi Dawid,

If you set scoring.method = "CFDscore", you better set PAM.pattern = "NNG$|NGN$" to include the other types of PAM sequences for offtarget search.

Best, Julie

ADD REPLY
0
Entering edit mode

Hi Dawid, Finally, I incorporated the indel prediction algorithm Lindel into CRISPRseek. You can download the newest version from the dropbox link https://www.dropbox.com/s/f02ikml86i4ehyl/CRISPRseek_1.29.6.tar.gz?dl=0.

Save the file in a directory e.g., ~/Download.

Open the terminal and type the following two commands.

cd ~/Downloads R CMD install CRISPRseek_1.29.6.tar.gz

Alternatively, you can download the newest version at https://github.com/LihuaJulieZhu/CRISPRseek/ and run R CMD build to generate the tar.gz file.

CRISPRseek_1.29.6 is also available as a development version of CRISPRseek at http://bioconductor.org/packages/devel/bioc/html/CRISPRseek.html

There is one example in the offTargetAnalysis help page on how to get the predicted indels and frequencies.

Open R and type Library(CRISPRseek) help(offTargetAnalysis)

You can also type the following command in the same R session to view the example included in the vignette.

vignette("CRISPRseek") 2.13 Scenario 13. Predict indels and their frequencies for Cas9 genome editing systems)

Thanks for offering to test the new functions! Any suggestions and feedbacks are appreciated.

Best regards,

Julie

ADD REPLY
1
Entering edit mode

Thanks! Will test this soon!

ADD REPLY

Login before adding your answer.

Traffic: 720 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6