Question: CRISPRseek Unusual Error
0
gravatar for Dawid G. Nowak
3 months ago by
United States
Dawid G. Nowak10 wrote:

Hi,

I am getting this error below and Summary and Off-targets file are not created. I would appreciate any suggestion.

Thanks, Dawid

Building feature vectors for scoring ...
Calculating scores ...
Error in `$<-.data.frame`(`*tmp*`, "score", value = c(0.0112619406241216,  : 
  replacement has 443126 rows, data has 443130

This the code below:

offTargetAnalysis(inputFilePath,
                      REpatternFile = REpatternFile,
                      scoring.method = "CFDscore",
                      min.score = 0.000001,
                      format = "fasta",
                      findgRNAsWithREcutOnly = FALSE, # if FALSE not restr. enzymes
                      findPairedgRNAOnly = FALSE, # paired only turn to FALSE if only WT Cas9 not Nickase
                      gRNA.name.prefix = "g.",                 
                      orgAnn = orgAnn,
                      BSgenomeName = BSgenomeName,
                      txdb = txdb,
                      chromToSearch= "all", # change here for all to look at all chromosomes
                      min.gap = 0, max.gap = 20,
                      max.mismatch = 4, # changed on 11-26-18 to 4 from 3
                      topN = 1000,
                      topN.OfftargetTotalScore= 10, # 10 top Offtarget will be calculated 
                      annotateExon = TRUE,
                      fetchSequence = TRUE, upstream = 250, downstream = 250,
                      overlap.gRNA.positions = c(17, 18), 
                      PAM.size = 3,
                      PAM = "NGG",
                      foldgRNAs = TRUE,
                      gRNA.size = 20, 
                      outputDir = outputDir,
                      overwrite = TRUE)

>     attached base packages: [1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
>     
>     other attached packages:  [1] Hmisc_4.2-0                              ggplot2_3.1.0                            Formula_1.2-3                
> survival_2.43-3                          lattice_0.20-38              
> [6] org.Hs.eg.db_3.7.0                      
> TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 
> BSgenome.Hsapiens.UCSC.hg19_1.4.0        org.Mm.eg.db_3.7.0           
> TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.4 [11] GenomicFeatures_1.34.3  
> AnnotationDbi_1.44.0                     Biobase_2.42.0               
> BSgenome.Mmusculus.UCSC.mm10_1.4.0       BSgenome_1.50.0              
> [16] rtracklayer_1.42.1                       GenomicRanges_1.34.0    
> GenomeInfoDb_1.18.1                      seqRFLP_1.0.1                
> CRISPRseek_1.22.1                        [21] seqinr_3.4-5            
> Biostrings_2.50.2                        XVector_0.22.0               
> IRanges_2.16.0                           S4Vectors_0.20.1             
> [26] BiocGenerics_0.28.0
crisprseek • 156 views
ADD COMMENTlink modified 3 months ago by Julie Zhu4.0k • written 3 months ago by Dawid G. Nowak10
Answer: CRISPRseek Unusual Error
2
gravatar for Julie Zhu
3 months ago by
Julie Zhu4.0k
United States
Julie Zhu4.0k wrote:

Dawid,

I have modified the CFD scoring matrix in v1.23.4 to assign mismatch N score as (1 + sum(score of the 3 types of mismatch))/4 for each guide position. The updated changes can be downloaded using git. git clone git@git.bioconductor.org:packages/CRISPRseek

Please let me know if the following code snippet works for you. Thanks!

Best regards,

Julie

library(CRISPRseek)
library(BSgenome.Mmusculus.UCSC.mm10)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)

orgAnn <- get("org.Mm.egSYMBOL")
BSgenomeName <- get("Mmusculus")
txdb <- get("TxDb.Mmusculus.UCSC.mm10.knownGene")

outputDir <- getwd()
inputFilePath <-  "~/testseq.fa"

results <- offTargetAnalysis(inputFilePath,
                  #scoring.method = "CFDscore",
                      min.score = 0.000001,
                      format = "fasta",
                      findgRNAsWithREcutOnly = FALSE,
                      findPairedgRNAOnly = FALSE, 
                      gRNA.name.prefix = "g",                 
                      orgAnn = orgAnn,
                      BSgenomeName = BSgenomeName,
                      txdb = txdb,
                      chromToSearch=  "all", 
                      chromToExclude = setdiff(seqlevels(BSgenomeName), paste("chr", c(1:20, "X", "Y", "M"), sep ="")),
                     max.mismatch = 4, 
                      topN = 1000,
                      topN.OfftargetTotalScore= 10,
                      annotateExon = TRUE,
                      fetchSequence = TRUE, upstream = 250, downstream = 250,
                      outputDir = outputDir,
                      overwrite = TRUE)
ADD COMMENTlink written 3 months ago by Julie Zhu4.0k
Answer: CRISPRseek Unusual Error
1
gravatar for Julie Zhu
3 months ago by
Julie Zhu4.0k
United States
Julie Zhu4.0k wrote:

Dawid,

Could you please try to remove foldgRNA and see if the error is still there? Could you please also send me your input file? Thanks!

Best regards,

Julie

ADD COMMENTlink modified 3 months ago • written 3 months ago by Julie Zhu4.0k

Julie,

Thank for your reply. Removing foldgRNA didn't work. When I run only i.e. chromToSearch= "chr19" everything works fine.

Below sequence that tested:

> rightseq
gcggcaggccctccgagcgtggtggagccgttctgtgagacagccgggtacgagtcgtgacgctggaaggggcaagcgggtggtgggcaggaatgcggtccgccctgcagcaaccggagggggagggagaagggagcggaaaagtctccaccggacgcggccatggctcgggggggggggggcagcggaggagcgcttccggccgacgtctcgtcgctgattggcttcttttcctcccgccgtgtgtgaaaacacaaatggcgtgttttggttggcgtaaggcgcctgtcagttaacggcagccggagtgcgcagccgccggcagcctcgctctgcccactgggtggggcgggaggtaggtggggtgaggcgagctggacgtgcgggcgcggtcggcctctggcggggcgggggaggggagggagggtcagcgaaagtagctcgcgcgcgagcggccgcccaccctccccttcctctgggggagtcgttttacccgccgccggccgggcctcgtcgtctgattggctctcggggcccagaaaactggcccttgccattggctcgtgttcgtgcaagttgagtccatccgccggccagcgggggcggcgaggaggcgctcccaggttccggccctcccctcggccccgcgccgcagagtctggccgcgcgcccctgcgcaacgtggcaggaagcgcgcgctgggggcggggacgggcagtagggctgagcggctgcggggcgggtgcaagcacgtttccgacttgagttgcctcaagaggggcgtgctgagccagacctccatcgcgcactccggggagtggagggaaggagcgagggctcagttgggctgttttggaggcaggaagcacttgctctcccaaagtcgctctgagttgttatcagtaagggagctgcagtggagtaggcggggagaaggccgcacccttctccggaggggggaggggagtgttgcaatacctttctgggagttctctgctgcctcctggcttctgaggaccgccctgggcctgggagaatcccttccccctcttccctcgtgatctgcaactccagtctttctag
ADD REPLYlink written 3 months ago by Dawid G. Nowak10

Dawid, thanks for the sequence!

I just tested your code with your sequence chromosome by chromosome and it ran successfully for chr1-20, X and Y. I also ran the analysis with chromToSearch= "all" successfully. If needed, I can share the results with you via DropBox.

Could you please try the following code and see if it runs for you? Thanks!

              library(CRISPRseek)
             library("BSgenome.Hsapiens.UCSC.hg19")
             library(TxDb.Hsapiens.UCSC.hg19.knownGene)
             library(org.Hs.eg.db)
             outputDir <- getwd()
             inputFilePath <-  "~/testseq.fa"
             BSgenomeName = Hsapiens

              txdb = TxDb.Hsapiens.UCSC.hg19.knownGene

              orgAnn = org.Hs.egSYMBOL

results <- offTargetAnalysis(inputFilePath,
                  scoring.method = "CFDscore",
                      min.score = 0.000001,
                      format = "fasta",
                      findgRNAsWithREcutOnly = FALSE,
                      findPairedgRNAOnly = FALSE, 
                      gRNA.name.prefix = "g",                 
                      orgAnn = orgAnn,
                      BSgenomeName = BSgenomeName,
                      txdb = txdb,
                      chromToSearch=  "all", 
                      **chromToExclude = setdiff(seqlevels(BSgenomeName), paste("chr", c(1:20, "X", "Y", "M"), sep ="")),**
                     max.mismatch = 4, 
                      topN = 1000,
                      topN.OfftargetTotalScore= 10,
                      annotateExon = TRUE,
                      fetchSequence = TRUE, upstream = 250, downstream = 250,
                      outputDir = outputDir,
                      overwrite = TRUE)

FYI, I updated all the packages with the following commands, which might be helpful to you. if (!requireNamespace("BiocManager")) install.packages("BiocManager") BiocManager::install()

Here is my sessionInfo. sessionInfo() R version 3.5.0 (2018-04-23) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] enUS.UTF-8/enUS.UTF-8/enUS.UTF-8/C/enUS.UTF-8/en_US.UTF-8

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets [8] methods base

other attached packages: [1] org.Hs.eg.db3.7.0
[2] TxDb.Hsapiens.UCSC.hg19.knownGene
3.2.2 [3] GenomicFeatures1.34.3
[4] AnnotationDbi
1.44.0
[5] Biobase2.42.0
[6] BSgenome.Hsapiens.UCSC.hg19
1.4.0
[7] BSgenome1.50.0
[8] rtracklayer
1.42.1
[9] GenomicRanges1.34.0
[10] GenomeInfoDb
1.18.2
[11] CRISPRseek1.22.1
[12] Biostrings
2.50.2
[13] XVector0.22.0
[14] IRanges
2.16.0
[15] S4Vectors0.20.1
[16] BiocGenerics
0.28.0

loaded via a namespace (and not attached): [1] Rcpp1.0.0 compiler3.5.0
[3] prettyunits1.0.2 progress1.2.0
[5] bitops1.0-6 tools3.5.0
[7] zlibbioc1.28.0 biomaRt2.38.0
[9] digest0.6.18 bit1.1-14
[11] RSQLite2.1.1 memoise1.1.0
[13] lattice0.20-38 pkgconfig2.0.2
[15] rlang0.3.1 Matrix1.2-15
[17] DelayedArray0.8.0 DBI1.0.0
[19] GenomeInfoDbData1.2.0 httr1.4.0
[21] stringr1.4.0 hms0.4.2
[23] ade41.7-13 bit640.9-7
[25] grid3.5.0 data.table1.12.0
[27] R62.3.0 hash2.2.6
[29] XML3.98-1.17 BiocParallel1.16.6
[31] magrittr1.5 seqinr3.4-5
[33] blob1.1.1 Rsamtools1.34.1
[35] matrixStats0.54.0 GenomicAlignments1.18.1
[37] MASS7.3-51.1 assertthat0.2.0
[39] SummarizedExperiment1.12.0 stringi1.3.1
[41] RCurl1.95-4.11 crayon1.3.4

Best regards,

Julie

ADD REPLYlink modified 3 months ago • written 3 months ago by Julie Zhu4.0k

Julie,

I was not clear that my test was in mouse. I see you tested for human.

Thanks, Dawid

ADD REPLYlink written 3 months ago by Dawid G. Nowak10

Dawid, do you want me test it on in mouse? If yes, could you please post the library loading code? Thanks! Best, Julie

ADD REPLYlink written 3 months ago by Julie Zhu4.0k
# mouse db
library(BSgenome.Mmusculus.UCSC.mm10)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)

orgAnn <- get("org.Mm.egSYMBOL")
BSgenomeName <- get("Mmusculus")
txdb <- get("TxDb.Mmusculus.UCSC.mm10.knownGene")

# code
offTargetAnalysis(inputFilePath,
                      REpatternFile = REpatternFile,
                      scoring.method = "CFDscore",
                      min.score = 0.000001,
                      format = "fasta",
                      findgRNAsWithREcutOnly = FALSE, # if FALSE not restr. enzymes
                      findPairedgRNAOnly = FALSE, # paired only turn to FALSE if only WT Cas9 not Nickase
                      gRNA.name.prefix = "g.",                 
                      orgAnn = orgAnn,
                      BSgenomeName = BSgenomeName,
                      txdb = txdb,
                      chromToSearch= "all", # change here for all to look at all chromosomes
                      min.gap = 0, max.gap = 20,
                      max.mismatch = 4, # changed on 11-26-18 to 4 from 3
                      topN = 1000,
                      topN.OfftargetTotalScore= 10, # 10 top Offtarget will be calculated 
                      annotateExon = TRUE,
                      fetchSequence = TRUE, upstream = 250, downstream = 250,
                      overlap.gRNA.positions = c(17, 18), 
                      PAM.size = 3,
                      PAM = "NGG",
                      foldgRNAs = TRUE,
                      gRNA.size = 20, 
                      outputDir = outputDir,
                      overwrite = TRUE)

sessionInfo() R version 3.5.1 (2018-07-02) Platform: x8664-apple-darwin15.6.0 (64-bit) Running under: macOS 10.14.3 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib locale: [1] enUS.UTF-8/enUS.UTF-8/enUS.UTF-8/C/enUS.UTF-8/enUS.UTF-8 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] Hmisc4.2-0 ggplot23.1.0 Formula1.2-3 survival2.43-3 lattice0.20-38 [6] org.Hs.eg.db3.7.0 TxDb.Hsapiens.UCSC.hg19.knownGene3.2.2 BSgenome.Hsapiens.UCSC.hg191.4.0 org.Mm.eg.db3.7.0 TxDb.Mmusculus.UCSC.mm10.knownGene3.4.4 [11] GenomicFeatures1.34.3 AnnotationDbi1.44.0 Biobase2.42.0 BSgenome.Mmusculus.UCSC.mm101.4.0 BSgenome1.50.0 [16] rtracklayer1.42.1 GenomicRanges1.34.0 GenomeInfoDb1.18.1 seqRFLP1.0.1 CRISPRseek1.22.1 [21] seqinr3.4-5 Biostrings2.50.2 XVector0.22.0 IRanges2.16.0 S4Vectors0.20.1 [26] BiocGenerics0.28.0 loaded via a namespace (and not attached): [1] bitops1.0-6 matrixStats0.54.0 bit640.9-7 RColorBrewer1.1-2 progress1.2.0 httr1.4.0 backports1.1.3 [8] tools3.5.1 R62.3.0 rpart4.1-13 DBI1.0.0 lazyeval0.2.1 colorspace1.4-0 nnet7.3-12 [15] ade41.7-13 GetoptLong0.1.7 withr2.1.2 gridExtra2.3 tidyselect0.2.5 prettyunits1.0.2 bit1.1-14 [22] compiler3.5.1 htmlTable1.13.1 DelayedArray0.8.0 checkmate1.9.1 scales1.0.0 stringr1.4.0 digest0.6.18 [29] Rsamtools1.34.1 foreign0.8-71 htmltools0.3.6 base64enc0.1-3 pkgconfig2.0.2 htmlwidgets1.3 rlang0.3.1 [36] GlobalOptions0.1.0 rstudioapi0.9.0 RSQLite2.1.1 shape1.4.4 bindr0.1.1 BiocParallel1.16.2 acepack1.4.1 [43] dplyr0.7.8 RCurl1.95-4.11 magrittr1.5 GenomeInfoDbData1.2.0 Matrix1.2-15 Rcpp1.0.0 munsell0.5.0 [50] stringi1.2.4 yaml2.2.0 MASS7.3-51.1 SummarizedExperiment1.12.0 zlibbioc1.28.0 plyr1.8.4 grid3.5.1 [57] blob1.1.1 crayon1.3.4 splines3.5.1 hash2.2.6 circlize0.4.5 hms0.4.2 knitr1.21 [64] ComplexHeatmap1.20.0 pillar1.3.1 rjson0.2.20 biomaRt2.38.0 XML3.98-1.17 glue1.3.0 latticeExtra0.6-28 [71] data.table1.12.0 BiocManager1.30.4 gtable0.2.0 purrr0.3.0 assertthat0.2.0 xfun0.4 tibble2.0.1 [78] GenomicAlignments1.18.1 memoise1.1.0 cluster2.0.7-1 bindrcpp_0.2.2

ADD REPLYlink written 3 months ago by Dawid G. Nowak10

Dawid, do you want me test it on in mouse? If yes, could you please post the library loading code? Thanks! Best, Julie

ADD REPLYlink written 3 months ago by Julie Zhu4.0k
Answer: CRISPRseek Unusual Error
1
gravatar for Julie Zhu
3 months ago by
Julie Zhu4.0k
United States
Julie Zhu4.0k wrote:

Dawid,

It turns out that one of the offtargets located at chr1:171228616-171228638 in mm10 has sequence NAGNGCAGGGGAGGGGAGGGGAG that contains two Ns. This type of mismatch does not have penalty score defined in the CFD score system.

I will update the code to simply remove offtargets with at least one N. Do you think this is a proper workaround? Thanks!

Best regards,

Julie

ADD COMMENTlink modified 3 months ago • written 3 months ago by Julie Zhu4.0k

Julie,

Thank you for finding this exception! I think your solution sounds good.

As a side questions 1) Can "Hsu-Zhang" scoring handle this type of mismatches? 2) Can "chromToExclude =" exclude specific sequence on the chromosome i.e. chr1:171,228,616-171,228,638

Thanks, Dawid

ADD REPLYlink written 3 months ago by Dawid G. Nowak10
1

Dawid,

Thank you for your input! I am also thinking of another way which is to create N mismatch penalty scores by averaging different type of mismatches/matches for each nucleotide. I will see if this is not too cumbersome to implement.

Yes, "Hsu-Zhang" scoring method handles this type of mismatches fine

"chromToExclude =" can exclude specific chromosomes not specific sequences on the chromosome at this stage.

Best regards,

Julie

ADD REPLYlink written 3 months ago by Julie Zhu4.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 140 users visited in the last hour