Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows
Miguel • 0

I have a list of sequences and want to identify gRNAs, along with their CFD scores and on-target efficacy, using offTargetAnalysis(). However, I am running into an error.

script:

library(CRISPRseek) # 1.34
library(BSgenome.Hsapiens.UCSC.hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)

offTargetAnalysis(inputFilePath = "test.fasta",
                  format = "fasta",
                  header = FALSE,
                  exportAllgRNAs = "fasta",
                  findgRNAs = TRUE,
                  findgRNAsWithREcutOnly = FALSE,
                  findPairedgRNAOnly = FALSE,
                  annotatePaired = FALSE,
                  annotateExon = TRUE,
                  scoring.method = "CFDscore",
                  min.score = 0,
                  topN = 100,
                  topN.OfftargetTotalScore = 10,
                  calculategRNAefficacyForOfftargets = T,
                  PAM = "NGG",
                  PAM.pattern = "NNG$|NGN$",
                  PAM.location = "3prime",
                  allowed.mismatch.PAM = 1,
                  PAM.size = 3,
                  gRNA.size = 20,
                  baseBeforegRNA = 4,
                  baseAfterPAM = 3,
                  rule.set = "Root_RuleSet2_2016",
                  chromToSearch = c("chr21"),
                  max.mismatch = 4,
                  BSgenomeName = BSgenome.Hsapiens.UCSC.hg19,
                  txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                  orgAnn = org.Hs.egSYMBOL,
                  enable.multicore = TRUE,
                  n.cores.max = 1,
                  outputDir = ".",
                  overwrite = T)

The input "test.fasta" contains two fake 100 bp sequences. The first does not contain any PAM, while the second contains just one PAM, so CRISPRseek identifies only one sgRNA there:

>KLHL17_chr1_896281_896380
ACTGTTGATGTCTTGACTCATGTGCTGAGCTGTGTCTGAACTGAGTATGTTACACAAACGCGACACGCGCGAACATGACGCGACTAACGCTGCTGTAACG
>KLHL17_chr1_896331_896430
TACACAAACGCGACACGCGCGAACAAGACGCGACTAACGCTGCAGTAACGAGAAAGCAGCTAAAGACGGAGAAGAGCTGAGCTCGTAGAAGCGACAAGAA

log:

> library(CRISPRseek) # v. 1.34.0
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply,
    parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval,
    evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges
Loading required package: XVector

Attaching package: ‘Biostrings’

The following object is masked from ‘package:base’:

    strsplit

> library(BSgenome.Hsapiens.UCSC.hg19)
Loading required package: BSgenome
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: rtracklayer
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> library(org.Hs.eg.db)

> 
> offTargetAnalysis(inputFilePath = "test.fasta",
+                   format = "fasta",
+                   header = FALSE,
+                   exportAllgRNAs = "fasta",
+                   findgRNAs = TRUE,
+                   findgRNAsWithREcutOnly = FALSE,
+                   findPairedgRNAOnly = FALSE,
+                   annotatePaired = FALSE,
+                   annotateExon = TRUE,
+                   scoring.method = "CFDscore",
+                   min.score   = 0,
+                   topN = 100,
+                   topN.OfftargetTotalScore = 10,
+                   calculategRNAefficacyForOfftargets = T,
+                   PAM = "NGG",
+                   PAM.pattern = "NNG$|NGN$",
+                   PAM.location = "3prime",
+                   allowed.mismatch.PAM = 1,
+                   PAM.size = 3,
+                   gRNA.size = 20,
+                   baseBeforegRNA = 4,
+                   baseAfterPAM = 3,
+                   rule.set = "Root_RuleSet2_2016",
+                   chromToSearch = c("chr21"),
+                   max.mismatch = 4,
+                   BSgenomeName = BSgenome.Hsapiens.UCSC.hg19,
+                   txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
+                   orgAnn = org.Hs.egSYMBOL,
+                   enable.multicore = TRUE,
+                   n.cores.max = 1,
+                   outputDir = ".",
+                   overwrite = T)
Validating input ...
Searching for gRNAs ...
No gRNAs found in the input sequence KLHL17_chr1_896281_896380>>> Finding all hits in sequence chr21 ...
>>> DONE searching
Building feature vectors for scoring ...
Calculating scores ...
Annotating, filtering and generating reports ...
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 34, 4

And it only outputs the sgRNA found:

>KLHL17_chr1_896331_896430_gR63f
AACGAGAAAGCAGCTAAAGACGG

and OfftargetAnalysis.xls (with no efficacy score)

name    gRNAPlusPAM OffTargetSequence   score   n.mismatch  mismatch.distance2PAM   alignment   NGG forViewInUCSC   strand  chrom   chromStart  chromEnd
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AATGAGCCAGCAGCTAAAAAAGG 0.092076    4   18,14,13,2  ..T...CC..........A.    1   chr21:25307616-25307638 -   chr21   25307616    25307638
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACGAAAAAGCAGCTATACACGG 0.05977 3   15,4,2  .....A..........T.C.    1   chr21:39939357-39939379 +   chr21   39939357    39939379
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGAGAAAGAAACAAAAGAGAG 0.051957    4   18,10,8,6   ..G.......A.A.A.....    0   chr21:39091970-39091992 -   chr21   39091970    39091992
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGCAAGCATCTAGAGAAGG 0.020074    4   18,14,8,4   ..A...C.....T...G...    1   chr21:41140041-41140063 +   chr21   41140041    41140063
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGAAGCTAAGTAGCG 0.018571    4   18,10,3,2   ..A.......A......GT.    0   chr21:18242330-18242352 +   chr21   18242330    18242352
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGAGAAAGCAGCTAACTAAAG 0.016461    3   18,3,2  ..G..............CT.    0   chr21:14410985-14411007 -   chr21   14410985    14411007
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGAGAATGCAGATAATGACAG 0.013611    4   18,12,7,3   ..G.....T....A...T..    0   chr21:18794777-18794799 +   chr21   18794777    18794799
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACAAGAAAGAAGATAAAGGAGA 0.012546    4   17,10,7,1   ...A......A..A.....G    0   chr21:46790270-46790292 -   chr21   46790270    46790292
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGAAGAAAGCAACAAAAGACTG 0.00937 4   18,17,8,6   ..GA........A.A.....    0   chr21:9769132-9769154   +   chr21   9769132 9769154
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG TAAGAGAAAGCAGCAGAAGAAGA 0.006701    4   20,18,6,5   T.A...........AG....    0   chr21:23268009-23268031 -   chr21   23268009    23268031
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG CATGAGAAAACAGCCAAAGAGTG 0.005844    4   20,18,11,6  C.T......A....C.....    0   chr21:43529622-43529644 -   chr21   43529622    43529644
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACAATAAAGCAGATAAAAAATG 0.005844    4   17,15,7,2   ...A.T.......A....A.    0   chr21:15705064-15705086 -   chr21   15705064    15705086
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AATGAGAAAGAAGCAAAAGCGGA 0.004711    4   18,10,6,1   ..T.......A...A....C    0   chr21:31861017-31861039 +   chr21   31861017    31861039
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGCAGTAAGCAGCAAAAGATGA 0.004656    4   18,17,14,6  ..GC..T.......A.....    0   chr21:24682610-24682632 -   chr21   24682610    24682632
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGAGAAAGTAGTAAAAGAGGA 0.004536    4   18,10,7,6   ..G.......T..TA.....    0   chr21:14793644-14793666 -   chr21   14793644    14793666
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACAAGAAAAGAGCTAAAAAAGC 0.003333    4   17,11,10,2  ...A.....AG.......A.    0   chr21:42374829-42374851 -   chr21   42374829    42374851
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGCGAAAGCAGCTAATTACTG 0.003247    4   18,16,3,2   ..G.C............TT.    0   chr21:34287796-34287818 +   chr21   34287796    34287818
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG ATCCAGAAAGGAGCTAAACAGGA 0.002996    4   19,17,10,2  .T.C......G.......C.    0   chr21:24218022-24218044 -   chr21   24218022    24218044
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGAAGAGAAAGAAAG 0.002949    4   18,10,7,6   ..A.......A..AG.....    0   chr21:19738033-19738055 -   chr21   19738033    19738055
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGCAGATGGAGACAG 0.002669    4   18,7,5,4    ..A..........A.GG...    0   chr21:36289576-36289598 +   chr21   36289576    36289598
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACGAGGAAGCAGAGACAGAAGG 0.00218 4   14,7,6,4    ......G......AG.C...    1   chr21:47533895-47533917 -   chr21   47533895    47533917
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACCAAAAAAGAGCTAAAGATGT 0.001992    4   17,15,11,10 ...C.A...AG.........    0   chr21:17712259-17712281 -   chr21   17712259    17712281
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAGGAGAAAATAGCAAAAGATGC 0.001847    4   18,11,10,6  ..G......AT...A.....    0   chr21:24503199-24503221 -   chr21   24503199    24503221
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG GACTAGAAACCAGCTAGAGATGA 0.001783    4   20,17,11,4  G..T.....C......G...    0   chr21:31991329-31991351 -   chr21   31991329    31991351
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGCATCAAAATATGT 0.001619    4   18,8,6,2    ..A.........T.A...T.    0   chr21:28564510-28564532 +   chr21   28564510    28564532
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACGCAAAAGCAGCTATAGCTGA 0.001052    4   16,15,4,1   ....CA..........T..C    0   chr21:17150758-17150780 +   chr21   17150758    17150780
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AACAAGAAACCAGCTAGAGATGT 0.001025    3   17,11,4 ...A.....C......G...    0   chr21:46170832-46170854 -   chr21   46170832    46170854
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGAAGAGAAAGAAGA 0.00079 4   18,10,7,6   ..A.......A..AG.....    0   chr21:32995118-32995140 -   chr21   32995118    32995140
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG GACGAGGAAGCAGCCAGAGAGGC 0.000755    4   20,14,6,4   G.....G.......C.G...    0   chr21:27281134-27281156 +   chr21   27281134    27281156
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGCAGAAAGAGAAGT 0.0005  4   18,7,6,4    ..A..........AA.G...    0   chr21:14756568-14756590 -   chr21   14756568    14756590
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAACAGAAAGCAGCTGGAGAGGC 0.000346    4   18,17,5,4   ..AC...........GG...    0   chr21:40940321-40940343 -   chr21   40940321    40940343
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG ATCGAGAAAGAAGGCAAAGACCG 0   4   19,10,7,6   .T........A..GC.....    0   chr21:14733282-14733304 +   chr21   14733282    14733304
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AAAGAGAAAGCAATTTAAGAGTG 0   4   18,8,7,5    ..A.........AT.T....    0   chr21:26276443-26276465 +   chr21   26276443    26276465
KLHL17_chr1_896331_896430_gR63f AACGAGAAAGCAGCTAAAGANGG AATGAGCAAGCAGCTCAGGACTG 0   4   18,14,5,3   ..T...C........C.G..    0   chr21:44004290-44004312 -   chr21   44004290    44004312

But it does not create a Summary.xls.

I've noticed that OfftargetAnalysis.xls lists 34 off-targets, while the log stops after printing "Calculates on-target scores for sgRNAs with NGG PAM only." 30 times. The error reports "differing number of rows: 34, 4", so that may be a hint.

I'm using Python 2.7.5. The same error occurs with both R 3.6.0 and R 4.1.1.

Second question below


Also, if I choose to not calculate the on-target efficacy for offtarget sequences, but still calculate it for the actual gRNAs (I think this is what the following command does), i.e.

calculategRNAefficacyForOfftargets = F

and even if test.fasta consists of a real sequence with potential gRNAs:

>KLHL17_chr1_896281_896380
CCGGGGGAGGTCGGGACTCAGGTGCGGAGCGGGGTCGGCCCGGAGTAGGTTCCCCACCCGCGCCCCGCGCGCCCAGGACGCGACTCCCGCTGCGGTCCCG

it identifies the gRNAs, and creates both the OfftargetAnalysis.xls and Summary.xls with no error returned, but the on.target info is empty:

$on.target
 [1] name                  gRNAPlusPAM           OffTargetSequence     inExon                inIntron             
 [6] entrez_id             gene                  score                 n.mismatch            mismatch.distance2PAM
[11] alignment             isCanonicalPAM        forViewInUCSC         strand                chrom                
[16] chromStart            chromEnd              flankSequence        
<0 rows> (or 0-length row.names)

and the Summary.xls does not have the efficacy score (or efficiency) column for the found gRNAs, and it even says "perfect match not found" for all of them, while they all come from a real sequence:

names   gRNAsPlusPAM    top5OfftargetTotalScore top10OfftargetTotalScore    top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM topOfftarget10MMdistance2PAM    REname
KLHL17_chr1_896281_896380_gR21f GGGAGGTCGGGACTCAGGTGNGG 0.125473    0.129407    perfect match not found 18,14,9,1   20,17,9 17,14,13,8  18,15,12,9  14,13,10,8  18,12,8,2   19,13,9,2   15,12,11,6  20,13,6,5   13,9,6,2    LpnPI
KLHL17_chr1_896281_896380_gR26f GTCGGGACTCAGGTGCGGAGNGG 0.311637    0.322834    perfect match not found 18,11,10,1  15,14,12,5  15,11,4,2   19,13,12,1  14,13,12,5  19,18,10,4  18,4,3,1    14,5,3,2    18,6,4,2        AciI MspJI
KLHL17_chr1_896281_896380_gR27f TCGGGACTCAGGTGCGGAGCNGG 0.18589 0.18589 perfect match not found 19,17,15,8  20,12,8,3   20,13,5,4   19,5,4,2    20,18,5,2                       BsrBI MspJI
KLHL17_chr1_896281_896380_gR28f CGGGACTCAGGTGCGGAGCGNGG 0.149956    0.149956    perfect match not found 15,14,3,2   20,12,6,2   19,15,9,7   20,9,6,4    20,7,6,2    14,6,5,1                    BsrBI
KLHL17_chr1_896281_896380_gR32f ACTCAGGTGCGGAGCGGGGTNGG 0.313269    0.313269    perfect match not found 20,16,10    19,18,10,3  11,8,6,1    18,8,5,1                            FauI MspJI
KLHL17_chr1_896281_896380_gR37f GGTGCGGAGCGGGGTCGGCCNGG 0.104075    0.124972    perfect match not found 18,15,11,5  16,8,7,5    18,13,6,2   19,17,15,5  15,5    19,16,12,4  15,13,9,6   13,12,6,1   19,16,9,6   19,15,12,6  CviKI-1 HaeIII MspJI MwoI Sau96I
KLHL17_chr1_896281_896380_gR43f GAGCGGGGTCGGCCCGGAGTNGG 0.145384    0.145384    perfect match not found 20,19,12,8  18,17,11,7  16,11,8,3                               Aco12261II BslI
KLHL17_chr1_896281_896380_gR45r GCGGGTGGGGAACCTACTCCNGG 0.019874    0.019874    perfect match not found 16,15,9,4   20,18,5,4   18,9,7,6    16,9,6,3    19,9,7,6    19,16,14,5  18,16,9,5   19,16,14,5  14,8,6,5        Aco12261II BslI
KLHL17_chr1_896281_896380_gR46r CGCGGGTGGGGAACCTACTCNGG 0.005641    0.005641    perfect match not found 19,17,10,6  19,18,11,7                                  
KLHL17_chr1_896281_896380_gR58r GGGCGCGCGGGGCGCGGGTGNGG 1.361083    1.939101    perfect match not found 14,13,11,5  13,11,8 13,11,8 19,15,4 19,15,13,4  19,15,13,4  19,15,4 19,15,13,4  13,11,8 13,11,8 FauI
KLHL17_chr1_896281_896380_gR59r TGGGCGCGCGGGGCGCGGGTNGG 0.476274    0.520152    perfect match not found 13,12,10,4  16,13,10,7  18,14,3 16,12,10,1  20,18,14,3  15,14,7,1   20,12,4,1   9,5,3,1 16,12,7,3   20,12,10,7  AciI FauI
KLHL17_chr1_896281_896380_gR60r CTGGGCGCGCGGGGCGCGGGNGG 1.613916    2.176   perfect match not found 20,15,14,13 19,16,13,8  19,13,4 20,17,15,9  20,19,17,9  19,13,4,2   19,13,8,2   19,18,14,3  17,15,11,4  19,14,13,8  AciI BstUI FauI MspJI
KLHL17_chr1_896281_896380_gR63r GTCCTGGGCGCGCGGGGCGCNGG 0.947784    1.114262    perfect match not found 17,16,14,6  20,16,15,11 16,14,11,7  12,11,8,3   19,16,9,4   20,19,14,12 15,14,9,1   18,16,10,1  19,6,3,1    16,13,6,3   HhaI MwoI
KLHL17_chr1_896281_896380_gR64r CGTCCTGGGCGCGCGGGGCGNGG 0.859401    0.976338    perfect match not found 18,16,15,11 20,15,7,1   20,15,7,1   17,15,9 20,18,5,2   18,15,2,1   18,16,15,6  17,16,15,9  17,16,15,9  17,16,15,9  MwoI
KLHL17_chr1_896281_896380_gR69r AGTCGCGTCCTGGGCGCGCGNGG 0.009918    0.009918    perfect match not found 17,8,6,2    16,13,6,4   13,10,4,3   16,15,8,6                           BssHII BstUI Cac8I MspJI
KLHL17_chr1_896281_896380_gR70r GAGTCGCGTCCTGGGCGCGCNGG 0.02119 0.02119 perfect match not found 17,15,13,2                                      BssHII BstUI Cac8I HhaI MspJI
KLHL17_chr1_896281_896380_gR71f CACCCGCGCCCCGCGCGCCCNGG 0.702269    0.96116 perfect match not found 19,16,12,7  19,18,14,13 19,13,6,5   15,11,8,4   15,8,6,3    19,17,11,6  19,17,4,2   19,15,8,6   19,16,5,1   19,11,6,3   BssHII Cac8I
KLHL17_chr1_896281_896380_gR71r GGAGTCGCGTCCTGGGCGCGNGG 0.054672    0.054672    perfect match not found 16,15,13,12 16,12,4,1   13,11,3,2                               BssHII BstUI Cac8I HhaI MspJI
KLHL17_chr1_896281_896380_gR78r CGCAGCGGGAGTCGCGTCCTNGG 0.021392    0.021392    perfect match not found 15,9,7,6    20,11,8,4   19,17,8,4   19,8,6,4                            HgaI
KLHL17_chr1_896281_896380_gR79r CCGCAGCGGGAGTCGCGTCCNGG 0.038951    0.038951    perfect match not found 14,8,6,5    18,14,8 19,16,14,1  16,13,11,3                          HgaI
KLHL17_chr1_896281_896380_gR7r  ACCTGAGTCCCGACCTCCCCNGG 0.452953    0.462911    perfect match not found 14,12,9,1   17,16,13,11 18,17,16,11 17,16,10,8  19,15,13,9  18,14,11,9  20,17,11,8  20,17,10,8  13,8,4,1    20,9,4,3    
KLHL17_chr1_896281_896380_gR89f CCAGGACGCGACTCCCGCTGNGG 0.335231    0.372446    perfect match not found 20,10,4 13,12,10,4  18,14,13,12 17,14,12,5  20,19,8,2   20,15,13,11 17,13,12,4  20,11,4,3   20,12,4,1   12,8,5,3    AciI BbvI FauI Fnu4HI MspA1I MwoI ApeKI

Thanks a lot for your help,

Miguel

Kai Hu ▴ 70

Hello Miguel,

For your first question, the error occurs because "Root_RuleSet2_2016" only outputs gRNAefficiency scores for OffTargetSequences with an NGG PAM. If there is no NGG PAM, "RuleSet2" does NOT calculate a gRNAefficiency score; instead, it prints the message "Calculates on-target scores for sgRNAs with NGG PAM only.", which the function that generates Summary.xls cannot handle properly.

In your case, one sgRNA was detected with 34 OffTargetSequences, but only 4 of them have an NGG PAM, so only 4 scores were calculated. That is also why you see 30 lines (34 - 4) of "Calculates on-target scores for sgRNAs with NGG PAM only."
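To make the row mismatch concrete, here is a small Python sketch. It is not CRISPRseek code: `has_ngg_pam`, `score_all`, and `dummy_scorer` are hypothetical names used only for illustration, and the six sequences are taken from the OfftargetAnalysis.xls listing in the question. It shows how scoring only the NGG sequences yields fewer scores than rows, and how returning a placeholder for the rest keeps the lengths aligned.

```python
# Illustrative sketch only; none of these functions exist in CRISPRseek.

def has_ngg_pam(seq):
    """True when a 23-nt protospacer+PAM ends in GG (that is, an NGG PAM)."""
    return seq.upper().endswith("GG")

def dummy_scorer(seq):
    """Hypothetical stand-in for an efficacy model; returns a constant."""
    return 0.5

def score_all(seqs, scorer):
    """Return one entry per input, using None where the scorer does not apply."""
    return [scorer(s) if has_ngg_pam(s) else None for s in seqs]

# Six of the 34 off-target sequences reported in OfftargetAnalysis.xls.
offtargets = [
    "AATGAGCCAGCAGCTAAAAAAGG",
    "AACGAAAAAGCAGCTATACACGG",
    "AAGGAGAAAGAAACAAAAGAGAG",
    "AAAGAGCAAGCATCTAGAGAAGG",
    "AAAGAGAAAGAAGCTAAGTAGCG",
    "AACGAGGAAGCAGAGACAGAAGG",
]

# Scoring only NGG sequences drops rows, so lengths no longer match.
dropped = [dummy_scorer(s) for s in offtargets if has_ngg_pam(s)]

# Padding with a placeholder keeps one score per off-target row.
padded = score_all(offtargets, dummy_scorer)

print(len(offtargets), len(dropped), len(padded))
```

With these six sequences, four end in GG, so `dropped` has 4 entries while `padded` has 6. Combining a column of length 6 with one of length 4 is the same kind of mismatch that data.frame() reports as "34, 4" in the full run.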

To fix the issue, you can install the development version of CRISPRseek (1.35.2):

library(devtools)
devtools::install_github("hukai916/CRISPRseek")

I also created a Docker image with the relevant Python environment installed, just in case. The following commands start the container interactively (the /Users mount assumes macOS):

docker pull hukai916/r_dev_4.2.0_xenial:0.3
docker run -v /Users:/Users -it hukai916/r_dev_4.2.0_xenial:0.3

If you want to manually fix it, you can do:

# first, locate the Rule_Set_2 python script with R:
pythonScript <- system.file("extdata/Rule_Set_2_scoring_v1/analysis/rs2_score_calculator.py", package = "CRISPRseek")
pythonScript

# then, open rs2_score_calculator.py in an editor and modify the last two lines to:
    else:
        print('Rule set 2 score: NA')
        print >> sys.stderr, 'Calculates on-target scores for sgRNAs with NGG PAM only.'
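
The idea behind this patch can be sketched as a standalone function. This is a hedged illustration, not the contents of rs2_score_calculator.py: `report_score` and `score_fn` are hypothetical, the "%.4f" format of the score line is an assumption, and the GG check at positions 26-27 assumes a 30-mer laid out as 4 bases + 20-nt gRNA + 3-nt PAM + 3 bases (matching baseBeforegRNA = 4 and baseAfterPAM = 3 in the call above). The sketch uses Python 3 syntax; note that `print >> sys.stderr, ...` in the patch is Python 2 syntax.

```python
import sys

def report_score(seq30, score_fn, out=sys.stdout, err=sys.stderr):
    """Write exactly one score line to `out` for every input sequence.

    Sequences without an NGG PAM get an NA score line on `out`, and the
    explanatory message goes to `err` so it cannot be mistaken for a score.
    """
    # Assumed layout: indices 0-3 flank, 4-23 gRNA, 24-26 PAM, 27-29 flank.
    if seq30[25:27].upper() == "GG":
        print("Rule set 2 score: %.4f" % score_fn(seq30), file=out)
    else:
        print("Rule set 2 score: NA", file=out)
        print("Calculates on-target scores for sgRNAs with NGG PAM only.",
              file=err)
```

Because every call writes one line to standard output, a caller that collects the output gets as many score lines as sequences, with NA marking the ones that could not be scored.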

Hope it helps.


For your second question, it is confusing that this ">KLHL17_chr1_896281_896380" sequence differs from the one in your test.fasta file even though both have the same header line.

For your first "KLHL17_chr1_896281_896380", there is no significant BLAST hit in the human genome, which is consistent with no on-targets being detected.

For your second "KLHL17_chr1_896281_896380", the sequence actually comes from human chr1. I therefore reran the analysis with chromToSearch = c("chr1") and got many on-targets. I suspect that you used chromToSearch = c("chr21"), which is why no on-targets were found.

When you set calculategRNAefficacyForOfftargets = F, no efficacy scores are calculated if no on-targets are found.
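
As a toy illustration of why restricting the search to the wrong chromosome gives "perfect match not found", here is a minimal Python sketch. It is not how CRISPRseek searches: `toy_genome` and `perfect_matches` are hypothetical, the flanking bases are made up, and only exact forward-strand matches are considered. The target is the gR21f protospacer plus PAM as it appears in the second input sequence.

```python
# gR21f protospacer + PAM, read from the second input sequence in the question.
TARGET = "GGGAGGTCGGGACTCAGGTGCGG"

# Toy stand-in for a genome; the flanking bases are invented for illustration.
toy_genome = {
    "chr1": "TTGAC" + TARGET + "AGCTTA",
    "chr21": "AAGGAGAAAGAAACAAAAGAGAGTTTTCCATGACCA",
}

def perfect_matches(target, genome, chrom_to_search):
    """Return (chrom, 0-based start) for each exact forward-strand match."""
    hits = []
    for chrom in chrom_to_search:
        seq = genome[chrom]
        start = seq.find(target)
        while start != -1:
            hits.append((chrom, start))
            start = seq.find(target, start + 1)
    return hits

print(perfect_matches(TARGET, toy_genome, ["chr21"]))  # no on-target here
print(perfect_matches(TARGET, toy_genome, ["chr1"]))   # on-target is found
```

Searching only "chr21" returns an empty list, while searching "chr1" returns the single on-target hit, which mirrors the difference between the two chromToSearch settings.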

Hope it helps.


Hello Kai,

Thanks a lot. Using the dev version (CRISPRseek 1.35.2) fixed the issue: it now creates the Summary.xls, which contains the gRNA on-target efficiency as well as the CFD score.

Regarding the second question, you are right, it was a mistake on my part.

Best, Miguel
