Question: Help with using CRISPRseek
sharvari gujja (United States) wrote, 3 months ago:

Hi Julie,

I am writing to seek your expertise in using the CRISPRseek tool to evaluate the accuracy and efficiency of a CRISPR experiment.

Particularly, I am looking at "Scenario 5. Target and off-target analysis for gRNAs input by user" in the manual: http://bioconductor.org/packages/release/bioc/manuals/CRISPRseek/man/CRISPRseek.pdf

To briefly describe the experimental design: I have three biological replicates of target-gene KO (using CRISPR/Cas) and control (non-target) samples from a human cancer cell line. Alignment and quantification with STAR give very low counts for the target gene in the control samples and higher counts in the KO samples. I would really appreciate any advice on using CRISPRseek to address this situation. Also, how should I interpret the gRNAefficacy metric: does a low value indicate a low on-target rate?

I really appreciate all the help. Thanks

crispr • 227 views
modified 10 weeks ago by Julie Zhu • written 3 months ago by sharvari gujja
Julie Zhu (United States) wrote, 3 months ago:

Hi Sharvari,

Yes, you can evaluate the efficiency/efficacy and specificity (off-targets) of your gRNA by following Scenario 5 of the user guide. Please set scoring.method = "CFDscore" to obtain the cutting frequency determination (CFD) score for off-targets, as described in the Scenario 10 section. To obtain the gRNA efficacy using Rule Set 2, published in 2016 by the Root lab, please set rule.set = "Root_RuleSet2_2016", as described in the Scenario 7 section. To search for off-targets in the whole genome, please set chromToSearch = "all" and max.mismatch = 3. Please note that the larger the max.mismatch, the longer the search takes. For information on other parameters, please type help(offTargetAnalysis) in an R session.

Your interpretation of the value of gRNA efficacy is correct, i.e., the lower the value, the lower the on-target rate.

Best regards,

Julie

Hi Julie,

Thank you for the reply. I was wondering if the input for "REpatternFile" should be the same as used in the manual? Or, is it specific to individual experiment?

Thanks Sharvari


Julie Zhu (United States) wrote, 11 weeks ago:

Hi Sharvari, It is the same as used in the manual. Best regards, Julie

Hi Julie,

Thanks again. I am getting an error when running the function 'offTargetAnalysis'. Can you please help?

results <- offTargetAnalysis(inputFilePath = gRNAFilePath,
findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,
scoring.method = "CFDscore",
rule.set = "Root_RuleSet2_2016",
findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
BSgenomeName = Hsapiens, chromToSearch = 'all',
txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
orgAnn = org.Hs.egSYMBOL,
max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)


DONE searching
Building feature vectors for scoring ...
Calculating scores ...
Annotating, filtering and generating reports ...
Python 3.7.1
Error in read.table("pythonVersion.txt", sep = "", header = FALSE) :
  no lines available in input
Calls: offTargetAnalysis ... filterOffTarget -> calculategRNAEfficiency2 -> grep -> read.table
Execution halted

Sharvari,

results <- offTargetAnalysis(inputFilePath = gRNAFilePath,
    findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,
    scoring.method = "CFDscore",
    findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = 'all',
    txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)


If you can run the above script successfully, please try to follow Scenario 7 at https://www.bioconductor.org/packages/release/bioc/vignettes/CRISPRseek/inst/doc/CRISPRseek.pdf Specifically, in order to use Rule Set 2, first install Python 2.7, then install the Python packages scikit-learn 0.16.1, pickle, pandas, numpy and scipy. Since Rule Set 2 is implemented in Python 2.7, type the following in an R session to put Python 2.7 first on the PATH.

Sys.setenv(PATH = paste("~/anaconda2/bin", Sys.getenv("PATH"), sep=":"))
system("python --version")
# Python 2.7.15 :: Anaconda, Inc.
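
The version check above can also be done programmatically before starting a long run. The following is a minimal sketch in base R, not part of CRISPRseek: the helper names `python_version_line` and `parse_python_version` are hypothetical, and it assumes the interpreter that `offTargetAnalysis` will pick up is the one found as `python` on the PATH.

```r
# Hypothetical helpers (not CRISPRseek functions) for checking which
# Python interpreter is first on the PATH.

# Run "<python> --version" and return its output as a single string.
# Both stdout and stderr are captured because Python 2 prints its
# version to stderr while recent Python 3 releases print it to stdout.
python_version_line <- function(python = "python") {
  out <- tryCatch(
    suppressWarnings(system2(python, "--version", stdout = TRUE, stderr = TRUE)),
    error = function(e) character(0)
  )
  paste(out, collapse = " ")
}

# Extract "major.minor" from a version string, or NA if none is found.
parse_python_version <- function(version_output) {
  m <- regmatches(version_output, regexpr("[0-9]+\\.[0-9]+", version_output))
  if (length(m) == 0) NA_character_ else m[1]
}

ver <- parse_python_version(python_version_line())
if (!identical(ver, "2.7")) {
  message("Python on PATH reports version ", ver,
          "; Rule Set 2 in this thread was run with Python 2.7.")
}
```

If the message appears, adjusting the PATH with Sys.setenv as shown above and re-running the check is the first thing to try.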

Now, the following script should run fine.

results <- offTargetAnalysis(inputFilePath = gRNAFilePath,
    findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,
    scoring.method = "CFDscore",
    rule.set = "Root_RuleSet2_2016",
    findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = 'all',
    txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)


Best regards,

Julie

Thanks Julie. The script works using Python 2.7. However, changing to max.mismatch = 3 gives me an error:

To retain the current behavior and silence the warning, pass 'sort=True'.

  featNX = pandas.concat([featNX, NX_onehot], axis=1)
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Calculates on-target scores for sgRNAs with NGG PAM only.
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 16, 9

Also, you confirmed that a lower gRNAefficacy value indicates a lower on-target rate. Is the range 0-1? Thanks, Sharvari
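
For readers unfamiliar with the last line of that log: it is a generic base R error raised whenever data.frame() is given columns whose lengths differ and cannot be recycled to match. The sketch below only reproduces the message with made-up vectors of length 16 and 9; it says nothing about which CRISPRseek columns were involved.

```r
# Reproduce the generic base R error with two made-up columns of
# unequal length (16 and 9). These vectors are illustrative only and
# are not taken from CRISPRseek.
msg <- tryCatch(
  {
    data.frame(a = seq_len(16), b = seq_len(9), check.names = FALSE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```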

Sharvari,

I am glad that you got it to work using python 2.7.

Could you please try running the following script (rule set 1) with max.mismatch = 3 to see if the error goes away?

results <- offTargetAnalysis(inputFilePath = gRNAFilePath,
    findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,
    scoring.method = "CFDscore",
    findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = 'all',
    txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    max.mismatch = 3, outputDir = outputDir, overwrite = TRUE)


Yes, the value of gRNAefficacy is between 0 (lowest efficacy) and 1 (highest efficacy).

Best regards,

Julie

Julie Zhu (United States) wrote, 11 weeks ago:

Hi Sharvari, Just to let you know that you can set max.mismatch = 0 to speed up the analysis if you are only interested in obtaining the gRNA efficacy. The higher you set max.mismatch, the longer the analysis takes. Best regards, Julie

Julie Zhu (United States) wrote, 11 weeks ago:

Hi Sharvari,

It is correct to interpret a gRNAefficacy value of 0.5 as the KO experiment being 50% effective. In your situation, I cannot see why there is a higher read count for KO compared to WT. I suggest you work with a local bioinformatician/statistician to see if there is a better way to compare the gene expression levels.

Best, Julie

library(CRISPRseek)
library("BSgenome.Hsapiens.UCSC.hg38")
library("org.Hs.eg.db")
library("TxDb.Hsapiens.UCSC.hg38.knownGene")

results <- offTargetAnalysis(inputFilePath = "test.fa",
    findgRNAsWithREcutOnly = FALSE,
    scoring.method = "CFDscore",
    rule.set = "Root_RuleSet2_2016",
    findPairedgRNAOnly = FALSE, findgRNAs = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = 'chr11',
    txdb = TxDb.Hsapiens.UCSC.hg38.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    max.mismatch = 0, outputDir = getwd(), overwrite = TRUE)


Thanks Julie. I'll download the latest version next week. Btw, is it possible to run the script using rule set 2 and max.mismatch > 3?

Hi Sharvari, Please download version 1.25.5 of the CRISPRseek package to run your script using rule set 2 and max.mismatch > 3. Please let me know if it does not work for you. Best, Julie

Julie Zhu (United States) wrote, 11 weeks ago:

Sharvari,

You can either wait a couple of days for version 1.25.5 to propagate to http://bioconductor.org/packages/devel/bioc/html/CRISPRseek.html, or get the source code using git clone:

git clone https://git.bioconductor.org/packages/CRISPRseek

Best, Julie

Hi Julie, Using the latest version, 1.25.5, and running the script with rule set 2 and max.mismatch = 3 results in a non-zero gRNAefficacy value for the first hit (n.mismatch = 0); however, for all the other hits the extendedSequence and gRNAefficacy values are NA. Can you please let me know how to generate non-NA values for the off-target analysis? Thanks.

Sharvari, This is intended since gRNA efficacy is only applicable to the on-target. Best regards, Julie
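
To make the on-target/off-target distinction concrete, here is a small base R sketch on a toy table. The column names n.mismatch, gRNAefficacy and extendedSequence are the ones mentioned in this thread; the `name` column, the data values and the data frame itself are made up for illustration and are not real CRISPRseek output. Following the thread, the zero-mismatch hit is treated as the on-target.

```r
# Toy stand-in for an off-target table; all values are invented.
hits <- data.frame(
  name             = c("gRNA1", "gRNA1", "gRNA1"),  # hypothetical column
  n.mismatch       = c(0, 2, 3),
  gRNAefficacy     = c(0.62, NA, NA),               # only the on-target row has a value
  extendedSequence = c(strrep("N", 30), NA, NA),    # placeholder sequence
  stringsAsFactors = FALSE
)

# The on-target hit is the row with zero mismatches; it carries the efficacy.
on_target  <- hits[hits$n.mismatch == 0, ]
# Rows with mismatches are off-targets, where efficacy is NA by design.
off_target <- hits[hits$n.mismatch > 0, ]

print(on_target[, c("name", "gRNAefficacy")])
print(nrow(off_target))
```

Off-target rows are better compared using the off-target score selected with scoring.method (CFD in this thread) than by gRNAefficacy.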

Thanks Julie. For a couple of sgRNA sequences, the 'gene' and 'inExon' columns are empty. I can PM you the gRNA sequence being tested.

Julie Zhu (United States) wrote, 10 weeks ago:

Sharvari, I have added a new parameter, ignore.strand, to the offTargetAnalysis function (defaults to TRUE). To obtain strand-specific gene annotation, set ignore.strand = FALSE. You can download version 1.25.7 using the following command:

git clone https://git.bioconductor.org/packages/CRISPRseek

Thanks! Best, Julie