CRISPRseek for Arabidopsis
4
0
Entering edit mode
@jeremyatwsu-7076
Last seen 9.3 years ago
United States

Hi folks,
 
I am a new post-doc at Washington State University.  I don’t have a whole lot of bioinformatics experience; nevertheless, I am trying to use CRISPRseek on the Arabidopsis genome.
 
Using a small test set (two genes), I can identify gRNA candidates, but in OfftargetAnalysis.xls neither the exon location nor the gene ID is being annotated.
 
What mistake am I making?
 
> library(CRISPRseek)
> library(BSgenome.Athaliana.TAIR.TAIR9)
> library(TxDb.Athaliana.BioMart.plantsmart22)
> library(org.At.tair.db)
> outputDir <- getwd()
> inputFilePath <- system.file('extdata', 'TAIR10_cds_20110103_representative_gene_model_updated.fa', package = 'CRISPRseek')
> REpatternFile <- system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek')
> results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana, chromToSearch = "all", txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tair.db, max.mismatch = 4, outputDir = outputDir, overwrite = TRUE)
The main thing I want to see is whether the off-target is exonic and the Arabidopsis locus ID (i.e. something looking like AT1G11110) for each off-target hit.
 
 
Thank you for any insight you can give me.

Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 4 months ago
United States

Jeremy,

It turns out that TxDb.Athaliana.BioMart.plantsmart22 has its chromosomes labeled as 1, 2, 3, … instead of Chr1, Chr2, Chr3, …
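As a rough illustration of why that label difference matters, here is a base R sketch. The two label vectors are typed in by hand as assumptions (they are not read from the packages); the point is only that matching by name finds nothing until the labels follow the same convention.

```r
# Illustrative chromosome labels (assumed, not read from the packages).
genome_chroms <- c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5")
txdb_chroms   <- c("1", "2", "3", "4", "5")

# Matching by name: no label is shared, so no annotation can be attached.
shared_before <- intersect(genome_chroms, txdb_chroms)

# Once both sides use the same convention, every chromosome lines up.
txdb_renamed <- paste0("Chr", txdb_chroms)
shared_after <- intersect(genome_chroms, txdb_renamed)

length(shared_before)  # 0
length(shared_after)   # 5
```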

Please create your own TxDb object as follows. I tested it and it worked.

# the gff file can be downloaded at ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3 

txdb <- makeTranscriptDbFromGFF("TAIR10_GFF3_genes.gff",format="gff")
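
A note for anyone reading this with a newer Bioconductor: as far as I know, makeTranscriptDbFromGFF was later renamed makeTxDbFromGFF, and recent releases provide it through the txdbmaker package. The sketch below is hedged accordingly: it assumes the GFF file sits in the working directory under the name used above, picks whichever package is installed, and does nothing if neither is available.

```r
gff_file <- "TAIR10_GFF3_genes.gff"  # assumed to be in the working directory

# Pick whichever installed package provides makeTxDbFromGFF.
maker_pkg <- if (requireNamespace("txdbmaker", quietly = TRUE)) {
  "txdbmaker"
} else if (requireNamespace("GenomicFeatures", quietly = TRUE)) {
  "GenomicFeatures"
} else {
  NA_character_
}

if (!is.na(maker_pkg) && file.exists(gff_file)) {
  make_txdb <- getExportedValue(maker_pkg, "makeTxDbFromGFF")
  txdb <- make_txdb(gff_file, format = "gff3")
  # The labels printed here should use the same naming as the BSgenome.
  print(GenomeInfoDb::seqlevels(txdb))
} else {
  message("Skipping: need the GFF file plus txdbmaker or GenomicFeatures.")
}
```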

Please remember to set txdb = txdb in the offTargetAnalysis call, as in the following code. In addition, for the final output you probably want to set max.mismatch to 3 instead of 1; for testing, 1 is fine.

setwd("~/test/")
outputDir <- getwd()
inputFilePath <- 'test.fa'

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE,  findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana, chromToSearch = "all", txdb = txdb, orgAnn = org.At.tairSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)
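
To give a feel for what the max.mismatch setting controls, here is a simplified base R sketch that counts position-by-position mismatches between two 20-mers. The helper count_mismatches is hypothetical and is not how CRISPRseek searches or scores off-targets; the two sequences are simply the first 20 bases of the two test genes posted elsewhere in this thread, not real gRNAs with a PAM.

```r
# Hypothetical helper: count positions at which two equal-length sequences differ.
count_mismatches <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

guide     <- "ATGCTTGATAAGCTTATTAT"  # first 20 bases of AT1G67865
candidate <- "ATGCTTGATACGCTTATTGG"  # first 20 bases of AT1G67860

n_mm <- count_mismatches(guide, candidate)
n_mm       # 3
n_mm <= 1  # FALSE: would be ignored with max.mismatch = 1
n_mm <= 3  # TRUE: would be considered with max.mismatch = 3
```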

Please let me know how it works out for you. Thanks!

Best regards,

Julie


That is great: it works!  Thank you so much for your help!

 

Julie Zhu ★ 4.3k

Jeremy,

Could you please change orgAnn to org.At.tairSYMBOL, as in the call below? Thanks!

 results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana, chromToSearch = "all", txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tairSYMBOL, max.mismatch = 4, outputDir = outputDir, overwrite = TRUE)

Best regards,

Julie


Shucks.  That didn't work.  Still no entries in exon/intron columns nor in entrez_id column.

I tried org.At.tairREFSEQ and org.At.tairENTREZID as well.

Any other ideas?  

 

Julie Zhu ★ 4.3k

Could you please try the most recent version, CRISPRseek_1.5.5.tgz, at http://bioconductor.org/packages/devel/bioc/html/CRISPRseek.html? If you still encounter problems, please send me the fasta file and your code. Thanks!

Best,

Julie


Still no change.

Here is the fasta:

>AT1G67865
ATGCTTGATAAGCTTATTATTGGGGTTGCCGGAGGGATTACCGGAGGGATTCTTGGAACGGTCGATGGGTTTGCCAAAGGGGTCGGGATATGGCCCAATAATTATCAGAGCACCGGTCGCTTCGAGAACAACAATATGACGAGTCCGGGAAACTACGGCAATGGTAATGGCGGTGCAGTGAAGGCATCCGAGAACTCCGGCGGCCGCCGCCAGAAAGACAGGGAGTAA

>AT1G67860
ATGCTTGATACGCTTATTGGAGGGATTGTCGGAGGGATTGCCGGAGCGATTATTGGAACGGTGGATGGGTTCGCCAGAGGGATCGGAATATGCCCCGATAGTTACCAGAGCTGCACTCGTACCGACTGCGAGGAGCACAAAAAGAAGCTCCCGACCAACCTTAGCCGTAACGGCGGTGCAGCAGCAGTGAAGGCTAAGGAGAACGGCCGCCGTCGCCGCCAGAAAGACAGGGAGTAG

 

Here is what I have for code:

source("http://bioconductor.org/biocLite.R")

biocLite("BSgenome.Athaliana.TAIR.TAIR9")
biocLite("TxDb.Athaliana.BioMart.plantsmart22")
biocLite("org.At.tair.db")

 

 

library(CRISPRseek)
library(BSgenome.Athaliana.TAIR.TAIR9)
library(TxDb.Athaliana.BioMart.plantsmart22)
library(org.At.tair.db)


outputDir <- getwd()
inputFilePath <- system.file('extdata', 'test.fa', package = 'CRISPRseek')
REpatternFile <- system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek')

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana, chromToSearch = "all", txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tairSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)

 

Julie Zhu ★ 4.3k

Jeremy,

I noticed that in your code, you set the inputFilePath to the test file included in the package, which is different from the test.fa you intended to use for your analysis.

Instead of setting inputFilePath as follows,

inputFilePath <- system.file('extdata', 'test.fa', package = 'CRISPRseek')

you would first set your working directory to the directory where your test.fa is located, for example ~/testCRISPRseek/.

setwd("the file path where your test.fa is located")

Then set inputFilePath as

inputFilePath <- 'test.fa'
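
The difference between the two ways of building the path can be seen with base R alone. This is only a sketch: the stats package stands in for CRISPRseek because it ships with every R installation, and test.fa is assumed to be your own file.

```r
# system.file() looks inside an installed package and returns "" (no error)
# when the requested file is not there.
inside_pkg <- system.file("DESCRIPTION", package = "stats")
not_found  <- system.file("extdata", "test.fa", package = "stats")

nzchar(inside_pkg)  # TRUE: the file exists inside the package
nzchar(not_found)   # FALSE: empty string

# A file of your own is addressed relative to the working directory instead.
inputFilePath <- "test.fa"
full_path <- file.path(getwd(), inputFilePath)
file.exists(full_path)  # TRUE only if test.fa is in the working directory
```

Checking nzchar() or file.exists() on the path before calling offTargetAnalysis is a cheap way to confirm which file will actually be read.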

Best,

Julie


Hi Julie,

 

I have test.fa in the same directory as inputseq.fa (the test file included with the package).

The output indicates that gRNAs were discovered in my two test sequences in test.fa (AT1G67865 and AT1G67860).

So it doesn't seem to be a problem with using my test.fa.  The problem is that the gene names of the off-targets are not included in the output, nor is the exon/intron location.


Not sure if these four warnings are useful to help diagnose the problem:

 

1: In dir.create(outputDir) :
  'C:\Users\browse.lab\Documents' already exists
2: In dir.create(outputDir) :
  'C:\Users\browse.lab\Documents' already exists
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
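
Warnings 3 and 4 say that two objects being combined share no sequence (chromosome) names. One way to look into that, sketched here with the packages used in this thread, is to print the seqlevels of the genome and of the TxDb side by side. The code assumes those packages are installed and simply prints a message if they are not.

```r
# Packages used elsewhere in this thread; the check is skipped if any is missing.
pkgs <- c("GenomeInfoDb", "BSgenome.Athaliana.TAIR.TAIR9",
          "TxDb.Athaliana.BioMart.plantsmart22")
have_all <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_all) {
  genome_levels <- GenomeInfoDb::seqlevels(
    BSgenome.Athaliana.TAIR.TAIR9::Athaliana)
  txdb_levels <- GenomeInfoDb::seqlevels(
    TxDb.Athaliana.BioMart.plantsmart22::TxDb.Athaliana.BioMart.plantsmart22)
  print(genome_levels)
  print(txdb_levels)
  # An empty result here matches the "no sequence levels in common" warning.
  print(intersect(genome_levels, txdb_levels))
} else {
  message("Install the annotation packages to run this check.")
}
```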
