Question: CRISPRseek for Arabidopsis
jeremyatwsu (United States) wrote:

Hi folks,
 
I am a new post-doc at Washington State University. I don't have a whole lot of bioinformatics experience, but I am trying to use CRISPRseek on the Arabidopsis genome.
 
Using a small test set (two genes), I can identify gRNA candidates, but in OfftargetAnalysis.xls neither the exon location nor the gene ID is annotated.
 
What mistake am I making?
 
> library(CRISPRseek)
> library(BSgenome.Athaliana.TAIR.TAIR9)
> library(TxDb.Athaliana.BioMart.plantsmart22)
> library(org.At.tair.db)
> outputDir <- getwd()
> inputFilePath <- system.file('extdata', 'TAIR10_cds_20110103_representative_gene_model_updated.fa', package = 'CRISPRseek')
> REpatternFile <- system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek')
> results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana, chromToSearch = "all", txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tair.db, max.mismatch = 4, outputDir = outputDir, overwrite = TRUE)

The main things I want to see for each off-target hit are whether it is exonic and its Arabidopsis locus ID (i.e. something looking like AT1G11110).
 
 
Thank you for any insight you can give me.

Question written 5.0 years ago by jeremyatwsu; modified 5.0 years ago by Julie Zhu
Answer: CRISPRseek for Arabidopsis
Julie Zhu (United States) wrote:

Jeremy,

It turns out that TxDb.Athaliana.BioMart.plantsmart22 has its chromosomes labeled 1, 2, 3, … instead of Chr1, Chr2, Chr3, …

Please create your own TxDb object as follows. I tested it and it worked.

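# The GFF file can be downloaded from ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3

library(GenomicFeatures)  # provides makeTranscriptDbFromGFF (named makeTxDbFromGFF in later releases)
txdb <- makeTranscriptDbFromGFF("TAIR10_GFF3_genes.gff", format = "gff")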
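
A quick way to see whether two sets of chromosome names line up, as they need to here, is to intersect them. The sketch below is only an illustration in base R: the helper name `shared_seqnames` is hypothetical and the two example vectors are assumed values modelled on the naming described above. With the real objects, the vectors could come from `seqlevels(Athaliana)` and `seqlevels(txdb)`.

```r
# Hypothetical helper: report which sequence names two annotation sources share.
shared_seqnames <- function(genome_names, txdb_names) {
  common <- intersect(genome_names, txdb_names)
  if (length(common) == 0) {
    message("No sequence names in common; annotation overlaps are likely to be empty.")
  }
  common
}

# Assumed example values, modelled on the naming described in this answer.
genome_names <- c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5")
txdb_names   <- c("1", "2", "3", "4", "5")

shared_seqnames(genome_names, txdb_names)   # character(0): nothing overlaps
```

If the result is empty, the two objects describe the same chromosomes under different names, which would be consistent with blank exon and gene columns.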

Please remember to set txdb = txdb in the offTargetAnalysis call, as in the following code. In addition, for the final output you probably want to set max.mismatch to 3 instead of 1. For testing, 1 is good.

setwd("~/test/")
outputDir <- getwd()
inputFilePath <- 'test.fa'

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE,
    findPairedgRNAOnly = FALSE, BSgenomeName = Athaliana,
    chromToSearch = "all", txdb = txdb, orgAnn = org.At.tairSYMBOL,
    max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)

Please let me know how it works out for you. Thanks!

Best regards,

Julie

Answer written 5.0 years ago by Julie Zhu

That is great: it works!  Thank you so much for your help!

 

Reply written 5.0 years ago by jeremyatwsu
Answer: CRISPRseek for Arabidopsis
Julie Zhu (United States) wrote:

Jeremy,

Could you please change orgAnn to org.At.tairSYMBOL, as in the call below? Thanks!

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE,
    REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE,
    BSgenomeName = Athaliana, chromToSearch = "all",
    txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tairSYMBOL,
    max.mismatch = 4, outputDir = outputDir, overwrite = TRUE)

Best regards,

Julie

Answer written 5.0 years ago by Julie Zhu

Shucks.  That didn't work.  Still no entries in the exon/intron columns or in the entrez_id column.

I tried org.At.tairREFSEQ and org.At.tairENTREZID as well.

Any other ideas?  

 

Reply written 5.0 years ago by jeremyatwsu
Answer: CRISPRseek for Arabidopsis
Julie Zhu (United States) wrote:

Could you please try the most recent version, CRISPRseek_1.5.5.tgz, at http://bioconductor.org/packages/devel/bioc/html/CRISPRseek.html? If you still encounter problems, please send me the fasta file and code. Thanks!

Best,

Julie

Answer written 5.0 years ago by Julie Zhu

Still no change.

Here is the fasta:

>AT1G67865
ATGCTTGATAAGCTTATTATTGGGGTTGCCGGAGGGATTACCGGAGGGATTCTTGGAACGGTCGATGGGTTTGCCAAAGGGGTCGGGATATGGCCCAATAATTATCAGAGCACCGGTCGCTTCGAGAACAACAATATGACGAGTCCGGGAAACTACGGCAATGGTAATGGCGGTGCAGTGAAGGCATCCGAGAACTCCGGCGGCCGCCGCCAGAAAGACAGGGAGTAA

>AT1G67860
ATGCTTGATACGCTTATTGGAGGGATTGTCGGAGGGATTGCCGGAGCGATTATTGGAACGGTGGATGGGTTCGCCAGAGGGATCGGAATATGCCCCGATAGTTACCAGAGCTGCACTCGTACCGACTGCGAGGAGCACAAAAAGAAGCTCCCGACCAACCTTAGCCGTAACGGCGGTGCAGCAGCAGTGAAGGCTAAGGAGAACGGCCGCCGTCGCCGCCAGAAAGACAGGGAGTAG

 

Here is what I have for code:

source("http://bioconductor.org/biocLite.R")

biocLite("BSgenome.Athaliana.TAIR.TAIR9")
biocLite("TxDb.Athaliana.BioMart.plantsmart22")
biocLite("org.At.tair.db")

 

 

library(CRISPRseek)
library(BSgenome.Athaliana.TAIR.TAIR9)
library(TxDb.Athaliana.BioMart.plantsmart22)
library(org.At.tair.db)


outputDir <- getwd()
inputFilePath <- system.file('extdata', 'test.fa', package = 'CRISPRseek')
REpatternFile <- system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek')

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE,
    REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE,
    BSgenomeName = Athaliana, chromToSearch = "all",
    txdb = TxDb.Athaliana.BioMart.plantsmart22, orgAnn = org.At.tairSYMBOL,
    max.mismatch = 1, outputDir = outputDir, overwrite = TRUE)

 

Reply written 5.0 years ago by jeremyatwsu
Answer: CRISPRseek for Arabidopsis
Julie Zhu (United States) wrote:

Jeremy,

I noticed that in your code, you set the inputFilePath to the test file included in the package, which is different from the test.fa you intended to use for your analysis.

Instead of setting inputFilePath as follows,

inputFilePath <- system.file('extdata', 'test.fa', package = 'CRISPRseek')

you would first set your working directory to the directory where your test.fa is located, for example ~/testCRISPRseek/.

setwd("the file path where your test.fa is located")

Then set inputFilePath as

inputFilePath <- 'test.fa'
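
Before rerunning, it may help to confirm which file `inputFilePath` actually resolves to and which sequences it contains. The following is a minimal sketch in base R, not part of CRISPRseek: the helper name `fasta_headers` is hypothetical, and the temporary file with shortened sequences only stands in for a real test.fa.

```r
# Hypothetical helper: return the sequence names (header lines) of a FASTA file.
fasta_headers <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  lines <- readLines(path)
  sub("^>", "", lines[startsWith(lines, ">")])
}

# Stand-in for test.fa, written to a temporary file for illustration only;
# the sequences are shortened placeholders.
demo_fasta <- tempfile(fileext = ".fa")
writeLines(c(">AT1G67865", "ATGCTTGATAAGC", ">AT1G67860", "ATGCTTGATACGC"), demo_fasta)

normalizePath(demo_fasta)   # the full path that would be read
fasta_headers(demo_fasta)   # "AT1G67865" "AT1G67860"
```

Calling `fasta_headers(inputFilePath)` on the real path would show right away whether the file being analysed is your own test.fa or one shipped with the package.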

Best,

Julie

Answer written 5.0 years ago by Julie Zhu

Hi Julie,

 

I have test.fa in the same directory as inputseq.fa (the test file included with the package).

The output indicates that gRNAs were discovered in my two test sequences in test.fa (AT1G67865 and AT1G67860).

So it doesn't seem to be a problem with using my test.fa.  The problem is that the gene names of the off-targets are not included in the output, nor is the exon/intron location.

Reply written 5.0 years ago by jeremyatwsu; modified 5.0 years ago

Not sure if these four warnings are useful in diagnosing the problem:

 

1: In dir.create(outputDir) :
  'C:\Users\browse.lab\Documents' already exists
2: In dir.create(outputDir) :
  'C:\Users\browse.lab\Documents' already exists
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

Reply written 5.0 years ago by jeremyatwsu
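
Warnings 3 and 4 say that the two objects being combined share no sequence names. As a rough illustration of how one naming scheme could be mapped onto another, here is a base R sketch; the helper name `add_chr_prefix` is hypothetical and both example vectors are assumed values, not output from the packages in this thread. Whether the renamed values can be assigned back to a real TxDb object should be checked against the GenomeInfoDb documentation.

```r
# Hypothetical helper: prefix bare chromosome names with "Chr" unless already prefixed.
add_chr_prefix <- function(seqnames) {
  ifelse(grepl("^Chr", seqnames), seqnames, paste0("Chr", seqnames))
}

# Assumed example values for illustration only.
txdb_names   <- c("1", "2", "3", "4", "5")                  # bare numeric style
genome_names <- c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5")   # "Chr"-prefixed style

renamed <- add_chr_prefix(txdb_names)
renamed                          # "Chr1" "Chr2" "Chr3" "Chr4" "Chr5"
all(renamed %in% genome_names)   # TRUE
```

Organelle sequences would need their own explicit mapping, which this sketch leaves out.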