Question

How to use the GUIDE-seq R/Python packages to remove non-genomic DNA sequences?

mousheng xu ▴ 10

We used methods very similar to those described in Tsai's 2015 GUIDE-seq Nature Biotechnology paper: break DNA, insert barcode, ligate together, PCR amplification. The major, if not the only, difference is that we used radioactivity instead of CRISPR-Cas9 to generate DNA double-strand breaks.

I tried the R Bioconductor package "GUIDEseq" and the Python GUIDE-seq package, and have questions about both:

* the R GUIDEseq package: successfully installed, but it requires .bed and .bam files to start with, while all we have are raw .fastq files from the HiSeq.

* the Python GUIDE-seq package: successfully installed, but when running cutadapt.py, I got the following error:

------------------------------ START -----------------------------------
Traceback (most recent call last):
  File "cutadapt.py", line 61, in <module>
    from cutadapt import check_importability
  File "/ark/home/mx010/.local/lib/python2.7/site-packages/cutadapt/scripts/cutadapt.py", line 61, in <module>
    from cutadapt import check_importability
ImportError: cannot import name check_importability
------------------------------ END -----------------------------------

What should I do? What are the functional differences between the R GUIDEseq package and the python GUIDE-seq? It seems that they are doing different things.

Thanks!

-- Mo

Tags: annotation, software, software error

score 0 · Answer 1 · 2016-06-02

http://mccb.umassmed.edu/GUIDE-seq/readme.txt has a step-by-step guide to generate the bam file and umi file for the Bioconductor package GUIDEseq. All the preprocessing code is available at http://mccb.umassmed.edu/GUIDE-seq/.

Best,

Julie

score 0 · Answer 2 · 2016-06-17

mousheng xu ▴ 10

Step1. Bin barcode
./binReads.sh fastqFolder barcodes 1 8 16 p7.index p5.index usedBarcodes
where fastqFolder contains the fastq files and barcodes is the barcode index.

Q: What is "barcodes" and what is "usedBarcodes"?

Are they file names? If so, what do they look like inside?

Thanks!

Julie Zhu ★ 4.3k

Mousheng,

barcodes is the barcode index, which can be downloaded at http://mccb.umassmed.edu/GUIDEseq/barcodes.bowtie1.index.tar.gz

usedBarcodes is the name of the barcode file generated by the script automatically (you do not need to do anything about it).

Best,

Julie

score 0 · Answer 3 · 2016-06-21

mousheng xu ▴ 10

Hi Julie,

Thanks for the clarification.

The link for the gzip file http://mccb.umassmed.edu/GUIDEseq/barcodes.bowtie1.index.tar.gz is not valid.

Thanks again,

Mousheng Xu

Julie Zhu ★ 4.3k

Mousheng,

Here is the correct link (there is a - between GUIDE and seq):

http://mccb.umassmed.edu/GUIDE-seq/barcodes.bowtie1.index.tar.gz

Best,

Julie

Julie Zhu ★ 4.3k

Mousheng,

Here is the correct link.

http://mccb.umassmed.edu/GUIDE-seq/barcode.bowtie1.index.tar.gz

Best,

Julie

score 0 · Answer 4 · 2016-06-22

mousheng xu ▴ 10

Hi Julie,

I got the .tar.gz package and unpacked it. It contains multiple files. So, still, how do I run binReads.sh?

./binReads.sh fastqFolder barcodes 1 8 16 p7.index p5.index usedBarcodes

What should be in the place of "barcodes" and what do 1 8 16 mean?

Thank you!

-- Mousheng

score 0 · Answer 5 · 2016-06-22

Mousheng,

If you are following the standard GUIDE-seq protocol, you will not need to change these parameters. Please look at the binReads.sh file to see the parameters and which programs you need.

16 means 16 bp barcodes (8 for the p7 index and 8 for the p5 index)

8 means mapping with 8 threads

1 means allowing 1 mismatch
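To keep the positional arguments straight, the mapping described above can be sketched as a small Python helper that assembles the binReads.sh command line. This is only an illustration: the function name and keyword names are made up here, the argument order is copied from the example command quoted in this thread, and p7.index and p5.index are passed through unchanged because the thread does not describe them further.

```python
import shlex


def build_bin_reads_command(fastq_folder, barcode_index,
                            mismatches=1, threads=8, barcode_length=16,
                            p7_index="p7.index", p5_index="p5.index",
                            used_barcodes="usedBarcodes"):
    """Return the binReads.sh argument list in the order quoted in this thread.

    The keyword names are hypothetical labels for the positional arguments;
    binReads.sh itself only sees the positions.
    """
    return [
        "./binReads.sh",
        fastq_folder,         # folder containing the fastq files
        barcode_index,        # barcode index (bowtie1 index from the tarball)
        str(mismatches),      # 1  = allow 1 mismatch
        str(threads),         # 8  = map with 8 threads
        str(barcode_length),  # 16 = 16 bp barcodes (8 bp p7 + 8 bp p5)
        p7_index,
        p5_index,
        used_barcodes,        # written by the script; no need to prepare it
    ]


command = build_bin_reads_command("fastqFolder", "barcodes")
print(" ".join(shlex.quote(part) for part in command))
```

With the defaults this prints the same command line as the example in the readme; the resulting list could be handed to subprocess.run once the real paths are filled in.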

The input fastq file should look like the one at http://mccb.umassmed.edu/GUIDE-seq/testGetUmi/testGetUmi.fastq

Please note that you need to have bowtie1 and R installed for this step, and bowtie2 installed for mapping to the genome. If you are running a batch job on Platform LSF, you do not need to modify the script. Otherwise, please make sure to change the "module load" command in binReads.sh accordingly.
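Before launching the script, a quick check that the required programs are reachable can save a failed run. The sketch below assumes the tools are on PATH under their usual executable names (bowtie for bowtie1, bowtie2, and R); on a cluster that uses "module load", run it after loading the modules. The helper name is made up for illustration.

```python
import shutil

# Usual executable names; adjust if your installation names them differently.
REQUIRED_TOOLS = {
    "bowtie": "bowtie1, used for barcode binning",
    "R": "used during the binning step",
    "bowtie2": "used for mapping reads to the genome",
}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


missing = find_missing_tools()
if missing:
    for name in missing:
        print("missing: %s (%s)" % (name, REQUIRED_TOOLS[name]))
else:
    print("all required tools found on PATH")
```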

Best regards,

Julie