Question

Using edgeR to analyze CRISPR/Cas9 screening data

Assa Yeroslaviz ★ 1.5k

@assa-yeroslaviz-1597

Last seen 4 months ago

Germany

I am having trouble understanding the workflow described in the paper. I have a data set of four fastq files from a CRISPR/Cas9 screening experiment, as well as a FASTA file of the sgRNAs used in the analysis (see the examples below). The experiment uses a single-indexing strategy with two control samples and two treated samples.

I am not sure how to use the workflow described in the paper and the R vignette to analyze my data. I have read both, but I still don't understand how to adapt my data to the example given there.

I know I need to trim the fastq files so that they contain only the sgRNA part of the reads.

What I don't understand are the two text files for the `processAmplicons()` function. Where do I get `Samples4.txt` and `sgRNAs4.txt`? What is the third column in the second file? Is it already the counts?

I would appreciate any suggestions on how to proceed.

thanks

Assa

The fastq files with the complete read (sgRNA is highlighted)

ctrl1
@M01100:33:000000000-A9U6C:1:1101:12325:1758 1:N:0:1
CTTCTTTCTTGTGGAAAGGACGAAACACCGGTGGGCTGCAAATCCAAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTT
+
1>A>11BDFF1DB1B111A1B000AACCE?0A///BE//011BBGA110AFFHHFFBBGHFF1BFFGFFFBBGGD2@FFGB11FGFAGGG?GH/?222BB@
@M01100:33:000000000-A9U6C:1:1101:17009:1766 1:N:0:1
GCCCCTTCTTGTGGAAAGGACGAAACACCGAGGGATGTTATCTCCTCCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCACCTT
+
1111>>FFFF1F11111111B000AAEFE?00///B/BD2FEGGBABE/AEEGG2F11EG1B111D1@F110FF@221B11110B10FE@/FG/?B220BB
@M01100:33:000000000-A9U6C:1:1101:13526:1767 1:N:0:1
CCTTAGTCTTGTGGAAAGGACGAAACACCGCGCGCGCGGCGCCCACAGTTTAGAGCTAGAAATAGCAAGTTAAAATAGGCTAGTCCGTTATCAACTTGAA
+
11AA11BDFF1F11ADB111BA00EEFHE?00/AA/E/A>/>>@/?/>FGH211BEFF111BBEG11>FGFB22>BBGFFFFFGDE?GG?F>22<BB111
---------------
ctrl2
@M01100:32:000000000-AAD7V:1:1101:13689:1787 1:N:0:1
CCAGGTTCTTGTGGAAAGGACGAAACACCGTCCCGAAGGCTCCTCACCGGTTTTAGAGTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTG
+
>>111BDFFFFFB1111111B00AEABFFAGCEE/////ABECGF1AF/?EEGGBF1EF2F2EGGBFHF@FHHFHHGGHHHHGHFHHHHGHHGHHHDHHHH
@M01100:32:000000000-AAD7V:1:1101:14753:1826 1:N:0:1
ATGATATCTTGTGGAAAGGACGAAACACCGGCGTCGAGGAAGCGTAACTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTT
+
1>11>3DFFFFFFCBDBCABFA00AAFFEE0A/EA?/EECFFEEGCCEGHHHHHGHBGGHFFFFGGGHHHHHHHHHHHHHHHHHHHHHHHGHHGHHHHHHH
@M01100:32:000000000-AAD7V:1:1101:14960:1844 1:N:0:1
GTCGCCTCTTGTGGAAAGGACGAAACACCGGAGAGCATGGCAGTACACGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTT
+
AAAA3AAAFFFFGCFFFFF4BA2A2ACGFE222222FGHHHHGHFHEAAEFGGGGHFHFGFHHHHGGFHHHHHHHHHHGHHHHHGHHHHHGHHGHHHHHHH
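For the trimming step, one possible approach is to locate the constant vector sequence just upstream of the guide and keep the 20 bases that follow. The Python sketch below is only an illustration under assumptions: the flank `GGAAAGGACGAAACACC` and the 20-nt guide length were inferred by eye from the six example reads and the library FASTA, and the names `extract_guide` and `count_guides` are hypothetical helpers, not part of edgeR.

```python
from collections import Counter

# Assumption: constant vector sequence directly upstream of the guide,
# inferred by eye from the example reads in this post.
UPSTREAM_FLANK = "GGAAAGGACGAAACACC"
# Assumption: guides are 20 nt long, as in the library FASTA excerpt.
GUIDE_LENGTH = 20


def extract_guide(read, flank=UPSTREAM_FLANK, guide_length=GUIDE_LENGTH):
    """Return the bases that follow the first flank match, or None.

    None is returned when the flank is absent or the read is too short
    to contain a full-length guide after it.
    """
    start = read.find(flank)
    if start == -1:
        return None
    guide_start = start + len(flank)
    guide = read[guide_start:guide_start + guide_length]
    return guide if len(guide) == guide_length else None


def count_guides(reads):
    """Count extracted guides over an iterable of read sequences."""
    guides = (extract_guide(read) for read in reads)
    return Counter(guide for guide in guides if guide is not None)


# First ctrl1 read from the post, used as a small demonstration.
first_ctrl1_read = (
    "CTTCTTTCTTGTGGAAAGGACGAAACACCGGTGGGCTGCAAATCCAAGAGTTTTAGAGCTAGAAATAG"
    "CAAGTTAAAATAAGGCTAGTCCGTTATCAACTT"
)
print(extract_guide(first_ctrl1_read))
```

In the six reads shown here the flank always ends at the same place, so the guide occupies bases 30 to 49 of each read (1-based). Searching for the flank rather than cutting at fixed positions is a design choice that would tolerate staggered reads, at the cost of discarding reads with errors in the flank.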

The FASTA file with the sgRNAs used in the library

>ENSG00000139083_GCCTGCTCAGTGTAGCATTA
gcctgctcagtgtagcatta
>ENSG00000139083_GGGAACATGAAGTGGCGTCG
gggaacatgaagtggcgtcg
>ENSG00000139083_GTGAGTGTTCGTGACCCGAG
gtgagtgttcgtgacccgag
>ENSG00000139083_GAGGAAGCGTAACTCGGCAC
gaggaagcgtaactcggcac
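A FASTA file like this can be turned into a tab-delimited table of guide IDs and sequences, which is the kind of file `sgRNAs4.txt` appears to be. The Python sketch below is a hedged illustration: the column names `ID` and `Sequences` are what I understand `processAmplicons()` to expect (please check `?processAmplicons` for your edgeR version), the third `Gene` column is assumed to be a plain annotation taken from the part of the FASTA header before the underscore, and `parse_fasta` and `fasta_to_guide_table` are hypothetical helper names. If that reading is right, the third column holds annotation, not counts; the counts would only be produced when the reads are processed.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fasta_to_guide_table(text):
    """Build a tab-delimited table with columns ID, Sequences, Gene.

    Assumption: the gene identifier is the header text before the
    first underscore, as in the library excerpt from this post.
    """
    lines = ["ID\tSequences\tGene"]
    for name, seq in parse_fasta(text):
        gene = name.split("_", 1)[0]
        # Upper-case the sequence so it compares equal to the reads.
        lines.append(f"{name}\t{seq.upper()}\t{gene}")
    return "\n".join(lines) + "\n"


# Two records from the library excerpt in this post.
example_fasta = """>ENSG00000139083_GCCTGCTCAGTGTAGCATTA
gcctgctcagtgtagcatta
>ENSG00000139083_GGGAACATGAAGTGGCGTCG
gggaacatgaagtggcgtcg
"""
table = fasta_to_guide_table(example_fasta)
print(table, end="")
```

The resulting text could be written to a file and passed as the guide (hairpin) file. The sample file would need a similar table describing the samples, which depends on how the indexing was done and is not covered here.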

edger processamplicons sgrna crispr • 2.2k views

ADD COMMENT • link updated 2.9 years ago by hyejo • 0 • written 7.8 years ago by Assa Yeroslaviz ★ 1.5k

score 0 · Answer 1 · 2021-03-23

mridulchaudhary93 • 0

@f6510cd0

Last seen 4.8 years ago

I had the same confusion when reading the edgeR user guide. I found the following links helpful for working out what type of input edgeR expects.

http://bioinf.wehi.edu.au/shRNAseq/

http://bioinf.wehi.edu.au/shRNAseq/pooledScreenAnalysis.pdf

ADD COMMENT • link 4.8 years ago mridulchaudhary93 • 0

Were you able to download the demo data from the site? The link for the download doesn't work.

ADD REPLY • link 2.9 years ago hyejo • 0