We are performing an shRNA screen with a pooled library for the first time. Our library contains 7,450 shRNAs targeting 668 genes. We have 34 samples across 6 conditions (2 time points and 4 treatments), with at least 2 replicates per condition. We are interested in the difference between the early and late time points, and also between treated and non-treated samples at the late time point.
After sequencing, we aligned the reads with Bowtie and counted the aligned reads. For the analysis of these data, we would appreciate any feedback on the algorithms that have been used for this purpose (e.g. edgeR, DESeq, MAGeCK, BAGEL). The first two are the most widely used for RNA-seq analysis; however, we are not sure that RNA-seq data and shRNA screen data can be analyzed in the same way.
For a preliminary analysis of the shRNA counts, we performed a manual normalization (as used in Shalem et al. 2014 for a CRISPR screen), which is equivalent to the total count (TC) method: it rescales each sample by its sequencing depth but does not account for differences in library composition among samples.
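For reference, here is a minimal sketch of what we mean by TC normalization (the function name and the counts-per-million scale are our own choices, not from any particular tool): each sample's shRNA counts are divided by that sample's total and rescaled to a common library size.

```python
import numpy as np

def total_count_normalize(counts, scale=1e6):
    """Total-count (TC) normalization: scale each sample so that all
    columns sum to the same value (here, counts per million).

    counts: 2D array, rows = shRNAs, columns = samples.
    """
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)     # total aligned reads per sample
    return counts / lib_sizes * scale  # counts per million

# toy example: 3 shRNAs in 2 samples, sample 2 sequenced 10x deeper
raw = np.array([[10, 100],
                [20, 200],
                [70, 700]])
cpm = total_count_normalize(raw)
# after TC normalization the two samples have identical profiles
```

The limitation discussed above is visible here: TC puts every sample on the same total, but if a few strongly depleted or enriched shRNAs dominate one sample's counts, the remaining shRNAs are artificially shifted.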
We are considering using a more classical normalization method, such as those implemented in DESeq (median-of-ratios) or edgeR (TMM). We have seen TC normalization in other shRNA screen publications, so we are wondering which method is best for this specific purpose, and what the pros and cons of the different approaches are.
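To make the comparison concrete, here is a rough sketch of the median-of-ratios idea used by DESeq to estimate size factors (this is a simplified illustration, not the package's actual implementation; in practice one would call `DESeq2::estimateSizeFactors` in R): each sample's factor is the median ratio of its counts to the per-shRNA geometric mean, which makes the estimate robust to a minority of strongly depleted or enriched shRNAs.

```python
import numpy as np

def median_of_ratios_factors(counts):
    """Simplified median-of-ratios size factors (DESeq-style):
    for each sample, take the median ratio of its counts to the
    geometric mean across samples; rows containing zeros are dropped."""
    counts = np.asarray(counts, dtype=float)
    keep = (counts > 0).all(axis=1)          # shRNAs with no zero counts
    log_counts = np.log(counts[keep])
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# toy example: sample 2 is sequenced exactly twice as deep as sample 1
raw = np.array([[10, 20],
                [30, 60],
                [50, 100]])
sf = median_of_ratios_factors(raw)
# the estimated size factor for sample 2 is twice that of sample 1
```

One practical caveat for a focused 668-gene library: median-of-ratios assumes most shRNAs do not change between samples, which may be less safe in a small targeted library than in a genome-wide one.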
Since we are not performing a genome-wide screen (we target only 668 genes with 7,450 shRNAs), can we still use the models (negative binomial distribution) implemented in these tools?
We have recently seen MAGeCK and BAGEL used for this purpose. Has anyone already used them, and could you recommend them or share some feedback?
Thank you in advance for your advice,