Question

CRISPR/RNAi screen time course analysis

knaxerova ▴ 10

@knaxerova-7541

Hi everyone,

I regularly analyze high-throughput genetic screens with edgeR and limma's camera function -- great stuff. Now I am wondering whether anybody has already worked through solutions for analyzing time courses. What do you think is the best way of scoring the behavior of multiple gRNAs/shRNAs per gene at multiple time points? I know that I can do time course analysis in edgeR using splines, but what would be a good way of combining p-/q-values for individual gRNAs/shRNAs into a composite statistic? I like the GSEA-style approach of camera, but the situation for a time course would obviously be more complicated, as much more information than a simple fold change is available. I am wondering whether anybody could point me to pre-existing work or suggest an approach based on edgeR/limma functions?

Thanks so much!

Kamila

edger high-throughput screening • 1.3k views

Posted 7.5 years ago by knaxerova

score 0 · Answer 1 · 2017-06-25

One approach would be to combine the p-values using Simes' method. This yields a combined p-value against the global null hypothesis, i.e., that none of the guides for a given gene have any effect. You can do this using the combineTests function in the csaw package, given the table of statistics in the output of glmLRT and a vector specifying the gene for each guide. To facilitate interpretation, I would combine the p-values from a spline model with the log-fold changes from a linear model. The former provides a more flexible fit, while the latter will give you an idea of whether the changes are generally going up or down over time.
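To make the combining step concrete, here is a minimal plain-Python sketch of the unweighted Simes calculation applied per gene. It is not the csaw implementation, and the function names (`simes`, `combine_by_gene`), gene labels and p-values are all made up for illustration. In practice the per-guide p-values would come from the glmLRT output.

```python
from collections import defaultdict


def simes(pvalues):
    """Simes' combined p-value: min over ranks i of n * p_(i) / i."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


def combine_by_gene(genes, pvalues):
    """Group per-guide p-values by gene, then combine each group with Simes."""
    grouped = defaultdict(list)
    for gene, p in zip(genes, pvalues):
        grouped[gene].append(p)
    return {gene: simes(ps) for gene, ps in grouped.items()}


# Hypothetical input: one entry per guide (illustrative values only).
guide_gene = ["A", "A", "A", "B", "B"]
guide_p = [0.01, 0.04, 0.5, 0.2, 0.9]

combined = combine_by_gene(guide_gene, guide_p)
for gene, p in sorted(combined.items()):
    print(gene, p)
```

For gene A the three candidates are 3 × 0.01 / 1, 3 × 0.04 / 2 and 3 × 0.5 / 3, so the combined p-value is roughly 0.03. Gene B gets roughly 0.4.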

Note that this isn't equivalent to a gene set test. Such tests will usually ask whether a majority of guides are DE for each gene, while Simes' method can give a low combined p-value even if only one guide is DE (as long as the p-value for that guide is low enough). This may be a problem if you're worried about false positives due to off-target effects, in which case you'd like to see many guides yielding similar results - a low combined p-value from one guide would be inappropriate. However, it's not an issue if you're mostly worried about false negatives (e.g., due to guides failing to knock down their targets), in which case detection of a gene from any number of guides would be okay. Of course, having many guides with low p-values will yield a lower combined p-value than if only one guide had a low p-value, so your top-ranked hits should be fine.
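A small numeric illustration of that behaviour, again as a plain-Python sketch of the unweighted Simes formula rather than anything taken from csaw. The `simes` helper and the p-values are hypothetical, chosen only to contrast a single strong guide with five consistent guides.

```python
def simes(pvalues):
    """Simes' combined p-value: min over ranks i of n * p_(i) / i."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical gene with one strongly DE guide out of five.
one_hit = [1e-6, 0.5, 0.6, 0.7, 0.8]

# Hypothetical gene where all five guides are equally strongly DE.
all_hit = [1e-6, 1e-6, 1e-6, 1e-6, 1e-6]

p_one = simes(one_hit)  # driven by the single small p-value: 5 * 1e-6 / 1
p_all = simes(all_hit)  # the rank-5 term wins: 5 * 1e-6 / 5

print("one guide DE: ", p_one)
print("all guides DE:", p_all)
```

The single-guide gene still gets a very small combined p-value (about 5e-6), which is the off-target concern. The gene with five consistent guides comes out about five times lower (about 1e-6), so it ranks higher.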

As an aside, I would have used roast instead of camera to do your gene set tests. This is because camera is a competitive gene set test, i.e., the differential expression for each set of guides is evaluated relative to all other guides. For your application, this can do some rather counterintuitive things - for example, an increase in the total number of DE guides will reduce the significance of each set of guides. This doesn't seem to make much sense; why should the significance of one gene be affected by something that happens to another (unrelated) gene? In contrast, roast is self-contained; each set is evaluated separately, which avoids this problem. Indeed, roast is quite extensively used in the F1000Research paper (see the vignette at http://bioinf.wehi.edu.au/shRNAseq).

score 0 · Answer 2 · 2017-06-26

Hi Aaron,

thank you for this great answer. I will try out Simes' method. You are pointing out a real concern with the false positives: that is something I am worried about, I will have to see what the results look like in the end. If only one guide out of 5 or 10 scores, it's a bit of a red flag. My goal is to obtain a high confidence list of genes that can be used -- as an aggregate -- in downstream analyses, so perhaps the approach needs to be a little different than if I wanted to focus on one or two candidates only. I will give Simes' a whirl and will report back.

Thanks also for recommending roast -- I did not know about this method, but I will definitely try it out. Indeed it is very counterintuitive that overall significance should be dragged down by the fact that a screen worked really well!