CRISPRSeek, an R package for finding suitable guide RNAs (sgRNAs) for CRISPR targeting, has a lot of output files with partially-undocumented column names. These things may be obvious to a CRISPR expert, but for me, some of the columns are still a mystery.
If anyone else has observations about the ideal output of this program, I would love to have their contributions here!
I went through and collected the ones I could understand, but still had some questions:
1. Sometimes "efficiency" is used as a term, and sometimes "efficacy" is used. I assume one is a typo.
2. sgRNA efficiency: this seems to be the primary output number. It is normalized from 0 to 1, which represents __________ (?). There is a linked paper that describes this, but I did not understand the math.
From the developers: a HIGHER number (closer to 1.0) is better. The higher the gRNA efficiency, the better it can target the intended site (see http://www.ncbi.nlm.nih.gov/pubmed/25184501).
3. Free energy: This seems to always be a NEGATIVE number, on a different scale from the efficiency score above. I am not clear what this scale is, or whether this number is incorporated into sgRNA efficiency.
From the developers: The lower (more negative) the free energy, the more stable the structure is. There are three columns relevant to the free energy output in the summary file. You can type ?foldgRNAs in an R session to get some sense of what they are.
- mfe.sgRNA (free energy of the gRNA plus the backbone; the value should be HIGH, meaning a low "probability to form secondary structure with gRNA")
- mfe.diff (should be HIGH: equal to mfe.sgRNA - mfe.backbone)
- mfe.backbone (free energy of the backbone alone; the value should be LOW, meaning a high "probability to form secondary structure within the backbone itself")
Summary from the developers: You would ideally want mfe.sgRNA to be high (i.e., a low probability of forming secondary structure with the gRNA plus the backbone), mfe.backbone to be low (i.e., a high probability of forming secondary structure within the backbone itself), and the difference mfe.diff (mfe.sgRNA - mfe.backbone) to be high (less negative).
Note: Low here means a more negative number, e.g., -10 is lower than -1.
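To make the arithmetic concrete, here is a minimal sketch in plain Python that recomputes mfe.diff and orders a few guides. Everything in it is an assumption for illustration: the guide names, the efficiency and free-energy values are made up (not real CRISPRSeek output), the dictionary keys are my own stand-ins for the summary-file columns, and ranking by efficiency first with mfe.diff as a tie-breaker is just one possible ordering, not something the developers recommended.

```python
# Hypothetical guides with made-up values (NOT real CRISPRSeek output).
# Keys are illustrative stand-ins for the summary-file columns.
guides = [
    {"name": "g1", "efficiency": 0.72, "mfe_sgRNA": -21.0, "mfe_backbone": -20.0},
    {"name": "g2", "efficiency": 0.35, "mfe_sgRNA": -26.5, "mfe_backbone": -20.0},
    {"name": "g3", "efficiency": 0.72, "mfe_sgRNA": -23.0, "mfe_backbone": -20.0},
]


def mfe_diff(guide):
    """mfe.diff = mfe.sgRNA - mfe.backbone, as described by the developers."""
    return guide["mfe_sgRNA"] - guide["mfe_backbone"]


for g in guides:
    g["mfe_diff"] = mfe_diff(g)

# "Low" means more negative: -10 is lower than -1.
assert -10 < -1

# Assumed ordering: higher efficiency first, then higher (less negative)
# mfe.diff as a tie-breaker.
ranked = sorted(
    guides,
    key=lambda g: (g["efficiency"], g["mfe_diff"]),
    reverse=True,
)

for g in ranked:
    print(g["name"], g["efficiency"], g["mfe_diff"])
```

With these made-up numbers, g1 and g3 tie on efficiency, and g1 comes first because its mfe.diff (-1.0) is higher than g3's (-3.0).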
There is also a relevant review here: http://link.springer.com/article/10.1007%2Fs11515-015-1366-y ("Overview of guide RNA design tools for CRISPR-Cas9 genome editing technology")