I have read the edgeR User's Guide, and Section 2.6 recommends setting a threshold of at least 5-10 absolute counts in a library to be considered expressed. If I set 10 counts as my threshold for expression, my calculated cpm becomes 0.8 based on the smallest library size of my sample set:
     group lib.size norm.factors
SM01    T0 12559041            1
(10 x 1,000,000) / 12,559,041 = 0.796, or roughly 0.8. Is this correct? Am I being too stringent using 0.8 instead of 1?
Also, in what instances would the recommended 5-10 counts change (+ or -)?
First, your calculation looks fine to me. You could probably just use a threshold of 1 instead; it's not that much of a difference. But if you've taken the time to work it out, you might as well use 0.8.
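For anyone who wants to repeat the arithmetic for a different library size, here is a minimal sketch in plain Python (not edgeR code). The function name `cpm_threshold` is made up for illustration, and it assumes the normalization factor is 1, as in the sample table above.

```python
def cpm_threshold(min_count, lib_size):
    """Convert a minimum raw count into counts-per-million for one library.

    Hypothetical helper for illustration; assumes norm.factors == 1.
    """
    if lib_size <= 0:
        raise ValueError("lib_size must be positive")
    return min_count * 1_000_000 / lib_size


# Smallest library size from the question.
smallest_lib_size = 12_559_041

# The 5-10 count range translated into CPM for that library.
for min_count in (5, 10):
    print(min_count, round(cpm_threshold(min_count, smallest_lib_size), 3))
# prints:
# 5 0.398
# 10 0.796
```

So for this library the 5-10 count range corresponds to roughly 0.4-0.8 CPM, which is why a cutoff of 0.8 and a cutoff of 1 behave much the same.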
The 5-10 recommendation seems to do well in a variety of situations (for routine RNA-seq, at least). The issue is that, at counts lower than 5, we get problems with discreteness and some statistical approximations become inaccurate. On the other hand, we don't want to increase the filter beyond 10, because we might start filtering out interesting genes. So the 5-10 choice represents a compromise between these two considerations.
Lower thresholds are sometimes used when you're explicitly interested in low-abundance genes, e.g., certain ncRNAs or repeat elements that don't get a lot of reads. Higher thresholds are used in other applications like ChIP-seq, where there is a certain level of background enrichment and we need to set the filter above that background. This gets rid of uninteresting genomic regions, even if they have large read counts.