Choosing a threshold for minimum counts in RNAseq
@laianavarromartin-7750
Last seen 3.8 years ago
Spain

Hi,

I was wondering what criterion is used to set the minimum count a gene needs in order to be considered for further analysis when testing for differential expression (DE) between control and treated samples. Is there any way to change this threshold to be more stringent? When I analyze my data set with the default settings, I get 1400 DE genes. However, if I first delete all genes with an average count < 10 from the count matrix, I get only 300 DE genes. Is there a principled way to select the genes that have a reasonable minimum count?

Thanks!

Laia

rnaseq deseq2 • 4.0k views
@ryan-c-thompson-5618
Last seen 13 months ago
Scripps Research, La Jolla, CA

I generally look at a histogram of average logCPM (log counts per million) values. Typically this is a bimodal distribution, with a low-CPM peak representing non-expressed genes and a high-CPM peak representing expressed genes. I choose a filtering threshold in the trough between the two peaks. You can see an example in this document: https://cdn.rawgit.com/DarwinAwardWinner/resume/master/examples/Salomon/Teaching/RNA-Seq%20Lab.html. (See the section "Filtering non-expressed genes")
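
For readers who want to see the idea in code, here is a minimal sketch of the histogram approach in plain Python. It is not taken from the linked document. The function names (avg_log_cpm, trough_threshold), the prior count of 0.5, the 30 bins and the smoothing window are all assumptions made for illustration, and the logCPM formula is a simplified one rather than edgeR's exact calculation. The demo uses synthetic bimodal logCPM values, not real data.

```python
import math
import random

def avg_log_cpm(counts, prior=0.5):
    """Mean log2 counts-per-million per gene (simplified, assumed formula).

    counts: list of rows, one row per gene, one column per sample.
    """
    lib_sizes = [sum(col) for col in zip(*counts)]
    result = []
    for row in counts:
        logs = [math.log2((c + prior) / (n + 2 * prior) * 1e6)
                for c, n in zip(row, lib_sizes)]
        result.append(sum(logs) / len(logs))
    return result

def trough_threshold(values, n_bins=30, half_window=2):
    """Return the centre of the deepest valley in a histogram of values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [0] * n_bins
    for v in values:
        bins[min(int((v - lo) / width), n_bins - 1)] += 1
    # Moving-average smoothing so single noisy bins do not dominate.
    smooth = []
    for i in range(n_bins):
        window = bins[max(0, i - half_window): i + half_window + 1]
        smooth.append(sum(window) / len(window))
    # Valley depth: how far a bin sits below the lower of the tallest
    # peaks on its left and on its right.
    best_i, best_depth = None, float("-inf")
    for i in range(1, n_bins - 1):
        depth = min(max(smooth[:i]), max(smooth[i + 1:])) - smooth[i]
        if depth > best_depth:
            best_i, best_depth = i, depth
    return lo + (best_i + 0.5) * width

# Synthetic example: a "non-expressed" mode and an "expressed" mode.
rng = random.Random(0)
log_cpm = ([rng.gauss(-2, 1) for _ in range(500)]
           + [rng.gauss(5, 1.5) for _ in range(500)])
threshold = trough_threshold(log_cpm)
keep = [v > threshold for v in log_cpm]  # genes retained for DE testing
```

In practice it is still worth plotting the histogram and checking by eye that the chosen value really falls between the two peaks, since an automatic pick like this can be misled when the distribution is not clearly bimodal.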

@mikelove
Last seen 23 hours ago
United States

Filtering on the mean of normalized counts to obtain an optimal threshold is performed automatically in DESeq2 within the results function.

This is discussed in the vignette. Best to take a look over the vignette, as most user questions have already been addressed there:

vignette("DESeq2")
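
To illustrate the general idea behind this kind of automatic filtering, here is a rough sketch in plain Python. It is not DESeq2's code, and DESeq2's own procedure differs in detail. The sketch tries a series of cutoffs on the mean count, applies a Benjamini-Hochberg adjustment to the genes that pass each cutoff, and keeps the cutoff that gives the most rejections. The function names, the quantile grid and the toy numbers are all assumptions made for illustration.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def best_mean_count_cutoff(mean_counts, pvals, alpha=0.1, n_steps=20):
    """Return (cutoff, n_rejections) for the cutoff with most rejections.

    Cutoffs are taken at quantiles of mean_counts from 0 up to about 0.95.
    Ties are resolved in favour of the lowest (least stringent) cutoff.
    """
    ranked = sorted(mean_counts)
    best = (ranked[0], -1)
    for step in range(n_steps):
        q = step / n_steps * 0.95
        cutoff = ranked[int(q * (len(ranked) - 1))]
        kept = [p for mc, p in zip(mean_counts, pvals) if mc >= cutoff]
        n_rej = sum(a < alpha for a in bh_adjust(kept))
        if n_rej > best[1]:
            best = (cutoff, n_rej)
    return best

# Toy data (made up): ten low-count genes with no signal and ten
# high-count genes with modest p-values.
mean_counts = [float(c) for c in range(1, 11)] + [float(c) for c in range(100, 200, 10)]
pvals = [0.5] * 10 + [0.06] * 10
cutoff, n_rejected = best_mean_count_cutoff(mean_counts, pvals)
```

The point of the toy data is that removing low-count genes, which have little chance of reaching significance, reduces the multiple-testing burden on the remaining genes. A manual filter applied on top of this can therefore change the number of DE genes considerably.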
Björn • 0
@bjorn-12199
Last seen 2.3 years ago
CH

Hi Ryan, your website is really informative. However, it is not clear to me how to choose the CPM threshold.


As I said in my answer, I choose a threshold that lies between the low-logCPM mode and the high-logCPM mode in the logCPM histogram plot. Generally the precise choice of threshold is not important, as long as you choose one in the trough between the two modes.