Choosing a threshold for minimum counts in RNAseq
@laianavarromartin-7750
Spain

Hi,

I was wondering what criterion is used to set the minimum count at which a gene is considered for further analysis when testing for DE between control and treated samples. Is there any way to make this threshold more stringent? When I analyze my data set with the default settings, I get 1400 DE genes. However, if I remove all genes with an average count < 10 from the count matrix, I get only 300 DE genes. Is there a less arbitrary way to select genes that have a reasonable minimum count?

Thanks!

Laia

rnaseq deseq2 • 8.4k views
@ryan-c-thompson-5618
Icahn School of Medicine at Mount Sinai…

I generally look at a histogram of average logCPM values. Typically this is a bimodal distribution, with a low-CPM peak representing non-expressed genes and a high-CPM peak representing expressed genes. I choose an appropriate filtering threshold between the two peaks. You can see an example in the section "Filtering non-expressed genes" of this document: https://cdn.rawgit.com/DarwinAwardWinner/resume/master/examples/Salomon/Teaching/RNA-Seq%20Lab.html
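
As a rough illustration of the histogram approach described above, here is a minimal base-R sketch. The count matrix is simulated, and the logCPM formula (an offset of 0.5 on counts and 1 on library sizes) and the threshold value of 2.5 are assumptions chosen for this toy data, not values taken from the linked document. With real data you would typically compute the averages with something like edgeR's `aveLogCPM()` and pick the threshold from your own histogram.

```r
# Simulated counts (an assumption, for illustration only):
# 2000 genes x 6 samples, roughly 60% expressed and 40% non-expressed.
set.seed(1)
n_genes   <- 2000
n_samples <- 6
expressed <- runif(n_genes) < 0.6
mu <- ifelse(expressed, exp(rnorm(n_genes, mean = 6, sd = 1)), 0.3)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(mu, n_samples), size = 10),
  nrow = n_genes
)

# log2 counts-per-million with a small offset to avoid log(0).
lib_size <- colSums(counts)
log_cpm  <- log2(t(t(counts + 0.5) / (lib_size + 1)) * 1e6)

# Average logCPM per gene, and its histogram.
avg_log_cpm <- rowMeans(log_cpm)
hist(avg_log_cpm, breaks = 50,
     xlab = "Average log2 CPM", main = "Average logCPM per gene")

# Hypothetical threshold, picked by eye in the trough between the two peaks.
threshold <- 2.5
abline(v = threshold, col = "red", lty = 2)

# Keep only genes above the threshold.
keep     <- avg_log_cpm > threshold
filtered <- counts[keep, , drop = FALSE]
```

On the simulated data the red line should fall in the gap between the low-CPM and high-CPM peaks; on real data the gap is usually less clean, so the histogram is worth inspecting before fixing a value.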

@mikelove
United States

DESeq2 performs this filtering automatically within the results() function: it filters on the mean of normalized counts and chooses an optimal threshold.

This is discussed in the vignette. Best to take a look over the vignette, as most user questions have already been addressed there:

vignette("DESeq2")
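
For readers who want to see what that automatic filtering looks like in practice, here is a small sketch. It assumes DESeq2 is installed and uses `makeExampleDESeqDataSet()` to simulate data, so the numbers it prints say nothing about the original poster's data set; the `alpha = 0.05` value is likewise just an example.

```r
library(DESeq2)

# Simulated example data (an assumption); substitute your own DESeqDataSet.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 6, betaSD = 1)
dds <- DESeq(dds)

# Default behaviour: independent filtering on the mean of normalized counts.
# The filter is optimized for the FDR target given by alpha.
res <- results(dds, alpha = 0.05)

# The mean-count threshold that the filter selected.
metadata(res)$filterThreshold

# Genes removed by the filter have their adjusted p-value set to NA.
sum(is.na(res$padj))

# For comparison, the same results with independent filtering switched off.
res_nofilt <- results(dds, alpha = 0.05, independentFiltering = FALSE)
summary(res)
summary(res_nofilt)
```

Comparing the two summaries shows how many genes the filter set aside and how that changes the number of genes called DE at the chosen `alpha`.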
Björn • 0
@bjorn-12199
CH

Hi Ryan, your website is really informative. However, it is not clear how to choose the CPM value.


As I said in my answer, I choose a threshold that lies between the low-logCPM mode and the high-logCPM mode in the logCPM histogram plot. Generally the precise choice of threshold is not important, as long as you choose one in the trough between the two modes.
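
If you would rather not pick the value purely by eye, one possible way to locate the trough is to take the minimum of a kernel density estimate between its two tallest peaks. This is only a sketch of that idea and not the procedure used in the answer above, which chooses the threshold visually. The `avg_log_cpm` values here are simulated as a two-component mixture (an assumption); with real data they would be your per-gene average logCPM values.

```r
# Simulated bimodal average logCPM values (an assumption, for illustration).
set.seed(2)
avg_log_cpm <- c(rnorm(800,  mean = -0.5, sd = 0.5),   # non-expressed mode
                 rnorm(1200, mean = 8,    sd = 1.5))   # expressed mode

# Smooth the distribution with a kernel density estimate.
dens <- density(avg_log_cpm)

# Local maxima: points where the slope changes from positive to negative.
peaks <- which(diff(sign(diff(dens$y))) == -2) + 1

# Keep the two tallest peaks, ordered from left to right.
top2 <- sort(peaks[order(dens$y[peaks], decreasing = TRUE)[1:2]])

# The trough is the density minimum between those two peaks.
between <- top2[1]:top2[2]
trough  <- dens$x[between][which.min(dens$y[between])]
trough

# Visual check: the dashed line should sit in the gap between the modes.
plot(dens, main = "Density of average logCPM")
abline(v = trough, col = "red", lty = 2)
```

Because the precise value matters little as long as it lies in the trough, the result is best treated as a starting point and confirmed against the plot.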
