Question: DESeq: filter out low count reads
0
4.8 years ago by
ccheung0
European Union
ccheung0 wrote:

Hi Everybody,

I have 44 samples and >76000 transcripts in my RNA-Seq data, which were counted using featureCounts.

I'd like to reduce the number of transcripts by filtering out the transcripts in which 80% of the samples (35 samples) have >10 reads.

Can anyone suggest how I may do so?

Thanks.

Carol

deseq reads filter • 1.7k views
modified 4.7 years ago • written 4.8 years ago by ccheung0
2
4.8 years ago by
Michael Love25k
United States
Michael Love25k wrote:

Hi Carol,

In DESeq2, filtering of low count genes is handled automatically in the results() step. This uses the genefilter package from Bioconductor to maximize the number of adjusted p-values below a given threshold, say 0.1. We discuss the logic behind this in the Independent Filtering section of the new paper:

http://genomebiology.com/2014/15/12/550

So you can either switch to DESeq2 (recommended) and this is taken care of for you, or you can use the genefilter package with DESeq.
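
To make the independent filtering idea a bit more concrete, here is a simplified base R sketch. It is not the DESeq2 or genefilter implementation, only an illustration of the logic: drop the genes with the lowest mean count at a range of cutoffs, adjust the remaining p-values, and pick the cutoff that gives the most adjusted p-values below alpha. The data are simulated and every name here (base_mean, pval, theta, n_rej) is made up for the example.

```r
# Simplified sketch of independent filtering (NOT the DESeq2/genefilter code).
# All data are simulated; all variable names are hypothetical.
set.seed(1)
n <- 2000
base_mean <- rexp(n, rate = 1 / 50)                  # simulated mean count per gene
is_de <- base_mean > 40 & runif(n) < 0.3             # assumption: only well-expressed genes are DE
pval <- ifelse(is_de, rbeta(n, 0.1, 10), runif(n))   # small p-values for DE genes, uniform otherwise

alpha <- 0.1
theta <- seq(0, 0.9, by = 0.1)                       # fraction of lowest-mean genes to drop
cutoffs <- quantile(base_mean, theta)

# For each cutoff, BH-adjust only the genes that pass and count the rejections
n_rej <- sapply(cutoffs, function(cut) {
  keep <- base_mean >= cut
  sum(p.adjust(pval[keep], method = "BH") < alpha)
})

best <- which.max(n_rej)                             # cutoff with the most rejections
data.frame(theta = theta, cutoff = round(cutoffs, 1), rejections = n_rej)
```

The filter statistic (mean count) is chosen without looking at the p-values themselves, which is what makes this kind of filtering legitimate before multiple-testing adjustment.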

Note: you should use gene counts, not transcript/isoform counts, with DESeq or DESeq2. Search the support site for "DESeq transcript-level" for many discussions on why this is the case.

If you absolutely need to get the rows that satisfy the above criterion, it would be:

# TRUE for rows where more than 80% of samples have a count above 10
use = apply(counts(dds), 1, function(k) mean(k > 10)) > 0.8
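
As a small illustration of how that line behaves, here is a base R sketch on a toy matrix. The matrix cts is a hypothetical stand-in for counts(dds), and the gene and sample names are invented. It also shows one detail worth checking: the comparison is strict, so a row at exactly 80% does not pass.

```r
# Hypothetical count matrix: 4 genes x 10 samples (stand-in for counts(dds))
cts <- rbind(
  geneA = rep(50, 10),          # above 10 in all samples
  geneB = c(rep(50, 9), 0),     # above 10 in 9 of 10 samples (90%)
  geneC = c(rep(50, 8), 0, 0),  # above 10 in 8 of 10 samples (exactly 80%)
  geneD = rep(2, 10)            # never above 10
)
colnames(cts) <- paste0("sample", 1:10)

# Fraction of samples with a count above 10, per row;
# equivalent to apply(cts, 1, function(k) mean(k > 10))
frac_above <- rowMeans(cts > 10)

use <- frac_above > 0.8                  # strict: exactly 80% is excluded
use_at_least <- rowSums(cts > 10) >= 8   # variant: "at least 8 samples"

filtered <- cts[use, , drop = FALSE]     # keep only the rows that pass
```

With 44 samples, 35/44 is about 0.795, so a row with exactly 35 samples above 10 would not pass the > 0.8 test. If the intent is "at least 35 samples", something like rowSums(counts(dds) > 10) >= 35 expresses that directly. The object can then be subset by row with the logical vector, e.g. dds[use, ].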
0
4.7 years ago by
ccheung0
European Union
ccheung0 wrote:

Thank you Michael!