Question

DESeq2 analysis returns 47% low count genes

tarek.mohamed ▴ 10

@tarekmohamed-9489


Hi All,

I am analyzing RNA-seq data from 12 samples. I did the alignment and read counting with the Rsubread package. For differential expression I am using DESeq2, but it flags 47% of genes as low count and reports only 71 as significant.

> info <- data.frame(condition = c(rep("TOX", 6), rep("NONTOX", 6)), DOX = c(rep("untreated", 3), rep("treated", 3), rep("untreated", 3), rep("treated", 3)))

> rownames(info) <- colnames(counts)

> info

condition DOX
RARG_1_0uM TOX untreated
RARG_2_0uM TOX untreated
RARG_3_0uM TOX untreated
RARG_1_1uM TOX treated
RARG_2_1uM TOX treated
RARG_3_1uM TOX treated
WT_1_0uM NONTOX untreated
WT_2_0uM NONTOX untreated
WT_3_0uM NONTOX untreated
WT_1_1uM NONTOX treated
WT_2_1uM NONTOX treated
WT_3_1uM NONTOX treated

> dds <- DESeqDataSetFromMatrix(countData = counts, colData = info, design = ~ DOX + condition)

> levels(dds$DOX)
[1] "untreated" "treated"

> levels(dds$condition)
[1] "NONTOX" "TOX"

> dds <- DESeq(dds)

> res <- results(dds)

> summary(res)

out of 40911 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 48, 0.12%
LFC < 0 (down) : 23, 0.056%
outliers [1] : 550, 1.3%
low counts [2] : 19079, 47%
(mean count < 5)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Is such a high percentage of low-count genes normal?

deseq2 rnaseq • 897 views

ADD COMMENT • link updated 8.5 years ago by Michael Love 43k • written 8.5 years ago by tarek.mohamed ▴ 10

score 0 · Answer 1 · 2016-06-14

Michael Love 43k

@mikelove


United States

Note the mean count value that the automatic independent filtering found to be optimal:

low counts [2]   : 19079, 47% 
(mean count < 5)

This makes sense, because genes with average counts less than 5 are typically not powered enough to rise out of the sampling noise. You would need many more samples in order to find differences at such a low count.
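As a rough illustration of the sampling noise at low counts, here is a minimal sketch that assumes pure Poisson counts (an assumption for simplicity; real RNA-seq counts are overdispersed, so the noise is, if anything, larger). Under that assumption the relative noise of a count with mean `mu` is `1 / sqrt(mu)`:

```r
# Relative sampling noise of a Poisson count: sd / mean = 1 / sqrt(mean).
# Pure Poisson is an illustrative assumption; DESeq2 fits a negative binomial.
mean_count <- c(1, 5, 20, 100, 1000)
cv <- 1 / sqrt(mean_count)

# Show the mean counts next to their coefficient of variation.
data.frame(mean_count, cv = round(cv, 3))
```

At a mean count of 5 the sampling noise alone is roughly 45% of the mean, while at a mean count of 100 it drops to 10%.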

So yes, it is normal and expected to discard these low-count, low-power genes before performing the multiple testing correction.
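To check what the filter did in a given analysis, something along these lines may help. It is only a sketch: it uses a simulated data set from `makeExampleDESeqDataSet()` as a stand-in for the real `dds`, and the variable names (`res_nofilt`, `n_filtered`, and so on) are made up for illustration.

```r
library(DESeq2)

# Simulated stand-in for the real data set (assumption for illustration only).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12, betaSD = 1)
dds <- DESeq(dds)

# Default results: independent filtering is on and tuned for alpha.
res <- results(dds, alpha = 0.1)

# Filter quantile and the mean-count cutoff that was chosen.
metadata(res)$filterThreshold

# Same tests with independent filtering switched off, for comparison.
res_nofilt <- results(dds, alpha = 0.1, independentFiltering = FALSE)

# Filtered genes keep their p-value but get an NA adjusted p-value.
n_filtered   <- sum(!is.na(res$pvalue) & is.na(res$padj))
n_sig_filt   <- sum(res$padj < 0.1, na.rm = TRUE)
n_sig_nofilt <- sum(res_nofilt$padj < 0.1, na.rm = TRUE)

c(filtered = n_filtered, sig_with = n_sig_filt, sig_without = n_sig_nofilt)
```

Comparing the two significant-gene counts on your own `dds` shows how much the filter changes the number of discoveries in practice.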

ADD COMMENT • link 8.5 years ago Michael Love 43k

Hey Michael, thanks for the reply. This RNA-seq experiment was done with 30 million reads per sample. Do you think that increasing the depth would decrease the number of low-count genes?

ADD REPLY • link 8.5 years ago tarek.mohamed ▴ 10

Yes, by definition increasing the depth will decrease the number of low count genes.

But that doesn't mean you should necessarily increase the sequencing depth.

You have 40911 * 0.53 genes with sufficient depth...
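For reference, the exact numbers can be worked out from the `summary(res)` output in the question. This small sketch counts only the low-count filter and ignores the 550 outlier genes, which also do not receive an adjusted p-value:

```r
# Figures copied from the summary(res) output in the question.
n_nonzero <- 40911   # genes with nonzero total read count
n_low     <- 19079   # genes removed by independent filtering (mean count < 5)

frac_low <- n_low / n_nonzero   # fraction of genes filtered as low count
n_tested <- n_nonzero - n_low   # genes passing the low-count filter

round(frac_low, 2)
n_tested
```

That leaves about 21,800 genes above the mean-count threshold.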

ADD REPLY • link 8.5 years ago Michael Love 43k