Hi All,
I am analyzing RNA-seq data from 12 samples. I did the alignment and the read counting with the Rsubread package. For differential expression I am using DESeq2, but it reports 47% of the genes as low count and only 71 as significant.
> info <- data.frame(condition=c(rep("TOX",6),rep("NONTOX",6)),DOX=c(rep("untreated",3),rep("treated",3),rep("untreated",3),rep("treated",3)))
>rownames(info) <- colnames(counts)
>info
           condition       DOX
RARG_1_0uM       TOX untreated
RARG_2_0uM       TOX untreated
RARG_3_0uM       TOX untreated
RARG_1_1uM       TOX   treated
RARG_2_1uM       TOX   treated
RARG_3_1uM       TOX   treated
WT_1_0uM      NONTOX untreated
WT_2_0uM      NONTOX untreated
WT_3_0uM      NONTOX untreated
WT_1_1uM      NONTOX   treated
WT_2_1uM      NONTOX   treated
WT_3_1uM      NONTOX   treated
>dds <- DESeqDataSetFromMatrix(countData=counts, colData=info,design=~DOX+condition)
> levels(dds$DOX)
[1] "untreated" "treated"
> levels(dds$condition)
[1] "NONTOX" "TOX"
>dds <-DESeq(dds)
> summary(res)
out of 40911 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 48, 0.12%
LFC < 0 (down)   : 23, 0.056%
outliers [1]     : 550, 1.3%
low counts [2]   : 19079, 47%
(mean count < 5)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
Is this high % of low counts normal?
Yes. By definition, increasing the sequencing depth will decrease the number of low-count genes.
But that doesn't necessarily mean you should increase the sequencing depth.
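As a rough illustration of the depth point, here is a small base R simulation. Everything in it is made up for the sketch: the log-normal expression levels, the negative binomial dispersion, and the fixed mean-count cutoff of 5 (DESeq2 chooses its filtering threshold adaptively, so a fixed cutoff is a simplification). It only shows the direction of the effect, not what you would see with your data.

```r
# Sketch only: simulated counts, not the poster's data.
set.seed(1)

n_genes   <- 5000
n_samples <- 12

# Hypothetical per-gene expected counts at the current depth
mu <- rlnorm(n_genes, meanlog = 1, sdlog = 2)

# Fraction of genes whose mean count across samples falls below a fixed cutoff
low_fraction <- function(depth_factor, cutoff = 5) {
  counts <- matrix(
    rnbinom(n_genes * n_samples,
            mu   = rep(mu * depth_factor, n_samples),
            size = 10),               # assumed dispersion parameter
    nrow = n_genes
  )
  mean(rowMeans(counts) < cutoff)
}

low_1x <- low_fraction(1)  # current depth
low_2x <- low_fraction(2)  # twice the depth

print(c(depth_1x = low_1x, depth_2x = low_2x))
```

In this toy setup, the genes that move above the cutoff at double depth are the weakly expressed ones.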
You have 40911 * 0.53 genes with sufficient depth...
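The 0.53 can be read off the summary(res) output. A quick check in R, using only the numbers printed there (the variable names are mine, and the last step assumes the "outliers" and "low counts" categories do not overlap):

```r
# Numbers copied from the summary(res) output in the question
total      <- 40911  # genes with nonzero total read count
low_counts <- 19079  # filtered out by independent filtering
outliers   <- 550    # flagged via Cook's distance

frac_low  <- low_counts / total
kept      <- total - low_counts        # genes passing the low-count filter
frac_kept <- kept / total

round(100 * frac_low)   # 47, the "low counts" percentage
kept                    # 21832
round(frac_kept, 2)     # 0.53

# Assuming outliers and low-count genes are disjoint sets,
# this many genes end up with an adjusted p-value:
with_padj <- kept - outliers
with_padj               # 21282
```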