Hi there,
I've run DESeq2 on an RNA-seq data sets, with the count data came from HTSeq count. The DESeq() call was all by default. I got some very low log2 fold changes. Two genes are posted below:
Data (normalized counts):
1.R 2.R 3.R 4.R 5.R 6.R 7.R 8.R 1.NR 2.NR 3.NR 4.NR Gene1 0 0 0.0 0 0 0 0 0 0 0 40.6 0.0 Gene2 0 0 256.6 0 0 0 0 0 0 0 122.2 1.6 5.NR 6.NR 7.NR 8.NR 9.NR Gene1 130.1 94.3 0 0.0 0.0 Gene2 0.0 475.7 0 0.4 0.8
Result:
baseMean log2FoldChange lfcSE stat pvalue padj <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> Gene1 15.589483 -22.87922 2.649630 -8.634875 5.879134e-18 9.567115e-14 Gene2 7.355824 -21.86646 2.792237 -7.831161 4.833841e-15 3.933055e-11
My question is, should they be kept as real hits? Since the data are rather sparse and have high within-group variation/dispersion, why were they not filtered out by the cook's distance or independent filtering?
versions: DESeq2_1.18.1, R_3.4.2.
Any explanation will be greatly appreciated!