Question

Extreme log2 FC from DESeq2

0

Entering edit mode

wbliu • 0

@wbliu-14583

Last seen 6.4 years ago

Hi there,

I've run DESeq2 on an RNA-seq data sets, with the count data came from HTSeq count. The DESeq() call was all by default. I got some very low log2 fold changes. Two genes are posted below:

Data (normalized counts):

        1.R 2.R 3.R 4.R 5.R 6.R 7.R 8.R 1.NR 2.NR 3.NR 4.NR 
Gene1    0  0  0.0   0  0  0   0   0   0     0   40.6  0.0
Gene2    0  0 256.6  0   0   0   0   0   0    0  122.2  1.6
          5.NR    6.NR    7.NR  8.NR  9.NR
Gene1     130.1   94.3      0    0.0    0.0
Gene2     0.0    475.7      0    0.4    0.8

Result:

      baseMean log2FoldChange     lfcSE      stat       pvalue         padj
        <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
Gene1 15.589483      -22.87922  2.649630 -8.634875 5.879134e-18 9.567115e-14
Gene2   7.355824      -21.86646  2.792237 -7.831161 4.833841e-15 3.933055e-11

My question is, should they be kept as real hits? Since the data are rather sparse and have high within-group variation/dispersion, why were they not filtered out by the cook's distance or independent filtering?

versions: DESeq2_1.18.1, R_3.4.2.

Any explanation will be greatly appreciated!

rnaseq deseq2 • 535 views

ADD COMMENT • link updated 6.4 years ago by Michael Love 41k • written 6.4 years ago by wbliu • 0

score 0 · Answer 1 · 2017-12-07

It's hard to say with just two genes why you get a certain t-statistic. The first one seems clear that you have 3/9 samples with positive counts and outside the single digit range, compared to 9/9 with zero, so it makes sense that it ends up with a small p-value. The second one looks like not promising, but it's hard to say why it gets a small p-value. I have noticed that with data that has -- within a single condition -- many zeroes mixed with large-ish counts, like >200, that using the LRT produces more conservative behavior than the Wald test. The LRT for a two group comparison can be run with test="LRT", reduced~1 when you run DESeq().