Search
Question: Extreme log2 FC from DESeq2
0
gravatar for wbliu
5 days ago by
wbliu0
wbliu0 wrote:

Hi there,

I've run DESeq2 on an RNA-seq data sets, with the count data came from HTSeq count. The DESeq() call was all by default.  I got some very low log2 fold changes. Two genes  are posted below:

Data (normalized counts):

        1.R 2.R 3.R 4.R 5.R 6.R 7.R 8.R 1.NR 2.NR 3.NR 4.NR 
Gene1    0  0  0.0   0  0  0   0   0   0     0   40.6  0.0
Gene2    0  0 256.6  0   0   0   0   0   0    0  122.2  1.6
          5.NR    6.NR    7.NR  8.NR  9.NR
Gene1     130.1   94.3      0    0.0    0.0
Gene2     0.0    475.7      0    0.4    0.8

 Result:

      baseMean log2FoldChange     lfcSE      stat       pvalue         padj
        <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
Gene1 15.589483      -22.87922  2.649630 -8.634875 5.879134e-18 9.567115e-14
Gene2   7.355824      -21.86646  2.792237 -7.831161 4.833841e-15 3.933055e-11

My question is, should they be kept as real hits? Since the data are rather sparse and have high within-group variation/dispersion, why were they not filtered out by the cook's distance or independent filtering?

versions:  DESeq2_1.18.1, R_3.4.2.

Any explanation will be greatly appreciated!

ADD COMMENTlink modified 5 days ago by Michael Love15k • written 5 days ago by wbliu0
0
gravatar for Michael Love
5 days ago by
Michael Love15k
United States
Michael Love15k wrote:

It's hard to say with just two genes why you get a certain t-statistic. The first one seems clear that you have 3/9 samples with positive counts and outside the single digit range, compared to 9/9 with zero, so it makes sense that it ends up with a small p-value. The second one looks like not promising, but it's hard to say why it gets a small p-value. I have noticed that with data that has -- within a single condition -- many zeroes mixed with large-ish counts, like >200, that using the LRT produces more conservative behavior than the Wald test. The LRT for a two group comparison can be run with test="LRT", reduced~1 when you run DESeq().

ADD COMMENTlink written 5 days ago by Michael Love15k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 418 users visited in the last hour