Question: non-significant p-values from DESeq2 for some genes, visually the data looks significant
RohitGarg wrote:

I am getting non-significant p-values for several genes that "visually" look significant. Here is one such gene.

NAME: ENSMUSG00000026582
DESCRIPTION: Sele

Samples              Values (one per replicate, in order)
BrainVEC1-3          21.283583  29.655133  270.802420
BrainNVEC1-3         4.066718  4.215113  37.817222
ChPVEC1-6            472.788450  149.998929  113.672655  247.821995  186.434232  495.219900
ChPNVEC1-6           139.739530  6.182646  12.128706  8.289135  43.592426  46.515864
DuraVEC1-3           4786.500353  2903.643008  3419.749856
DuraNVEC1-3          1123.774279  376.820845  645.502343
DuraLEC1-5           273.998142  113.368945  101.752742  1741.593662  1801.201691
PiaVEC1-3            165.734591  402.232505  418.380663
PiaNVEC1-3           18.586756  2.654553  5.152745
ParenchymaVEC1-3     112.342769  8.485167  2418.136018
ParenchymaNVEC1-3    2.525383  3.192719  1.239111

Visually the gene looks significant, but when I run a DE analysis contrasting DuraVEC vs. DuraNVEC (the DuraVEC1-3 and DuraNVEC1-3 values above), I get the following result:

Gene,BaseMean,Log2FC,LfcSE,Stat,Pvalue,Padj
Sele,564.212036,2.372231,1.203575,1.970987,4.872537e-02,2.005588e-01

Any help is appreciated. Thank You!

deseq2 • written 7 weeks ago by RohitGarg • modified 7 weeks ago by Michael Love
Answer: non-significant p-values from DESeq2 for some genes, visually the data looks sig
Michael Love wrote:

The p-value here (0.04) seems to me to reflect that there is some evidence against the null, but there are also only 3 samples per group and some moderate within-group variance.

written 7 weeks ago by Michael Love

Hi Michael, most of our data have 3 biological replicates with 3 technical replicates each. Some data have up to six replicates with no technical replicates. The technical replicates are collapsed prior to normalization, as recommended for DESeq2. What do you mean by "evidence against the null"? Thanks!

modified 7 weeks ago • written 7 weeks ago by RohitGarg

Take a look at the DESeq2 paper for full details. In brief, we compute a p-value, which gives the probability of seeing a test statistic at least as large as the observed one if the true LFC were 0. This particular gene and contrast give a p-value of 0.04, which is low, just not low enough. A big factor here is the variability of this gene, together with the borrowing of information from other genes. Also, n=3 means that the difference between groups needs to be much larger than the variability within groups.
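
As a rough illustration of where that p-value comes from: assuming the default Wald test, where the statistic is the log2 fold change divided by its standard error and is compared to a standard normal distribution, the Stat and Pvalue columns can be recomputed from the result row in the question. This is a plain-Python sketch, not DESeq2's own code, and the variable names are made up for illustration.

```python
import math

# Values copied from the DESeq2 result row for Sele in the question.
log2fc = 2.372231
lfc_se = 1.203575

# Wald statistic: estimated log2 fold change over its standard error.
wald_stat = log2fc / lfc_se

# Two-sided p-value under a standard normal null (assumption: default Wald test).
# erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
p_value = math.erfc(abs(wald_stat) / math.sqrt(2))

print(f"stat = {wald_stat:.6f}, p-value = {p_value:.6e}")
```

Both numbers should agree with the Stat and Pvalue columns up to rounding. The Padj column cannot be reproduced this way, since the adjustment depends on the p-values of all the other genes tested.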

Here's a really simple and naive computation, just to give an idea: the SD of the counts in each group is ~400 and ~1000. The difference in mean between the two groups is ~700. So the difference is on the scale of the SD (here, really simple and just looking at counts). Another way to think about it is to presume the observed effect size of ~1 SD is real and not due to the null. A t-test has 16% power to detect a difference of 1 SD with n=3 vs. 3. You actually need n=9 per group to get above 50% power.
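
The power figures can be sanity-checked by simulation. The sketch below assumes normally distributed data with unit SD in both groups, a true difference of 1 SD, and a pooled-variance two-sample t-test at the two-sided 5% level. The function name `simulated_power`, the trial count and the seed are illustrative choices, and the critical values are hard-coded from a t table to avoid extra dependencies.

```python
import math
import random

# Two-sided 5% critical values of Student's t with df = 2n - 2 (from a t table).
T_CRIT = {3: 2.776, 9: 2.120}

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulated_power(n, effect_sd=1.0, trials=20000, seed=1):
    """Fraction of simulated n-vs-n experiments in which the t-test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect_sd, 1.0) for _ in range(n)]
        pooled_var = (sample_var(a) + sample_var(b)) / 2
        t = (sample_mean(b) - sample_mean(a)) / math.sqrt(pooled_var * 2 / n)
        if abs(t) > T_CRIT[n]:
            rejections += 1
    return rejections / trials

power_n3 = simulated_power(3)
power_n9 = simulated_power(9)
print(f"n=3 per group: power ~ {power_n3:.2f}")
print(f"n=9 per group: power ~ {power_n9:.2f}")
```

With enough trials the two estimates should land close to the 16% and just-over-50% figures quoted above. This is a statement about a t-test on idealized normal data, not about the DESeq2 model itself.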

written 7 weeks ago by Michael Love

Ok got it. The t-test gives only a marginally better p-value of 0.022. Thank you for your help!

written 7 weeks ago by RohitGarg
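
The thread does not show how the t-test in the last reply was run. One plausible reconstruction, assuming Welch's unequal-variance t-test applied directly to the three DuraVEC and three DuraNVEC values from the question, is sketched below. The helper names are hypothetical, and the p-value is obtained by numerically integrating the Student's t density so that only the standard library is needed.

```python
import math
from statistics import mean, variance

# DuraVEC and DuraNVEC values for Sele, copied from the question.
dura_vec = [4786.500353, 2903.643008, 3419.749856]
dura_nvec = [1123.774279, 376.820845, 645.502343]

def t_pdf(x, df):
    """Density of Student's t with (possibly non-integer) df."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_t_pvalue(t, df, steps=2000):
    """1 - P(-|t| < T < |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

def welch_t_test(a, b):
    """Welch's t statistic, Welch-Satterthwaite df, and two-sided p-value."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_t_pvalue(t, df)

t_stat, welch_df, p = welch_t_test(dura_vec, dura_nvec)
print(f"t = {t_stat:.3f}, df = {welch_df:.2f}, p = {p:.4f}")
```

Whether this matches the 0.022 reported above depends on which t-test variant and which scale (raw values or log-transformed) the poster actually used.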