Question: p-values smaller than machine epsilon in DESeq2
oscarf wrote, 8 months ago:

Hi,

our group has been using the DESeq function from the DESeq2 package to perform differential expression analysis. We have noticed that some features get a p-value much smaller than the machine epsilon. There are two points that we find difficult to understand.

  1.  How are these functions generating the p-values, especially those that are smaller than the epsilon of the machine?

  2.  How meaningful is a comparison of p-values that are negligibly small, say 10^(-250) and 10^(-251)?

We would greatly appreciate any help in understanding these points.

O.F.

written 8 months ago by oscarf

Here's a link to a similar thread about edgeR:

p-values smaller than machine epsilon in edgeR

written 8 months ago by Michael Love
Answer: p-values smaller than machine epsilon in DESeq2
Michael Love (United States) wrote, 8 months ago:

I wouldn't make much of the difference between very small p-values. Essentially, the data are not consistent with the null model; e.g., many samples with counts of 0 versus many samples with very high counts will give a very small p-value. Just consider these genes a rejectable set at the FDR that you specify. Our work (DESeq2 and now apeglm) focuses a lot on effect size estimation and on LFC thresholds greater than 0. As we argue, rejecting a point null is sometimes a trivial hurdle for many genes in well-powered gene expression studies.

As for the machine epsilon point, I don't really see what you're getting at. We can calculate tail probabilities of a distribution.
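To illustrate the tail-probability point, here is a minimal Python sketch (not DESeq2's code). Machine epsilon is the spacing of doubles near 1.0, not the smallest positive number a double can hold, so a p-value evaluated directly from the tail can sit far below it. The sketch assumes a z statistic compared against a standard normal; the function names are made up for illustration.

```python
import math
import sys


def two_sided_normal_p(z):
    """Two-sided standard normal p-value, evaluated directly from the tail."""
    # erfc computes the tail itself, so tiny values keep their relative precision.
    return math.erfc(abs(z) / math.sqrt(2.0))


def naive_two_sided_normal_p(z):
    """Same quantity via 1 - CDF; anything below roughly epsilon is lost."""
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


eps = sys.float_info.epsilon   # spacing of doubles near 1.0, about 2.2e-16
tiny = sys.float_info.min      # smallest positive normal double, about 2.2e-308

print("epsilon:", eps, "smallest normal double:", tiny)
for z in (5.0, 10.0, 34.0):
    print(z, two_sided_normal_p(z), naive_two_sided_normal_p(z))
```

The naive version collapses to exactly 0 once the CDF rounds to 1.0, while the direct tail evaluation keeps returning positive values until it approaches the underflow limit of the double format.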

written 8 months ago by Michael Love

Or, to answer the second question very directly: Not at all.

A very low p-value means that either a model assumption is wrong or the data contradict the null hypothesis with near certainty. There is little point in distinguishing between different levels of being almost certain.

This is why I usually recommend using p-values only as a cut point: once you have decided on your significance threshold, cut your result list there, then forget about the p-values and sort by (shrunken!) fold change to find the most interesting hits.
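As a rough sketch of that workflow in Python: the `Result` type, the `rank_hits` helper, the 0.05 threshold, and the toy values below are all hypothetical and only illustrate the idea of cutting first and ranking by shrunken fold change afterwards.

```python
from typing import NamedTuple


class Result(NamedTuple):
    gene: str
    padj: float          # adjusted p-value
    shrunken_lfc: float  # shrunken log2 fold change


def rank_hits(results, alpha=0.05):
    """Keep rows with padj below alpha, then order by absolute shrunken LFC."""
    significant = [r for r in results if r.padj < alpha]
    return sorted(significant, key=lambda r: abs(r.shrunken_lfc), reverse=True)


# Made-up values, purely for illustration.
toy = [
    Result("geneA", 1e-250, 0.4),
    Result("geneB", 1e-12, -3.1),
    Result("geneC", 0.20, 5.0),
    Result("geneD", 1e-30, 1.7),
]

for r in rank_hits(toy):
    print(r.gene, r.padj, r.shrunken_lfc)
```

Note that the gene with the smallest adjusted p-value ends up last among the hits, because once it has passed the cut its p-value no longer affects the ordering.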

written 8 months ago by Simon Anders