Question

DSS understanding delta option and posterior probability

stephanie.rialle • 0

@stephanierialle-18763

France

Dear DSS developers,

First, thank you for your software which is very useful for methylation analysis.

I have a question regarding the application of a delta threshold in the callDML and callDMR functions. I have read your explanations in the vignette and in the help pages of the functions, but I still don't fully understand the concept. I must say that I only have very basic notions of Bayesian statistics :p. I don't really understand why applying a delta threshold gives different results from running without delta and then simply selecting the sites whose methylation difference is above the same threshold. For example, I generated DMR results first without delta (delta=0) and obtained 2,184 significant regions. If I filter these results on abs(Diff.Methy) > 0.25, I get 1,832 regions. When I run callDMR specifying delta=0.25, I obtain only 227 regions. I understand that a posterior probability is computed, but I don't really understand why, as a statistical test has already been applied. Could you help me understand, please?

Many thanks,

Regards,

Stephanie

posterior probability statistical test theory • 996 views

updated 4.2 years ago by James W. MacDonald 66k • written 4.2 years ago by stephanie.rialle • 0

score 2 · Answer 1 · 2020-05-05

If you run callDMR with delta=0, you are testing for evidence that the true population difference between your groups is not zero. In other words, that there is any difference at all between the groups. But if you look at the observed differences for the regions that come out significant, none of them are really that close to zero! That's because there is uncertainty involved - you have a sample from a population, not the population itself - and you have to incorporate that into the test.

If you just add an additional criterion that the absolute differences have to be > 0.25, that criterion applies to the sample, not the population, and you are not accounting for the uncertainty that arises from using a sample. But if you run callDMR with delta = 0.25, you build that criterion into the statistical test itself: you are now asking for evidence that the true population difference is larger than 0.25. Because of the uncertainty from using a sample, the observed difference for a region that passes will almost always be quite a bit larger than 0.25, which is why far fewer regions are called.
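
To make the distinction concrete, here is a small Python simulation. It is only a sketch and not DSS code: it assumes each region's observed difference is normally distributed around the true difference with a known standard error (0.08 here, an arbitrary choice), and the function names `p_value_zero`, `post_prob_over` and `count_calls` are made up for illustration. It compares three things: testing against zero, testing against zero and then filtering on the observed difference, and asking for high probability that the true difference exceeds delta.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for all tail areas


def p_value_zero(diff, se):
    """Two-sided Wald p-value for H0: true difference == 0."""
    return 2 * (1 - STD_NORMAL.cdf(abs(diff) / se))


def post_prob_over(diff, se, delta):
    """P(|true difference| > delta), treating the true difference as
    Normal(diff, se). This is an assumed approximation, not DSS internals."""
    upper = 1 - STD_NORMAL.cdf((delta - diff) / se)  # P(true > delta)
    lower = STD_NORMAL.cdf((-delta - diff) / se)     # P(true < -delta)
    return upper + lower


def count_calls(diffs, se, delta=0.25, p_threshold=0.05):
    """Count regions called by each of the three strategies."""
    zero = [d for d in diffs if p_value_zero(d, se) < p_threshold]
    filtered = [d for d in zero if abs(d) > delta]  # post-hoc filter on the sample
    with_delta = [d for d in diffs
                  if post_prob_over(d, se, delta) > 1 - p_threshold]
    return len(zero), len(filtered), len(with_delta)


random.seed(1)
SE = 0.08  # assumed standard error, identical for every region
# Simulated data: true differences ~ N(0, 0.2), observed = true + noise.
observed = [random.gauss(random.gauss(0, 0.2), SE) for _ in range(2000)]

n_zero, n_filtered, n_delta = count_calls(observed, SE)
print("delta=0:", n_zero, "| filtered:", n_filtered, "| delta=0.25:", n_delta)
```

Under these assumptions, a region with an observed difference of 0.30 passes the post-hoc filter, but the probability that its true difference exceeds 0.25 is only about 0.73, so it would not be called with delta = 0.25. An observed difference has to be roughly 0.38 or more before that probability reaches 0.95.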