The distribution function for zero-altered negative binomial density produces a probability greater 1
1
0
Entering edit mode
mrk.carty ▴ 30
@mrkcarty-7442
Last seen 6.1 years ago
United States

Hi all,

I'm using the pzanegbin distribution function from the VGAM package to compute p-value for events coming from hurdle negative binomial regression null distribution. Here are details of the computation:

> library(VGAM)
> options(digits=22)
> 1 - pzanegbin(q = 480,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 )
[1] -2.220446049250313080847e-16
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] VGAM_0.9-8

I'm not sure why I'm getting this result...could this be a bug in the code?

Best,

Mark

VGAM pzanegbin • 578 views
1
Entering edit mode
@ryan-c-thompson-5618
Last seen 15 months ago
Scripps Research, La Jolla, CA

My guess is that there is no closed-form solution for the CDF of this non-standard distribution, so you are getting a numerical approximation, which isn't guaranteed to stay within the expected domain of [0,1]. You could use pmin(..., 1) to clamp the maximum value of the function to 1, but see below:

Edit: I just noticed that the value you get is exactly equal to -.Machine$double.eps: > .Machine$double.eps
[1] 2.220446049250313080847e-16
> 1 - pzanegbin(q = 445,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 )
[1] -2.220446049250313080847e-16
> 1 - pzanegbin(q = 445,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 ) == -.Machine$double.eps [1] TRUE > 1 - pzanegbin(q = Inf,size = 3.212259754597279748367,munb = 30.57921554406697595141 ,pobs0 = 0.04685295678266754304531 ) == -.Machine$double.eps
[1] TRUE

According to the help text for .Machine, this value is "the smallest positive floating-point number 'x' such that '1 + x != 1'". This means that for the values you are passing to the function, it is operating near the limits of floating-point accuracy, and any calculations are likely to have very large errors as a result.

0
Entering edit mode

I reported this bug to the package author, and it should be fixed in an upcoming release.