**70**wrote:

Hello: I am trying to reproduce an example given in MAST reference for zlm() where a data frame with 500 points is used. I assume this data set describes one single feature, hence no shrinkage is applied.

I managed to reproduce the continuous part. The regression coefficients for the discrete part are slightly off. MAST output says that the null deviance is on 499 df (i.e. all 500 points are used), but the residual deviance is on 214 df, even though the logistic regression has only 3 parameters. What did I get wrong?

```
set.seed(314)
data<- data.frame(x=rnorm(500), z=rbinom(500, 1, .3))
logit.y <- with(data, x*2 + z*2); mu.y <- with(data, 10+10*x+10*z + rnorm(500))
y <- (runif(500)<exp(logit.y)/(1+exp(logit.y)))*1
y[y>0] <- mu.y[y>0]
data$y <- y
fit <- zlm(y ~ x+z, data)
summary.glm(fit$disc)
# Coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.07739 0.13713 -0.564 0.573
# x 2.30682 0.21370 10.795 < 2e-16 ***
# z 2.75969 0.35173 7.846 4.29e-15 ***
# Null deviance: 684.41 on 499 degrees of freedom
# Residual deviance: 396.20 on 214 degrees of freedom
# AIC: 402.2
summary.glm(fit$cont)
# Check whether the continous part can be obtained via lm()
cont.data <- data[data$y > 0, ]
lm.fit <- lm(y ~ x + z, data = cont.data )
summary(lm.fit)
# t-values are the same as in fit$cont.
# Check if the discrete part can be obtained via logistic regression
log.resp <- ifelse(data$y > 0, 1, 0)
logistic.fit <- glm(log.resp ~ data$x + data$z, family=binomial(link='logit'))
summary(logistic.fit)
# Coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.08737 0.13887 -0.629 0.529
# data$x 2.38849 0.22437 10.645 < 2e-16 ***
# data$z 2.86694 0.36538 7.847 4.28e-15 ***
# Null deviance: 684.41 on 499 degrees of freedom
# Residual deviance: 396.04 on 497 degrees of freedom
# AIC: 402.04
# Note the difference in residual deviance df.
```

**190**• written 5 months ago by Nik Tuzov •

**70**