Question: Which is more trustworthy for a Cox proportional hazards model with Lasso penalty: LOOCV or K-fold cross-validation?
Asked 3 months ago by Talip Zengin (Mugla, Turkiye):

Hi, when I searched the literature for methods to identify predictors of overall survival, most articles used a glmnet Cox model with a Lasso penalty and cross-validation. For the cross-validation, authors used either 10-fold cross-validation with 1000 iterations or LOOCV.

I read about the glmnet method and searched forums for code. I found two different approaches for 10-fold cross-validation. The first runs a final glmnet fit using the lambda.min value from the cv.glmnet results to create the final model. The second (from an article) repeats cv.glmnet 1000 times, counts how often each set of active covariates is selected, and takes the most frequent set as the final model. I also understood that setting the nfolds option of cv.glmnet to the number of rows (samples) of the response y makes it perform LOOCV, and that LOOCV does not need iterations. If I have misunderstood anything, please correct me.
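The point about LOOCV not needing iterations can be seen in how folds are assigned. Below is a minimal base-R sketch, assuming folds are drawn by shuffling repeated fold labels; the helpers `assign_folds` and `as_partition` are hypothetical, written for illustration, and are not part of glmnet. With 10 folds the grouping of samples changes from run to run, whereas with one fold per sample every run leaves each sample out exactly once.

```r
# Hypothetical helper: assign each of n samples to one of k folds at random.
assign_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

# Hypothetical helper: describe a fold assignment as a partition
# (which samples share a fold), ignoring the arbitrary fold labels.
as_partition <- function(foldid) {
  groups <- split(seq_along(foldid), foldid)
  unname(sort(vapply(groups, paste, character(1), collapse = ",")))
}

n <- 500
set.seed(1)
kfold_a <- as_partition(assign_folds(n, 10))
kfold_b <- as_partition(assign_folds(n, 10))
loo_a <- as_partition(assign_folds(n, n))
loo_b <- as_partition(assign_folds(n, n))

same_kfold <- identical(kfold_a, kfold_b)  # compares two random 10-fold splits
same_loo <- identical(loo_a, loo_b)        # compares two leave-one-out splits
```

Here `same_loo` is TRUE because both runs put every sample in its own fold, while `same_kfold` will be FALSE for practically any seed. That randomness is what repeating 10-fold cross-validation is meant to average out.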

I tried the code below and got close but different results, so I wonder which approach is more trustworthy in this case. Note: Approach 3 (LOOCV) takes less time than the others, although the usual disadvantage of LOOCV is that it is time-consuming. Is that because of the iterations?
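On the timing question, a rough count of fold-level model fits may help. This is a back-of-the-envelope sketch, assuming one path fit per fold per cv.glmnet call and ignoring the extra full-data fit that each call makes; the variable names are illustrative.

```r
n_samples <- 500    # rows kept from CoxExample
n_repeats <- 1000   # iterations in Approaches 1 and 2
n_folds <- 10

fits_repeated_kfold <- n_repeats * n_folds  # fold-level fits across all repeats
fits_loocv <- n_samples                     # fold-level fits in one LOOCV run

ratio <- fits_repeated_kfold / fits_loocv
```

Under these assumptions, 1000 repeats of 10-fold cross-validation need 10000 fold-level fits against 500 for a single LOOCV run, a factor of 20, although each LOOCV fit uses slightly more data.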

Thanks in advance.

library(glmnet)
library(dplyr)

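data(CoxExample)
# newer glmnet releases load a list named CoxExample instead of separate x and y
if (exists("CoxExample")) { x <- CoxExample$x; y <- CoxExample$y }
x <- x[1:500,]
y <- y[1:500,]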

## Approach 1: 10-fold cross validation with 1000 iterations, then final glmnet run

lambdas <- NULL
for (i in 1:1000) {
  fit <- cv.glmnet(x, y, family = "cox", nfolds = 10, alpha = 1, grouped = TRUE)
  errors <- data.frame(fit$lambda, fit$cvm)
  lambdas <- rbind(lambdas, errors)
}

# take mean cvm for each lambda (aggregate names the result columns Group.1 and x)
lambdas <- aggregate(lambdas[, 2], list(lambdas$fit.lambda), mean)

# select the lambda with the smallest mean cvm
bestindex <- which.min(lambdas$x)
bestlambda <- lambdas[bestindex, 1]

# and now run glmnet once more with it
fit <- glmnet(x, y, family = "cox", alpha = 1, lambda = bestlambda)
# a plain glmnet fit has no "lambda.min"; request the coefficients at bestlambda
coef.min <- coef(fit, s = bestlambda)
active.min <- which(coef.min != 0)
active.min

## Approach 2: 10-fold cross validation with 1000 iterations, then finding frequency of active covariates
coefs <- NULL
active.mins <- list()
for (i in 1:1000){
  cvfit = cv.glmnet(x, y, family = "cox", nfolds = 10, alpha = 1, grouped = TRUE)
  coef.min = coef(cvfit, s = "lambda.min")
  coefs <- cbind(coefs, coef.min)
  active.min = which(coef.min != 0)
  active.mins <- c(active.mins, list(active.min))
}
table(unlist(lapply(active.mins, paste, collapse = " ")))

## Approach 3: LOOCV through nrow-folds
cvfit <- cv.glmnet(x, y, family = "cox", nfolds = nrow(y), alpha = 1, grouped = TRUE)
cv.coef.min <- coef(cvfit, s = "lambda.min")
cv.active.min <- which(cv.coef.min != 0)
cv.active.min

fit <- glmnet(x, y, family = "cox", alpha = 1, lambda = cvfit$lambda.min)
# a plain glmnet fit has no "lambda.min"; request the coefficients at the chosen lambda
coef.min <- coef(fit, s = cvfit$lambda.min)
active.min <- which(coef.min != 0)
active.min

Results

Approach 1:

[1] 1 2 3 4 5 6 7 8 9 10 13 17 18 24 27 29

Approach 2:

[1] 1 2 3 4 5 6 7 8 9 10 13 17 18 24 27 29 (frequency: 966/1000)

Approach 3:

[1] 1 2 3 4 5 6 7 8 9 10 13 14 17 18 24 27 29
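To see exactly where the three results differ, the reported index sets can be compared directly. This is a small sketch that only re-enters the numbers listed above; the variable names are illustrative.

```r
approach1 <- c(1:10, 13, 17, 18, 24, 27, 29)
approach2 <- c(1:10, 13, 17, 18, 24, 27, 29)      # most frequent set, 966/1000 runs
approach3 <- c(1:10, 13, 14, 17, 18, 24, 27, 29)

same_1_2 <- identical(approach1, approach2)       # do Approaches 1 and 2 agree?
only_in_loocv <- setdiff(approach3, approach1)    # selected by LOOCV only
only_in_kfold <- setdiff(approach1, approach3)    # selected by 10-fold only
```

Approaches 1 and 2 select the same 16 covariates, and LOOCV adds only covariate 14.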

Answer: Which is more trustworthy for a Cox proportional hazards model with Lasso penalty
Answered 3 months ago by chris86 (UCL, United Kingdom):

Actually (unless this is just for training, in which case excuse me), none of this is trustworthy, because you are not splitting the data into separate training and testing sets, e.g.

https://stackoverflow.com/questions/18130338/plotting-an-roc-curve-in-glmnet

If you want cross-validation with glmnet, just use caret on all the data to do it automatically, e.g.

library(caret)
# twoClassSummary and metric = "ROC" apply to a two-class outcome (here labelsa in data frame test3)
ctrl <- trainControl(method = "cv", summaryFunction = twoClassSummary, classProbs = TRUE, savePredictions = TRUE, verboseIter = TRUE)

fit2 <- train(labelsa ~ ., data = test3, method = "glmnet", trControl = ctrl, metric = "ROC")

Then fit2 contains your mean AUC.

If this is just for training (you're finding the best model on this data for prediction), approach 1 sounds reasonable to me. You can also use caret to find the best model by passing it a grid of lambda and alpha values.
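Such a grid could be built roughly as follows. This is a sketch: the specific alpha and lambda values are illustrative assumptions, not recommendations. caret's glmnet method tunes parameters named alpha and lambda, and the grid would be supplied through the tuneGrid argument of train.

```r
# Candidate values (illustrative only): alpha = 1 is the Lasso, alpha = 0 is ridge.
alphas <- seq(0, 1, by = 0.25)
lambdas <- 10^seq(-3, 0, length.out = 20)

# One row per (alpha, lambda) combination; the column names match
# the tuning parameters of caret's glmnet method.
tune_grid <- expand.grid(alpha = alphas, lambda = lambdas)
n_combinations <- nrow(tune_grid)

# It would then be passed to caret, with ctrl, labelsa and test3 as in the call above:
# fit2 <- train(labelsa ~ ., data = test3, method = "glmnet",
#               trControl = ctrl, metric = "ROC", tuneGrid = tune_grid)
```

Every row of the grid is evaluated by cross-validation, so a finer grid costs proportionally more time.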
