Question: Adaptive Lasso and Cross-Validation for SNV selection
2.8 years ago, crichard0 wrote:

Hello everyone,

I have 17,000 variables (SNV frequencies, some of which are zero) for 40 patients. Each patient is labelled by response to a treatment: 13 responders and 27 non-responders. I want to extract a subset of SNVs with strong predictive power.

Because the set of variables is so large, there are strong correlations among them, which is why I'm considering the adaptive lasso. I used the glmnet R package, with the adaptive weights taken from the coefficients of an initial ridge fit, and the following R code:

library(cvTools)
library(glmnet)

err.test.response <- c()
err.test.noresponse <- c()
nbiters <- 50

for(i in 1:nbiters){
  ## k folds
  kflds <- 8
  flds <- cvFolds(length(y), K = kflds)
 
  pred.test <- c() ## predicted classes
  class.test <- c() ## real classes
 
  for(j in 1:kflds){
    ## Train
    x.train <- x[flds$which!=j,]
    y.train <- y[flds$which!=j]
    ## Test
    x.test <- x[flds$which==j,]
    y.test <- y[flds$which==j]
    ## Adaptive weights vector: 1 / |ridge coefficient|, intercept excluded
    cv.ridge <- cv.glmnet(x.train, y.train, family='binomial', alpha=0, standardize=FALSE,
                          parallel = TRUE, nfolds = 7)
    w3 <- 1/abs(matrix(coef(cv.ridge, s=cv.ridge$lambda.min)[, 1][2:(ncol(x)+1)] ))^1
    w3[w3[,1] == Inf] <- 999999999 
    
    ## Adaptive Lasso
    cv.lasso <- cv.glmnet(x.train, y.train, family='binomial', alpha=1, standardize=FALSE,
                          parallel = TRUE, type.measure='class', penalty.factor=w3, nfolds = 7)
    ## Prediction
    pred.test <- c(pred.test, predict(cv.lasso, x.test, s = 'lambda.1se', type = c("class")))
    class.test <- c(class.test, as.character(y.test))
  }
 
  ## Prediction error
  err.test.noresponse <- c(err.test.noresponse, 1-sum(pred.test=="noresponse"&class.test=="noresponse")
                        /sum(class.test=="noresponse")) # noresponse error vector
  err.test.response <- c(err.test.response, 1-sum(pred.test=="response"&class.test=="response")
                        /sum(class.test=="response")) # response error vector
}

mean(err.test.noresponse) ## Mean noresponse prediction error
mean(err.test.response) ## Mean response prediction error
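
To make the error definition explicit: each per-class error above is one minus the fraction of a true class that was predicted correctly. Here is a minimal base-R sketch on made-up labels; the vectors `truth` and `pred` and the helper `class_error` are hypothetical and only illustrate the formula, they are not the real data or part of any package.

```r
# Hypothetical toy labels, for illustration only (not the real data)
truth <- c("response", "response", "noresponse", "noresponse", "noresponse")
pred  <- c("noresponse", "response", "noresponse", "noresponse", "response")

# Error rate within one true class: 1 - (correctly predicted) / (class size)
class_error <- function(pred, truth, cls) {
  1 - sum(pred == cls & truth == cls) / sum(truth == cls)
}

err_response   <- class_error(pred, truth, "response")    # 1 - 1/2
err_noresponse <- class_error(pred, truth, "noresponse")  # 1 - 2/3

# Cross-tabulation of predicted against true classes
print(table(predicted = pred, actual = truth))
```

Read this way, an error of 0.88 on 13 responders means that only about 1.6 responders per repetition are classified correctly on average.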

Is an external cross-validation like this a sound way to evaluate the predictive power of the adaptive lasso on my data?

My results are not conclusive at all: I get mean(err.test.noresponse) = 0.15 and mean(err.test.response) = 0.88, so my model fails to identify the responders. Do you have any idea why my results are so bad and how I could improve them?

Thanks for your help and your ideas,

Corentin

Comment by crichard0 (2.8 years ago): Does nobody have any ideas?
