question regarding .632plus error rate estimator in ipred package
@james-anderson-1641
Last seen 9.6 years ago
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071030/88913ce3/attachment.pl
Kuhn, Max ▴ 70
@kuhn-max-1170
Last seen 9.6 years ago
James,

I think that there is some confusion here:

> there is .632plus estimator, but seems that this estimator
> does not have feature selection built in

The .632 estimator is a method of evaluating model performance from a training set (using the bootstrap). It knows nothing about the model.

Feature selection methods happen either as wrappers around the model or, for some models, as built-in qualities of the model (e.g. rpart or nearest shrunken centroids).

Functionally, feature selection has nothing to do with resampling estimators of model quality. In practice, it is more complicated. You should take great care when estimating performance on a training set when you are using a feature selection algorithm. You should read:

www.pnas.org/cgi/content/abstract/99/10/6562
bioinformatics.oxfordjournals.org/cgi/content/abstract/btm344v1

and the references therein.

Max

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of James Anderson
Sent: Tuesday, October 30, 2007 5:29 PM
To: bioconductor
Subject: [BioC] question regarding .632plus error rate estimator in ipred package

Sorry to bother those who are not interested.

In the ipred package, there is a .632plus estimator, but it seems that this estimator does not have feature selection built in. If that is the case, I am wondering how it can be applied to microarray data, since feature selection is a must for microarrays. If feature selection is done on the entire dataset and .632plus is performed afterwards, there will be some bias in the leave-one-out bootstrap part. I think the other estimators are the same, in the sense that they are computed on the dataset without performing feature selection. Is my understanding correct or not?
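
The caution above about estimating performance when a feature selection step is involved can be illustrated with a small simulation. The following is only a sketch, written in plain Python rather than R, and it does not use or imitate the ipred API. Every name in it (`select_features`, `fit_centroids`, `predict`, `oob_error`) is hypothetical, the classifier is a simple nearest-centroid rule chosen for brevity, and the error is a pooled out-of-bag error, which is a simplification of the leave-one-out bootstrap component of .632plus rather than the full estimator. The data are pure noise, so the true error rate of any classifier is 0.5.

```python
import random

def select_features(X, y, k):
    """Return indices of the k features with the largest class-mean difference."""
    idx0 = [i for i, lab in enumerate(y) if lab == 0]
    idx1 = [i for i, lab in enumerate(y) if lab == 1]
    scores = []
    for j in range(len(X[0])):
        m0 = sum(X[i][j] for i in idx0) / len(idx0)
        m1 = sum(X[i][j] for i in idx1) / len(idx1)
        scores.append((abs(m0 - m1), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def fit_centroids(X, y, feats):
    """Compute per-class centroids restricted to the selected features."""
    cents = {}
    for lab in (0, 1):
        rows = [X[i] for i in range(len(y)) if y[i] == lab]
        cents[lab] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return cents

def predict(cents, x, feats):
    """Assign x to the class with the nearer centroid (squared distance)."""
    def dist(lab):
        return sum((x[j] - c) ** 2 for j, c in zip(feats, cents[lab]))
    return 0 if dist(0) <= dist(1) else 1

def oob_error(X, y, k, n_boot, rng, select_inside):
    """Pooled out-of-bag error over n_boot bootstrap samples.

    select_inside=False reuses features chosen once on ALL samples, so the
    held-out samples have already influenced the selection.
    select_inside=True repeats the selection on each bootstrap sample only.
    """
    n = len(y)
    global_feats = select_features(X, y, k)
    wrong = total = 0
    for _ in range(n_boot):
        boot = [rng.randrange(n) for _ in range(n)]
        labels = [y[i] for i in boot]
        if len(set(labels)) < 2:
            continue  # skip degenerate resamples containing a single class
        Xb = [X[i] for i in boot]
        feats = select_features(Xb, labels, k) if select_inside else global_feats
        cents = fit_centroids(Xb, labels, feats)
        in_bag = set(boot)
        for i in range(n):
            if i not in in_bag:
                wrong += predict(cents, X[i], feats) != y[i]
                total += 1
    return wrong / total

# Pure-noise data: 30 samples, 300 features, labels unrelated to the features.
data_rng = random.Random(0)
n, p = 30, 300
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]

leaky = oob_error(X, y, k=10, n_boot=25, rng=random.Random(1), select_inside=False)
honest = oob_error(X, y, k=10, n_boot=25, rng=random.Random(1), select_inside=True)
print("selection on full data:", round(leaky, 3))
print("selection inside each resample:", round(honest, 3))
```

Under these assumptions, the estimate with selection done once on the full dataset should come out noticeably lower than the one with selection repeated inside each resample, even though no real signal exists. This is the bias described in the original question, and it is why the selection step is usually treated as part of the model being resampled.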