KNN, SVM, and randomForest - How to predict test without known categories
Liu, Xin ▴ 120
@liu-xin-811
In R, before using KNN, SVM, or randomForest, an exprSet has to be built, which requires known categories for both the training samples and the test samples. However, by definition, in supervised learning you always train (with known categories) and then predict the test samples WITHOUT known categories. I wonder how to implement this. Thank you!

Xin

-----Original Message-----
From: Tom R. Fahland [mailto:tfahland@genomatica.com]
Sent: 27 July 2004 18:48
To: Liu, Xin; bioconductor@stat.math.ethz.ch
Subject: RE: [BioC] KNN, SVM,and randomForest - How to predict samples without category

By definition, in supervised learning you always train (with known categories), then run your unbiased data through for prediction. Both CV and train/test partitions are good for choosing parameters and optimizing the algorithms. I have just completed a study predicting dose exposure with good results using different algorithms.

Tom

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Liu, Xin
Sent: Tuesday, July 27, 2004 07:39
To: bioconductor@stat.math.ethz.ch
Subject: [BioC] KNN, SVM,and randomForest - How to predict samples without category

Dear all,

Supervised classifiers (KNN, SVM, and randomForest) use a test sample set and a training sample set to do prediction. To create the exprSet, the category is needed for each sample. However, sometimes we need to predict a sample without its category. Does anybody have a clue how to do this? Thank you very much!

Best regards,
Xin LIU
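A minimal sketch of the train-then-predict pattern described in Tom's reply, assuming the `class` package (bundled with standard R installs) and the built-in `iris` data as a stand-in for expression data. The split into labelled and unlabelled samples and all variable names are made up for illustration. The point is that `knn()` only asks for the class labels of the training samples (`cl`), never for labels of the samples being predicted.

```r
library(class)  # provides knn(); a recommended package bundled with R

set.seed(1)  # knn() breaks ties at random, so fix the seed

# Stand-in data: rows are samples, columns are numeric features.
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Illustration only: pretend 30 samples have no known category.
unknown_idx <- sample(nrow(x), 30)
train_x   <- x[-unknown_idx, ]
train_y   <- y[-unknown_idx]
unknown_x <- x[unknown_idx, ]  # no labels are used for these samples

# Train on the labelled samples and predict the unlabelled ones.
pred <- knn(train = train_x, test = unknown_x, cl = train_y, k = 3)

length(pred)  # one predicted category per unlabelled sample
```

If the expression matrix is stored with genes in rows and samples in columns, it would need to be transposed with `t()` first, since `knn()` expects one row per sample.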
@sean-davis-490
Xin,

There is a wealth of information on the Bioconductor website, thanks to many generous and brilliant contributors. One such resource is in the documentation section under Lab Materials and is titled 'Application of Machine Learning to Microarray Data, SVM and friends'. A PDF is available (http://www.bioconductor.org/labMat/pdf/MachLearn.pdf), and there are lab materials available, including R code. I encourage all users to peruse these resources frequently; I learn something new every time I look.

Sean

On Jul 28, 2004, at 4:18 AM, Liu, Xin wrote:
> In R, before using KNN, SVM, or randomForest, an exprSet has to be
> built, which requires known categories for both the training samples
> and the test samples. However, by definition, in supervised learning
> you always train (with known categories) and then predict the test
> samples WITHOUT known categories. I wonder how to implement this.
> Thank you!
>
> Xin
@adaikalavan-ramasamy-675
If algorithm 1 predicts "Yes", "Yes", "No", "No" for 4 samples and algorithm 2 predicts "Yes", "No", "Yes", "No", how do you know which one is the better algorithm? That is why you use a test set with known classes. You can get one by splitting your learning set (the samples with known classes) into a training set and a test set. Look up "cross validation".

Some examples of built-in cross-validation:

* knn.cv() is a leave-one-out cross-validation of knn()
* svm() in library(e1071) has an argument named 'cross' for cross-validation

In practice, I prefer to write my own wrapper for cross-validation to ensure that the sampling method is the same across all algorithms.

Once you have determined the best algorithm and features, you then use predict() to predict samples with unknown classes.

Regards, Adai.

On Wed, 2004-07-28 at 09:18, Liu, Xin wrote:
> In R, before using KNN, SVM, or randomForest, an exprSet has to be
> built, which requires known categories for both the training samples
> and the test samples. However, by definition, in supervised learning
> you always train (with known categories) and then predict the test
> samples WITHOUT known categories. I wonder how to implement this.
> Thank you!
>
> Xin
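The workflow in Adai's reply could be sketched roughly as follows. This is an illustration under stated assumptions, not the author's own wrapper: it uses only the `class` package (bundled with standard R installs), the built-in `iris` data as a stand-in for expression data, and two k-NN settings as the competing "algorithms". The helper `cv_accuracy()` and the names `learn_x`, `unknown_x`, and `candidates` are hypothetical.

```r
library(class)  # knn() and knn.cv(); bundled with standard R installs

set.seed(42)

x <- as.matrix(iris[, 1:4])  # stand-in: rows are samples, columns are features
y <- iris$Species

# Illustration only: treat 20 samples as having unknown classes.
unknown_idx <- sample(nrow(x), 20)
learn_x   <- x[-unknown_idx, ]
learn_y   <- y[-unknown_idx]
unknown_x <- x[unknown_idx, ]

# Built-in leave-one-out cross-validation for k-NN.
loo_pred <- knn.cv(train = learn_x, cl = learn_y, k = 3)
loo_acc  <- mean(loo_pred == learn_y)

# Hand-written 5-fold CV: one fold assignment is reused for every
# candidate, so all candidates are scored on identical splits.
n_folds <- 5
fold <- sample(rep(seq_len(n_folds), length.out = nrow(learn_x)))

cv_accuracy <- function(predict_fun) {
  correct <- 0
  for (f in seq_len(n_folds)) {
    held_out <- fold == f
    pred <- predict_fun(learn_x[!held_out, , drop = FALSE],
                        learn_y[!held_out],
                        learn_x[held_out, , drop = FALSE])
    correct <- correct + sum(pred == learn_y[held_out])
  }
  correct / nrow(learn_x)
}

# Each candidate takes (training data, training labels, new data).
candidates <- list(
  knn1 = function(tr_x, tr_y, te_x) knn(tr_x, te_x, tr_y, k = 1),
  knn5 = function(tr_x, tr_y, te_x) knn(tr_x, te_x, tr_y, k = 5)
)
acc  <- sapply(candidates, cv_accuracy)
best <- names(acc)[which.max(acc)]

# Refit the chosen candidate on the whole learning set and predict the
# samples whose classes are unknown.
final_pred <- candidates[[best]](learn_x, learn_y, unknown_x)
```

`knn()` has no separate model object, so training and prediction happen in one call. For fitted models such as those returned by `svm()` in e1071, the last step would instead be a call like `predict(model, newdata)`, and such a model could be added to `candidates` as another function with the same three arguments.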