svm (e1071) class weighting in a multi-class problem
@javier-perez-florido-3121
Dear list,

I have a question about the class weighting parameter of the svm classifier in the e1071 package. Class weighting, as stated in the svm vignette of that package, is useful when class sizes are asymmetric. For example, for two classes A and B of 50 and 100 samples respectively, a weight of 2 can be assigned to class A and a weight of 1 to class B.

However, what happens in a multi-class problem? In the e1071 package, SVM follows the one-against-one approach, so for K classes, K(K-1)/2 binary classifiers are built. In my case, a different class weighting is desired depending on the comparison. For example, suppose a problem has 3 classes:

Class 1: 10 samples
Class 2: 20 samples
Class 3: 30 samples

When the Class 1 vs Class 2 classifier is built, I would like to use a weight of 2 for Class 1 and a weight of 1 for Class 2.
When the Class 1 vs Class 3 classifier is built, I would like to use a weight of 3 for Class 1 and a weight of 1 for Class 3.
When the Class 2 vs Class 3 classifier is built, I would like to use a weight of 1.5 for Class 2 and a weight of 1 for Class 3.

How can svm handle this? And how does svm actually handle class weighting in a multi-class problem?

Thanks for your kind help,
All the best,
Javier
@steve-lianoglou-2771
Hi Javier,

2012/3/8 Javier Pérez Florido <jpflorido at gmail.com>:
> [...]
> How can svm handle this issue? How does svm really handle this issue (class
> weighting for a multi-class problem)?

From skimming through the e1071/src/svm.cpp code, it looks like the class weights act as a multiplier on a class-specific C term, i.e. look at the "soft margin" section here:

http://en.wikipedia.org/wiki/Support_vector_machine

Specifically, the part of the optimization that includes the slack variables:

\min_{junk} ... + C \sum_{i=1}^{n} \xi_i

Since C is just a multiplier on the sum of your slack variables, you can split it into an imbalanced C specific to each class size (or e1071 "weight"), something like:

\min_{junk} ... + C_1 \sum_{i \in \mbox{Class}_1} \xi_i + C_2 \sum_{j \in \mbox{Class}_2} \xi_j

Ugh ... email LaTeX ... anyway, does that make sense?
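For reference, the class-weighted soft-margin objective sketched above is usually written as follows. This is the standard textbook form, not something read off the e1071 source; the symbols w, b, \phi and the label convention y_i \in \{+1, -1\} are assumptions filled in for illustration, with C_k standing for C times the weight of class k.

```latex
\begin{aligned}
\min_{w,\, b,\, \xi} \quad
  & \tfrac{1}{2}\lVert w \rVert^{2}
    + C_1 \sum_{i \,:\, y_i = +1} \xi_i
    + C_2 \sum_{j \,:\, y_j = -1} \xi_j \\
\text{subject to} \quad
  & y_i \bigl( w^{\top} \phi(x_i) + b \bigr) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, n, \\
\text{where} \quad
  & C_k = C \cdot \mathrm{weight}_k .
\end{aligned}
```

Setting both weights to 1 recovers the single-C objective with C \sum_{i=1}^{n} \xi_i.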
I guess you can get pretty close to what you want by setting the class weights to 3, 2, 1, but not exactly, since the class weights in the class 1 vs 2 comparison will be 3 vs 2, not 2 vs 1, but ...

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
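A quick arithmetic check of the pairwise ratios may help here. It assumes, as the reading of svm.cpp above suggests, that each one-against-one classifier uses C multiplied by the weight of each of its two classes. Under that assumption, weights proportional to 1 / class size reproduce every ratio requested in the question (2:1, 3:1, 1.5:1), although the absolute C in each pair is rescaled, e.g. 3C vs 1.5C rather than 2C vs 1C for Class 1 vs Class 2. The helper names are hypothetical, and this is plain Python arithmetic, not e1071 code.

```python
from fractions import Fraction
from itertools import combinations

# Class sizes from the question (class label -> number of samples).
class_sizes = {1: 10, 2: 20, 3: 30}


def inverse_frequency_weights(sizes):
    """Weight each class by (largest class size) / (its own size)."""
    largest = max(sizes.values())
    return {label: Fraction(largest, n) for label, n in sizes.items()}


def pairwise_ratios(weights):
    """Ratio weight[a] / weight[b] for every one-against-one pair (a, b)."""
    return {
        (a, b): Fraction(weights[a]) / Fraction(weights[b])
        for a, b in combinations(sorted(weights), 2)
    }


weights = inverse_frequency_weights(class_sizes)
ratios = pairwise_ratios(weights)

# Ratios obtained with the weights 3, 2, 1 suggested in the reply, for comparison.
ratios_321 = pairwise_ratios({1: 3, 2: 2, 3: 1})

for pair in ratios:
    print(pair, "inverse-frequency:", float(ratios[pair]),
          "weights 3,2,1:", float(ratios_321[pair]))
```

In e1071 such weights would be passed as a named vector through the `class.weights` argument of `svm()`, with names matching the factor levels of the response.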