Binarized support vector machines
The widely used Support Vector Machine (SVM) method has shown to yield very good results in Supervised Classification problems. Other methods such as Classification Trees have become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in Data Mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables, and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals which are critical for the classification. The method involves the optimization of a Linear Programming problem, with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard Column-Generation strategy leads to a classification method which, in terms of classification ability, is competitive against the standard linear SVM and Classification Trees. Moreover, the proposed method is robust, i.e., it is stable in the presence of outliers and invariant to change of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler, still competitive, classifiers.
|Date of creation:||Nov 2007|
|Date of revision:|
|Contact details of provider:|| Web page: http://portal.uc3m.es/portal/page/portal/dpto_estadistica|
When requesting a correction, please mention this item's handle: RePEc:cte:wsrepe:ws086523. See general information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (Ana Poveda)
If references are entirely missing, you can add them using this form.