Printed from https://ideas.repec.org/a/eee/ejores/v314y2024i1p297-307.html

Column generation-based prototype learning for optimizing area under the receiver operating characteristic curve

Author

Listed:
  • Ozcan, Erhan C.
  • Görgülü, Berk
  • Baydogan, Mustafa G.

Abstract

Traditional classification algorithms focus on maximizing classification accuracy, which can lead to poor performance in practice by forcing classifiers to overfit to the majority class. To overcome this issue, various approaches optimize alternative loss functions such as the Area Under the Curve (AUC). AUC is a Receiver Operating Characteristic (ROC) metric that has been widely used to measure classification performance, especially under class imbalance. In this work, we propose a column generation (CG)-based algorithm called Ranking-CG, which learns a model similar to the popular Ranking SVM through approximate maximization of the AUC. Unlike the Ranking SVM, our algorithm uses a column generation method that iteratively adds features to control model complexity, effectively working as an internal feature selection procedure. Our experiments show that column generation can be an important tool for preventing overfitting. We extend Ranking-CG by proposing a prototype generation method, denoted Ranking-CG Prototype, that constructs reference points by solving a non-linear optimization problem. In extensive experiments on 74 binary classification problems, Ranking-CG Prototype yields the best average test AUC among all competing methods while using significantly fewer features than other benchmarks.
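To make the link between AUC and ranking concrete, here is a minimal Python sketch. It is not the authors' Ranking-CG implementation. It relies only on the standard fact that the empirical AUC equals the fraction of (positive, negative) pairs that a scoring function ranks correctly, with ties counted as half. The function names `pairwise_auc` and `pairwise_hinge_loss` are hypothetical, and the hinge surrogate is shown as an assumption about the kind of pairwise loss a Ranking SVM-style model minimizes, not as the paper's exact objective.

```python
from itertools import product


def _split_scores(scores, labels):
    """Separate scores into positives (label 1) and negatives (any other label)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def pairwise_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos, neg = _split_scores(scores, labels)
    total = 0.0
    for p, n in product(pos, neg):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos) * len(neg))


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Average hinge loss over pairs: a convex surrogate for 1 - AUC.

    Assumption: this mirrors the pairwise loss used by Ranking SVM-style
    models in general; it is not taken from the paper itself.
    """
    pos, neg = _split_scores(scores, labels)
    losses = (max(0.0, margin - (p - n)) for p, n in product(pos, neg))
    return sum(losses) / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four positive-negative pairs are ordered correctly.
    print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))

    # Imbalanced data: a constant score labels everything as the majority
    # class (95% accuracy at any threshold above it) yet ranks nothing.
    labels = [1] * 5 + [0] * 95
    print(pairwise_auc([0.0] * 100, labels))
```

On the imbalanced example, a classifier that assigns the same score to every point can reach 95% accuracy by always predicting the majority class, while its AUC stays at 0.5. This is the gap between accuracy and AUC that the abstract describes.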

Suggested Citation

  • Ozcan, Erhan C. & Görgülü, Berk & Baydogan, Mustafa G., 2024. "Column generation-based prototype learning for optimizing area under the receiver operating characteristic curve," European Journal of Operational Research, Elsevier, vol. 314(1), pages 297-307.
  • Handle: RePEc:eee:ejores:v:314:y:2024:i:1:p:297-307
    DOI: 10.1016/j.ejor.2023.11.016

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221723008573
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ejor.2023.11.016?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
