Printed from https://ideas.repec.org/a/eee/ejores/v261y2017i2p656-665.html

Cost-based feature selection for Support Vector Machines: An application in credit scoring

Author

Listed:
  • Maldonado, Sebastián
  • Pérez, Juan
  • Bravo, Cristián

Abstract

In this work we propose two formulations based on Support Vector Machines for simultaneous classification and feature selection that explicitly incorporate attribute acquisition costs. This is a challenging task for two main reasons: the estimation of the acquisition costs is not straightforward and may depend on multivariate factors, and the inter-dependence between variables must be taken into account for the modelling process since companies usually acquire groups of related variables rather than acquiring them individually. Mixed-integer linear programming models are proposed for constructing classifiers that constrain acquisition costs while classifying adequately. Experimental results using credit scoring datasets demonstrate the effectiveness of our methods in terms of predictive performance at a low cost compared to well-known feature selection approaches.
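
As a rough illustration of the kind of model the abstract describes, a cost-constrained linear SVM can be written as a mixed-integer linear program. The sketch below is a generic formulation and should not be read as the formulation from the paper. The symbols are assumptions introduced here for illustration: binary attribute indicators $v_j$, binary group indicators $g_k$, group acquisition costs $c_k$, a budget $B$, weight bounds $l_j \le 0 \le u_j$, and a map $G(j)$ from attribute $j$ to the group it is acquired with.

```latex
\begin{align}
\min_{w,\,b,\,\xi,\,v,\,g} \quad & \sum_{i=1}^{m} \xi_i \\
\text{s.t.} \quad
& y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, && i = 1,\dots,m, \\
& l_j v_j \le w_j \le u_j v_j, && j = 1,\dots,n, \\
& v_j \le g_{G(j)}, && j = 1,\dots,n, \\
& \sum_{k=1}^{K} c_k g_k \le B, \\
& \xi_i \ge 0, \quad v_j \in \{0,1\}, \quad g_k \in \{0,1\}.
\end{align}
```

Under these assumptions, $v_j = 0$ forces $w_j = 0$, so attribute $j$ is dropped from the classifier. An attribute can be selected only if its group $G(j)$ is acquired, and each acquired group is charged once against the budget $B$, which reflects the abstract's point that related variables are usually acquired together.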

Suggested Citation

  • Maldonado, Sebastián & Pérez, Juan & Bravo, Cristián, 2017. "Cost-based feature selection for Support Vector Machines: An application in credit scoring," European Journal of Operational Research, Elsevier, vol. 261(2), pages 656-665.
  • Handle: RePEc:eee:ejores:v:261:y:2017:i:2:p:656-665
    DOI: 10.1016/j.ejor.2017.02.037

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221717301595
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    2. Bravo, Cristián & Maldonado, Sebastián & Weber, Richard, 2013. "Granting and managing loans for micro-entrepreneurs: New developments and practical experiences," European Journal of Operational Research, Elsevier, vol. 227(2), pages 358-366.
    3. Verbraken, Thomas & Bravo, Cristián & Weber, Richard & Baesens, Bart, 2014. "Development and application of consumer credit scoring models using profit-based classification measures," European Journal of Operational Research, Elsevier, vol. 238(2), pages 505-513.
    4. Carrizosa, Emilio & Martín-Barragán, Belén & Morales, Dolores Romero, 2011. "Detecting relevant variables and interactions in supervised classification," European Journal of Operational Research, Elsevier, vol. 213(1), pages 260-269, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Medina-Olivares, Victor & Calabrese, Raffaella & Dong, Yizhe & Shi, Baofeng, 2022. "Spatial dependence in microfinance credit default," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1071-1085.
    2. Luisa Roa & Alejandro Correa-Bahnsen & Gabriel Suarez & Fernando Cortés-Tejada & María A. Luque & Cristián Bravo, 2020. "Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications," Papers 2005.14658, arXiv.org, revised Jan 2021.
    3. Rasa Kanapickiene & Renatas Spicas, 2019. "Credit Risk Assessment Model for Small and Micro-Enterprises: The Case of Lithuania," Risks, MDPI, vol. 7(2), pages 1-23, June.
    4. Carlos Serrano-Cinca & Begoña Gutiérrez-Nieto & Luz López-Palacios, 2015. "Determinants of Default in P2P Lending," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-22, October.
    5. Shen, Feng & Zhang, Xin & Wang, Run & Lan, Dao & Zhou, Wei, 2022. "Sequential optimization three-way decision model with information gain for credit default risk evaluation," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1116-1128.
    6. Kozodoi, Nikita & Lessmann, Stefan & Alamgir, Morteza & Moreira-Matias, Luis & Papakonstantinou, Konstantinos, 2025. "Fighting sampling bias: A framework for training and evaluating credit scoring models," European Journal of Operational Research, Elsevier, vol. 324(2), pages 616-628.
    7. Gero Szepannek, 2022. "An Overview on the Landscape of R Packages for Open Source Scorecard Modelling," Risks, MDPI, vol. 10(3), pages 1-33, March.
    8. Li, Yibei & Wang, Ximei & Djehiche, Boualem & Hu, Xiaoming, 2020. "Credit scoring by incorporating dynamic networked information," European Journal of Operational Research, Elsevier, vol. 286(3), pages 1103-1112.
    9. He Jiang, 2022. "A novel robust structural quadratic forecasting model and applications," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(6), pages 1156-1180, September.
    10. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, Diciembre.
    11. Tsukahara, Fábio Yasuhiro & Kimura, Herbert & Sobreiro, Vinicius Amorim & Zambrano, Juan Carlos Arismendi, 2016. "Validation of default probability models: A stress testing approach," International Review of Financial Analysis, Elsevier, vol. 47(C), pages 70-85.
    12. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    13. Liu, Zhenkun & Zhang, Ying & Abedin, Mohammad Zoynul & Wang, Jianzhou & Yang, Hufang & Gao, Yuyang & Chen, Yinghao, 2024. "Profit-driven fusion framework based on bagging and boosting classifiers for potential purchaser prediction," Journal of Retailing and Consumer Services, Elsevier, vol. 79(C).
    14. Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
    15. Ruize Gao & Shaoze Cui & Yu Wang & Wei Xu, 2025. "Predicting financial distress in high-dimensional imbalanced datasets: a multi-heterogeneous self-paced ensemble learning framework," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 11(1), pages 1-34, December.
    16. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    17. Yang Liu & Fei Huang & Lili Ma & Qingguo Zeng & Jiale Shi, 2024. "Credit scoring prediction leveraging interpretable ensemble learning," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(2), pages 286-308, March.
    18. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    19. Maarouf, Abdurahman & Feuerriegel, Stefan & Pröllochs, Nicolas, 2025. "A fused large language model for predicting startup success," European Journal of Operational Research, Elsevier, vol. 322(1), pages 198-214.
    20. Liu, Zhenkun & Jiang, Ping & De Bock, Koen W. & Wang, Jianzhou & Zhang, Lifang & Niu, Xinsong, 2024. "Extreme gradient boosting trees with efficient Bayesian optimization for profit-driven customer churn prediction," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
