
RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach

Authors

  • Jessica Pesantez-Narvaez

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Montserrat Guillen

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Manuela Alcañiz

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

Abstract

A boosting-based machine learning algorithm is presented to model a binary response with large imbalance, i.e., a rare event. The new method (i) reduces the prediction error of the rare class, and (ii) approximates an econometric model that allows interpretability. RiskLogitboost regression includes a weighting mechanism that oversamples or undersamples observations according to their misclassification likelihood and a generalized least squares bias correction strategy to reduce the prediction error. An illustration using a real French third-party liability motor insurance data set is presented. The results show that RiskLogitboost regression improves the rate of detection of rare events compared to some boosting-based and tree-based algorithms and some existing methods designed to treat imbalanced responses.

Suggested Citation

  • Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach," Mathematics, MDPI, vol. 9(5), pages 1-21, March.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:5:p:579-:d:513498

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/5/579/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/5/579/
    Download Restriction: no

    References listed on IDEAS

    1. King, Gary & Zeng, Langche, 2001. "Logistic Regression in Rare Events Data," Political Analysis, Cambridge University Press, vol. 9(2), pages 137-163, January.
    2. Cuiqing Jiang & Zhao Wang & Ruiya Wang & Yong Ding, 2018. "Loan default prediction by combining soft information extracted from descriptive text in online peer-to-peer lending," Annals of Operations Research, Springer, vol. 266(1), pages 511-529, July.
    3. Yufei Jin & Roderick Rejesus & Bertis Little, 2005. "Binary choice models for rare events data: a crop insurance fraud application," Applied Economics, Taylor & Francis Journals, vol. 37(7), pages 841-848.
    4. Maalouf, Maher & Trafalis, Theodore B., 2011. "Robust weighted kernel logistic regression in imbalanced and rare events data," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 168-183, January.
    5. Zaremba, Adam & Czapkiewicz, Anna, 2017. "Digesting anomalies in emerging European markets: A comparison of factor pricing models," Emerging Markets Review, Elsevier, vol. 31(C), pages 1-15.
    6. Carpenter, Daniel P. & Lewis, David E., 2004. "Political Learning from Rare Events: Poisson Inference, Fiscal Constraints, and the Lifetime of Bureaus," Political Analysis, Cambridge University Press, vol. 12(3), pages 201-232, July.
    7. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "A Synthetic Penalized Logitboost to Model Mortgage Lending with Imbalanced Data," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 281-309, January.
    8. Simon C. K. Lee & Sheldon Lin, 2018. "Delta Boosting Machine with Application to General Insurance," North American Actuarial Journal, Taylor & Francis Journals, vol. 22(3), pages 405-425, July.
    9. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2019. "Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression," Risks, MDPI, vol. 7(2), pages 1-16, June.
    10. Cook, Scott J. & Hays, Jude C. & Franzese, Robert J., 2020. "Fixed effects in rare events data: a penalized maximum likelihood solution," Political Science Research and Methods, Cambridge University Press, vol. 8(1), pages 92-105, January.
    11. Bo, Lijun & Wang, Yongjin & Yang, Xuewei, 2010. "Markov-modulated jump-diffusions for currency option pricing," Insurance: Mathematics and Economics, Elsevier, vol. 46(3), pages 461-469, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Blackman, Allen & Guerrero, Santiago, 2012. "What drives voluntary eco-certification in Mexico?," Journal of Comparative Economics, Elsevier, vol. 40(2), pages 256-268.
    2. Roth, Paula, 2020. "Inequality, Relative Deprivation and Financial Distress: Evidence from Swedish Register Data," Working Paper Series 1374, Research Institute of Industrial Economics.
    3. Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
    4. Dustin C.S. Wagner & Kash Barker, 2014. "Statistical methods for modeling the risk of runway excursions," Journal of Risk Research, Taylor & Francis Journals, vol. 17(7), pages 885-901, August.
    5. Kyungwon Suh, 2023. "Nuclear balance and the initiation of nuclear crises: Does superiority matter?," Journal of Peace Research, Peace Research Institute Oslo, vol. 60(2), pages 337-351, March.
    6. Kenchington, David G. & Shohfi, Thomas D. & Smith, Jared D. & White, Roger M., 2022. "Do sin tax hikes spur cheating in interpersonal exchange?," Accounting, Organizations and Society, Elsevier, vol. 96(C).
    7. Neuberg Richard & Hannah Lauren, 2017. "Loan pricing under estimation risk," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 69-87, June.
    8. Hani M. Samawi & Haresh Rochani & Daniel Linder & Arpita Chatterjee, 2017. "More efficient logistic analysis using moving extreme ranked set sampling," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(4), pages 753-766, March.
    9. Tang, Xinyin & Feng, Chong & Zhu, Jianping & He, Minna, 2022. "How Can We Learn from Borrowers’ Online Behaviors? The Signal Effect of Borrowers’ Platform Involvement on Their Credit Risk," SocArXiv qga8j, Center for Open Science.
    10. Trufin, Julien & Denuit, Michel, 2021. "Boosting cost-complexity pruned trees On Tweedie responses: the ABT machine," LIDAM Discussion Papers ISBA 2021015, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    11. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "A Synthetic Penalized Logitboost to Model Mortgage Lending with Imbalanced Data," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 281-309, January.
    12. Denisa BANULESCU-RADU & Meryem YANKOL-SCHALCK, 2021. "Fraud detection in the era of Machine Learning: a household insurance case," LEO Working Papers / DR LEO 2904, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    13. Milan Kumar Das & Anindya Goswami, 2019. "Testing of binary regime switching models using squeeze duration analysis," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 6(01), pages 1-20, March.
    14. Angel M. Morales & Patrick Tarwater & Indika Mallawaarachchi & Alok Kumar Dwivedi & Juan B. Figueroa-Casas, 2015. "Multinomial logistic regression approach for the evaluation of binary diagnostic test in medical research," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 16(2), pages 203-222, June.
    15. F. Gauthier & D. Germain & B. Hétu, 2017. "Logistic models as a forecasting tool for snow avalanches in a cold maritime climate: northern Gaspésie, Québec, Canada," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 89(1), pages 201-232, October.
    16. Douglas Cumming & Lars Hornuf & Moein Karami & Denis Schweizer, 2023. "Disentangling Crowdfunding from Fraudfunding," Journal of Business Ethics, Springer, vol. 182(4), pages 1103-1128, February.
    17. Eunae Yoo & Elliot Rabinovich & Bin Gu, 2020. "The Growth of Follower Networks on Social Media Platforms for Humanitarian Operations," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2696-2715, December.
    18. Cemal Eren Arbatlı & Quamrul H. Ashraf & Oded Galor & Marc Klemp, 2020. "Diversity and Conflict," Econometrica, Econometric Society, vol. 88(2), pages 727-797, March.
    19. Lo Turco, Alessia & Maggioni, Daniela, 2018. "Effects of Islamic religiosity on bilateral trust in trade: The case of Turkish exports," Journal of Comparative Economics, Elsevier, vol. 46(4), pages 947-965.
    20. Matija Kovacic & Claudio Zoli, 2021. "Ethnic distribution, effective power and conflict," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 57(2), pages 257-299, August.
