Printed from https://ideas.repec.org/a/gam/jrisks/v7y2019i2p70-d241617.html

Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression

Authors
  • Jessica Pesantez-Narvaez

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Montserrat Guillen

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Manuela Alcañiz

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

Abstract

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performance of logistic regression and XGBoost for predicting the existence of accident claims using telematics data. The dataset contained information from an insurance company about individuals' driving patterns, including total annual distance driven and the percentage of total distance driven in urban areas. Our findings showed that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model, and it demands greater effort with regard to interpretation.
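To make the interpretability point concrete, the following is a minimal sketch of a logistic regression for a binary claim indicator. It is not the authors' code or data, and it does not run XGBoost. The two standardized predictors (annual distance and urban-driving share), the simulated coefficients, and all function names are assumptions chosen purely for illustration, and the fit uses plain gradient descent from the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, seed):
    # Hypothetical data: standardized annual distance and urban-driving share.
    # The generating coefficients below are assumptions, not values from the paper.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        dist = rng.gauss(0.0, 1.0)
        urban = rng.gauss(0.0, 1.0)
        p = sigmoid(-2.0 + 0.8 * dist + 0.5 * urban)
        y = 1 if rng.random() < p else 0
        rows.append(((dist, urban), y))
    return rows


def fit_logistic(rows, lr=0.5, epochs=300):
    # Batch gradient descent on the mean log-loss.
    b, w = 0.0, [0.0, 0.0]
    n = len(rows)
    for _ in range(epochs):
        gb, gw = 0.0, [0.0, 0.0]
        for x, y in rows:
            err = sigmoid(b + w[0] * x[0] + w[1] * x[1]) - y
            gb += err
            gw[0] += err * x[0]
            gw[1] += err * x[1]
        b -= lr * gb / n
        w = [w[j] - lr * gw[j] / n for j in range(2)]
    return b, w


def log_loss(rows, b, w):
    # Mean negative log-likelihood, with probabilities clipped away from 0 and 1.
    eps = 1e-12
    total = 0.0
    for x, y in rows:
        p = min(max(sigmoid(b + w[0] * x[0] + w[1] * x[1]), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(rows)


train = simulate(2000, seed=1)
test = simulate(1000, seed=2)
b, w = fit_logistic(train)

# Intercept-only baseline: predicts the training claim rate for every driver.
base_rate = sum(y for _, y in train) / len(train)
baseline_loss = log_loss(test, math.log(base_rate / (1.0 - base_rate)), [0.0, 0.0])
model_loss = log_loss(test, b, w)

# exp(coefficient) is the multiplicative change in claim odds per one-unit
# (here: one standard deviation) increase in the predictor.
odds_ratios = [math.exp(c) for c in w]
print("odds ratios (distance, urban):", [round(o, 2) for o in odds_ratios])
print("test log-loss: model", round(model_loss, 4), "baseline", round(baseline_loss, 4))
```

Each fitted coefficient maps directly to an odds ratio, which is the kind of reading that a boosted tree ensemble does not offer without additional tooling. A like-for-like comparison with XGBoost would fit both models on the same split and compare out-of-sample metrics.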

Suggested Citation

  • Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2019. "Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression," Risks, MDPI, vol. 7(2), pages 1-16, June.
  • Handle: RePEc:gam:jrisks:v:7:y:2019:i:2:p:70-:d:241617

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/7/2/70/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/7/2/70/
    Download Restriction: no


    Citations



    Cited by:

    1. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "A Synthetic Penalized Logitboost to Model Mortgage Lending with Imbalanced Data," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 281-309, January.
    2. Trufin, Julien & Denuit, Michel, 2021. "Boosting cost-complexity pruned trees On Tweedie responses: the ABT machine," LIDAM Discussion Papers ISBA 2021015, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    3. Meng, Shengwang & Gao, Yaqian & Huang, Yifan, 2022. "Actuarial intelligence in auto insurance: Claim frequency modeling with driving behavior features and improved boosted trees," Insurance: Mathematics and Economics, Elsevier, vol. 106(C), pages 115-127.
    4. Zuleyka Díaz Martínez & José Fernández Menéndez & Luis Javier García Villalba, 2023. "Tariff Analysis in Automobile Insurance: Is It Time to Switch from Generalized Linear Models to Generalized Additive Models?," Mathematics, MDPI, vol. 11(18), pages 1-16, September.
    5. Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
    6. Francis Duval & Jean‐Philippe Boucher & Mathieu Pigeon, 2023. "Enhancing claim classification with feature extraction from anomaly‐detection‐derived routine and peculiarity profiles," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 90(2), pages 421-458, June.
    7. Nelson Kemboi Yego & Juma Kasozi & Joseph Nkurunziza, 2021. "A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya," Data, MDPI, vol. 6(11), pages 1-17, November.
    8. Viktor Stojkoski & Petar Jolakoski & Igor Ivanovski, 2021. "The short‐run impact of COVID‐19 on the activity in the insurance industry in the Republic of North Macedonia," Risk Management and Insurance Review, American Risk and Insurance Association, vol. 24(3), pages 221-242, September.
    9. Thomas Poufinas & Periklis Gogas & Theophilos Papadimitriou & Emmanouil Zaganidis, 2023. "Machine Learning in Forecasting Motor Insurance Claims," Risks, MDPI, vol. 11(9), pages 1-19, September.
    10. Nemanja Milanović & Miloš Milosavljević & Slađana Benković & Dušan Starčević & Željko Spasenić, 2020. "An Acceptance Approach for Novel Technologies in Car Insurance," Sustainability, MDPI, vol. 12(24), pages 1-15, December.
    11. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach," Mathematics, MDPI, vol. 9(5), pages 1-21, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christopher Blier-Wong & Hélène Cossette & Luc Lamontagne & Etienne Marceau, 2020. "Machine Learning in P&C Insurance: A Review for Pricing and Reserving," Risks, MDPI, vol. 9(1), pages 1-26, December.
    2. Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
    3. Donatella Porrini & Giulio Fusco & Cosimo Magazzino, 2020. "Black boxes and market efficiency: the effect on premiums in the Italian motor-vehicle insurance market," European Journal of Law and Economics, Springer, vol. 49(3), pages 455-472, June.
    4. Shengkun Xie, 2021. "Improving Explainability of Major Risk Factors in Artificial Neural Networks for Auto Insurance Rate Regulation," Risks, MDPI, vol. 9(7), pages 1-21, July.
    5. Yujiao Yang & Qiongxia Song, 2014. "Jump detection in time series nonparametric regression models: a polynomial spline approach," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(2), pages 325-344, April.
    6. Deprez, Laurens & Antonio, Katrien & Boute, Robert, 2021. "Pricing service maintenance contracts using predictive analytics," European Journal of Operational Research, Elsevier, vol. 290(2), pages 530-545.
    7. Mattrand, C. & Bourinet, J.-M., 2014. "The cross-entropy method for reliability assessment of cracked structures subjected to random Markovian loads," Reliability Engineering and System Safety, Elsevier, vol. 123(C), pages 171-182.
    8. R. Y. Rubinstein, 2005. "A Stochastic Minimum Cross-Entropy Method for Combinatorial Optimization and Rare-event Estimation," Methodology and Computing in Applied Probability, Springer, vol. 7(1), pages 5-50, March.
    9. Alfiero, Simona & Battisti, Enrico & Hadjielias, Elias, 2022. "Black box technology, usage-based insurance, and prediction of purchase behavior: Evidence from the auto insurance sector," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
    10. Kin-Ping Hui, 2011. "Cooperative Cross-Entropy method for generating entangled networks," Annals of Operations Research, Springer, vol. 189(1), pages 205-214, September.
    11. Mathieu Balesdent & Jérôme Morio & Loïc Brevault, 2016. "Rare Event Probability Estimation in the Presence of Epistemic Uncertainty on Input Probability Distribution Parameters," Methodology and Computing in Applied Probability, Springer, vol. 18(1), pages 197-216, March.
    12. Tran, Cong Quoc & Keyvan-Ekbatani, Mehdi & Ngoduy, Dong & Watling, David, 2021. "Stochasticity and environmental cost inclusion for electric vehicles fast-charging facility deployment," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 154(C).
    13. Mercedes Ayuso & Montserrat Guillen & Ana María Pérez-Marín, 2016. "Telematics and Gender Discrimination: Some Usage-Based Evidence on Whether Men’s Risk of Accidents Differs from Women’s," Risks, MDPI, vol. 4(2), pages 1-10, April.
    14. Xi Chen & Enlu Zhou, 2015. "Population model-based optimization," Journal of Global Optimization, Springer, vol. 63(1), pages 125-148, September.
    15. Etye Steinberg, 2022. "Run for Your Life: The Ethics of Behavioral Tracking in Insurance," Journal of Business Ethics, Springer, vol. 179(3), pages 665-682, September.
    16. Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Feb 2024.
    17. Lvyang Qiu & Shuyu Li & Yunsick Sung, 2021. "3D-DCDAE: Unsupervised Music Latent Representations Learning Method Based on a Deep 3D Convolutional Denoising Autoencoder for Music Genre Classification," Mathematics, MDPI, vol. 9(18), pages 1-17, September.
    18. Zhou, Yuekuan & Zheng, Siqian, 2020. "Climate adaptive optimal design of an aerogel glazing system with the integration of a heuristic teaching-learning-based algorithm in machine learning-based optimization," Renewable Energy, Elsevier, vol. 153(C), pages 375-391.
    19. Kevin Kuo & Daniel Lupton, 2020. "Towards Explainability of Machine Learning Models in Insurance Pricing," Papers 2003.10674, arXiv.org.
    20. Akimoto, Youhei & Auger, Anne & Hansen, Nikolaus, 2022. "An ODE method to prove the geometric convergence of adaptive stochastic algorithms," Stochastic Processes and their Applications, Elsevier, vol. 145(C), pages 269-307.
