IDEAS home Printed from https://ideas.repec.org/a/gam/jrisks/v7y2019i2p70-d241617.html

Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression

Author

Listed:
  • Jessica Pesantez-Narvaez

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Montserrat Guillen

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

  • Manuela Alcañiz

    (Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

Abstract

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performances of logistic regression and XGBoost approaches for predicting the existence of accident claims using telematics data. The dataset contained information from an insurance company about individuals’ driving patterns, including total annual distance driven and the percentage of total distance driven in urban areas. Our findings showed that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model, and it demands greater effort with regard to interpretation.

Suggested Citation

  • Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2019. "Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression," Risks, MDPI, vol. 7(2), pages 1-16, June.
  • Handle: RePEc:gam:jrisks:v:7:y:2019:i:2:p:70-:d:241617

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/7/2/70/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/7/2/70/
    Download Restriction: no

    References listed on IDEAS

    1. Roel Verbelen & Katrien Antonio & Gerda Claeskens, 2018. "Unravelling the predictive power of telematics data in car insurance pricing," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 67(5), pages 1275-1304, November.
    2. Guangyuan Gao & Mario V. Wüthrich, 2019. "Convolutional Neural Network Classification of Telematics Car Driving Data," Risks, MDPI, vol. 7(1), pages 1-18, January.
    3. Pieter-Tjerk de Boer & Dirk Kroese & Shie Mannor & Reuven Rubinstein, 2005. "A Tutorial on the Cross-Entropy Method," Annals of Operations Research, Springer, vol. 134(1), pages 19-67, February.
    4. Hultkrantz, Lars & Nilsson, Jan-Eric & Arvidsson, Sara, 2012. "Voluntary internalization of speeding externalities with vehicle insurance," Transportation Research Part A: Policy and Practice, Elsevier, vol. 46(6), pages 926-937.
    5. Simon C. K. Lee & Sheldon Lin, 2018. "Delta Boosting Machine with Application to General Insurance," North American Actuarial Journal, Taylor & Francis Journals, vol. 22(3), pages 405-425, July.
    6. Jianhua Z. Huang & Lijian Yang, 2004. "Identification of non‐linear additive autoregressive models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(2), pages 463-477, May.

    Citations



    Cited by:

    1. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "A Synthetic Penalized Logitboost to Model Mortgage Lending with Imbalanced Data," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 281-309, January.
    2. Francis Duval & Jean‐Philippe Boucher & Mathieu Pigeon, 2023. "Enhancing claim classification with feature extraction from anomaly‐detection‐derived routine and peculiarity profiles," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 90(2), pages 421-458, June.
    3. Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
    4. Nelson Kemboi Yego & Juma Kasozi & Joseph Nkurunziza, 2021. "A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya," Data, MDPI, vol. 6(11), pages 1-17, November.
    5. Thomas Poufinas & Periklis Gogas & Theophilos Papadimitriou & Emmanouil Zaganidis, 2023. "Machine Learning in Forecasting Motor Insurance Claims," Risks, MDPI, vol. 11(9), pages 1-19, September.
    6. Viktor Stojkoski & Petar Jolakoski & Igor Ivanovski, 2021. "The short‐run impact of COVID‐19 on the activity in the insurance industry in the Republic of North Macedonia," Risk Management and Insurance Review, American Risk and Insurance Association, vol. 24(3), pages 221-242, September.
    7. Meng, Shengwang & Gao, Yaqian & Huang, Yifan, 2022. "Actuarial intelligence in auto insurance: Claim frequency modeling with driving behavior features and improved boosted trees," Insurance: Mathematics and Economics, Elsevier, vol. 106(C), pages 115-127.
    8. Trufin, Julien & Denuit, Michel, 2021. "Boosting cost-complexity pruned trees On Tweedie responses: the ABT machine," LIDAM Discussion Papers ISBA 2021015, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    9. Nemanja Milanović & Miloš Milosavljević & Slađana Benković & Dušan Starčević & Željko Spasenić, 2020. "An Acceptance Approach for Novel Technologies in Car Insurance," Sustainability, MDPI, vol. 12(24), pages 1-15, December.
    10. Zuleyka Díaz Martínez & José Fernández Menéndez & Luis Javier García Villalba, 2023. "Tariff Analysis in Automobile Insurance: Is It Time to Switch from Generalized Linear Models to Generalized Additive Models?," Mathematics, MDPI, vol. 11(18), pages 1-16, September.
    11. Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach," Mathematics, MDPI, vol. 9(5), pages 1-21, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christopher Blier-Wong & Hélène Cossette & Luc Lamontagne & Etienne Marceau, 2020. "Machine Learning in P&C Insurance: A Review for Pricing and Reserving," Risks, MDPI, vol. 9(1), pages 1-26, December.
    2. Donatella Porrini & Giulio Fusco & Cosimo Magazzino, 2020. "Black boxes and market efficiency: the effect on premiums in the Italian motor-vehicle insurance market," European Journal of Law and Economics, Springer, vol. 49(3), pages 455-472, June.
    3. Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
    4. Shengkun Xie, 2021. "Improving Explainability of Major Risk Factors in Artificial Neural Networks for Auto Insurance Rate Regulation," Risks, MDPI, vol. 9(7), pages 1-21, July.
    5. Xiong, Wei & Wang, Dehui & Deng, Dianliang & Wang, Xinyang & Zhang, Wanying, 2022. "Penalized multiply robust estimation in high-order autoregressive processes with missing explanatory variables," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    6. Yujiao Yang & Qiongxia Song, 2014. "Jump detection in time series nonparametric regression models: a polynomial spline approach," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(2), pages 325-344, April.
    7. Montserrat Guillen & Ana M. Pérez-Marín & Mercedes Ayuso & Jens Perch Nielsen, 2018. "Exposure to risk increases the excess of zero accident claims frequency in automobile insurance," IREA Working Papers 201810, University of Barcelona, Research Institute of Applied Economics, revised May 2018.
    8. A. Shibu & M. Reddy, 2014. "Optimal Design of Water Distribution Networks Considering Fuzzy Randomness of Demands Using Cross Entropy Optimization," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 28(12), pages 4075-4094, September.
    9. M Caserta & E Quiñonez Rico, 2009. "A cross entropy-based metaheuristic algorithm for large-scale capacitated facility location problems," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(10), pages 1439-1448, October.
    10. Deprez, Laurens & Antonio, Katrien & Boute, Robert, 2021. "Pricing service maintenance contracts using predictive analytics," European Journal of Operational Research, Elsevier, vol. 290(2), pages 530-545.
    11. Dementyeva, Maria & Verhoef, Erik T., 2016. "Miles, speed, and technology: Traffic safety under oligopolistic insurance," Transportation Research Part B: Methodological, Elsevier, vol. 86(C), pages 147-162.
    12. Noh, Hohsuk & Lee, Eun, 2012. "Component Selection in Additive Quantile Regression Models," LIDAM Discussion Papers ISBA 2012021, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    13. Shafik, Nivien & Tutz, Gerhard, 2009. "Boosting nonlinear additive autoregressive time series," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2453-2464, May.
    14. Altiparmak, Fulya & Dengiz, Berna, 2009. "A cross entropy approach to design of reliable networks," European Journal of Operational Research, Elsevier, vol. 199(2), pages 542-552, December.
    15. Mattrand, C. & Bourinet, J.-M., 2014. "The cross-entropy method for reliability assessment of cracked structures subjected to random Markovian loads," Reliability Engineering and System Safety, Elsevier, vol. 123(C), pages 171-182.
    16. Alibrandi, Umberto, 2014. "A response surface method for stochastic dynamic analysis," Reliability Engineering and System Safety, Elsevier, vol. 126(C), pages 44-53.
    17. Nguyen, Hoa T.M. & Chow, Andy H.F. & Ying, Cheng-shuo, 2021. "Pareto routing and scheduling of dynamic urban rail transit services with multi-objective cross entropy method," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 156(C).
    18. R. Y. Rubinstein, 2005. "A Stochastic Minimum Cross-Entropy Method for Combinatorial Optimization and Rare-event Estimation," Methodology and Computing in Applied Probability, Springer, vol. 7(1), pages 5-50, March.
    19. Alfiero, Simona & Battisti, Enrico & Hadjielias, Elias, 2022. "Black box technology, usage-based insurance, and prediction of purchase behavior: Evidence from the auto insurance sector," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
    20. Kin-Ping Hui, 2011. "Cooperative Cross-Entropy method for generating entangled networks," Annals of Operations Research, Springer, vol. 189(1), pages 205-214, September.

