
Predictive analytics of insurance claims using multivariate decision trees

Author

Listed:
  • Quan Zhiyu

    (Department of Mathematics, University of Connecticut, Mansfield, Connecticut, USA)

  • Valdez Emiliano A.

    (Department of Mathematics, University of Connecticut, Mansfield, Connecticut, USA)

Abstract

Because of their many advantages, decision trees have become an increasingly popular alternative predictive tool for building classification and regression models. Their origins date back about five decades; the algorithm can be broadly described as repeatedly partitioning the regions of the explanatory variables, thereby creating a tree-based model for predicting the response. Innovations to the original methods, such as random forests and gradient boosting, have further improved the capabilities of decision trees as predictive models. In addition, extensions of decision trees to multivariate response variables have begun to develop, and the purpose of this paper is to apply multivariate tree models to insurance claims data with correlated responses. This extension inherits several advantages of univariate decision tree models, such as being distribution-free, the ability to rank essential explanatory variables, and high predictive accuracy, to name a few. To illustrate the approach, we analyze a dataset drawn from the Wisconsin Local Government Property Insurance Fund (LGPIF), which offers multi-line insurance coverage of property, motor vehicle, and contractors’ equipment. With multivariate tree models, we are able to capture the inherent relationship among the response variables, and we find that the marginal predictive model based on multivariate trees improves prediction accuracy over that based on univariate trees alone.
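
As a rough illustration of the partitioning idea described in the abstract, the sketch below searches for a single binary split when each observation has several response values. It is not the authors' implementation: the split criterion (squared error summed across response columns), the function names `node_sse` and `best_split`, and the toy data are all assumptions made for this example, and the data are invented rather than taken from the LGPIF.

```python
from statistics import fmean


def node_sse(rows):
    """Squared deviations from the node mean, summed over all response columns."""
    if not rows:
        return 0.0
    total = 0.0
    for column in zip(*rows):
        mean = fmean(column)
        total += sum((value - mean) ** 2 for value in column)
    return total


def best_split(X, Y):
    """Return (feature_index, threshold, sse) for the best single binary split.

    If no split reduces the error, feature_index and threshold are None.
    """
    best = (None, None, node_sse(Y))
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for x, y in zip(X, Y) if x[j] <= threshold]
            right = [y for x, y in zip(X, Y) if x[j] > threshold]
            sse = node_sse(left) + node_sse(right)
            if sse < best[2]:
                best = (j, threshold, sse)
    return best


# Invented toy data: two explanatory variables and two correlated responses.
X = [[1, 5], [2, 3], [3, 8], [10, 4], [11, 7], [12, 2]]
Y = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.9], [5.0, 9.0], [5.2, 9.3], [4.8, 8.8]]

feature, threshold, sse = best_split(X, Y)
print(feature, threshold)
```

Applying `best_split` recursively to the two resulting subsets would grow a full tree; because one split serves all response columns at once, the partition reflects their joint behaviour rather than each response separately.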

Suggested Citation

  • Quan Zhiyu & Valdez Emiliano A., 2018. "Predictive analytics of insurance claims using multivariate decision trees," Dependence Modeling, De Gruyter, vol. 6(1), pages 377-407, December.
  • Handle: RePEc:vrs:demode:v:6:y:2018:i:1:p:377-407:n:22
    DOI: 10.1515/demo-2018-0022

    Download full text from publisher

    File URL: https://doi.org/10.1515/demo-2018-0022
    Download Restriction: no

    References listed on IDEAS

    1. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    2. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.
    3. Peng Shi & Lu Yang, 2018. "Pair Copula Constructions for Insurance Experience Rating," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 122-133, January.
    4. Edward W. Frees & Gee Lee & Lu Yang, 2016. "Multivariate Frequency-Severity Regression Models in Insurance," Risks, MDPI, vol. 4(1), pages 1-36, February.
    5. Frees, Edward W. & Valdez, Emiliano A., 2008. "Hierarchical Insurance Claims Modeling," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1457-1469.
    6. Philippe Deprez & Pavel V. Shevchenko & Mario V. Wuthrich, 2017. "Machine Learning Techniques for Mortality Modeling," Papers 1705.03396, arXiv.org.
    7. Simon C. K. Lee & Sheldon Lin, 2018. "Delta Boosting Machine with Application to General Insurance," North American Actuarial Journal, Taylor & Francis Journals, vol. 22(3), pages 405-425, July.

    Citations



    Cited by:

    1. Christopher Blier-Wong & Hélène Cossette & Luc Lamontagne & Etienne Marceau, 2020. "Machine Learning in P&C Insurance: A Review for Pricing and Reserving," Risks, MDPI, vol. 9(1), pages 1-26, December.
    2. Yves Staudt & Joël Wagner, 2021. "Assessing the Performance of Random Forests for Modeling Claim Severity in Collision Car Insurance," Risks, MDPI, vol. 9(3), pages 1-28, March.
    3. Emer Owens & Barry Sheehan & Martin Mullins & Martin Cunneen & Juliane Ressel & German Castignani, 2022. "Explainable Artificial Intelligence (XAI) in Insurance," Risks, MDPI, vol. 10(12), pages 1-50, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Oh, Rosy & Lee, Youngju & Zhu, Dan & Ahn, Jae Youn, 2021. "Predictive risk analysis using a collective risk model: Choosing between past frequency and aggregate severity information," Insurance: Mathematics and Economics, Elsevier, vol. 96(C), pages 127-139.
    2. Lu Yang & Claudia Czado, 2022. "Two‐part D‐vine copula models for longitudinal insurance claim data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(4), pages 1534-1561, December.
    3. Massimo Costabile & Fabio Viviano, 2021. "Modeling the Future Value Distribution of a Life Insurance Portfolio," Risks, MDPI, vol. 9(10), pages 1-17, October.
    4. Pechon, Florian & Denuit, Michel & Trufin, Julien, 2019. "Home and Motor insurance joined at a household level using multivariate credibility," LIDAM Discussion Papers ISBA 2019013, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    5. Eling, Martin & Jung, Kwangmin, 2018. "Copula approaches for modeling cross-sectional dependence of data breach losses," Insurance: Mathematics and Economics, Elsevier, vol. 82(C), pages 167-180.
    6. Tzougas, George & Jeong, Himchan, 2021. "An expectation-maximization algorithm for the exponential-generalized inverse Gaussian regression model with varying dispersion and shape for modelling the aggregate claim amount," LSE Research Online Documents on Economics 108210, London School of Economics and Political Science, LSE Library.
    7. George Tzougas & Himchan Jeong, 2021. "An Expectation-Maximization Algorithm for the Exponential-Generalized Inverse Gaussian Regression Model with Varying Dispersion and Shape for Modelling the Aggregate Claim Amount," Risks, MDPI, vol. 9(1), pages 1-17, January.
    8. Kaiwen Wang & Jiehui Ding & Kristen R. Lidwell & Scott Manski & Gee Y. Lee & Emilio Xavier Esposito, 2019. "Treatment Level and Store Level Analyses of Healthcare Data," Risks, MDPI, vol. 7(2), pages 1-22, April.
    9. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    10. Christophe Dutang & Quentin Guibert, 2021. "An explicit split point procedure in model-based trees allowing for a quick fitting of GLM trees and GLM forests," Post-Print hal-03448250, HAL.
    11. Oh, Rosy & Jeong, Himchan & Ahn, Jae Youn & Valdez, Emiliano A., 2021. "A multi-year microlevel collective risk model," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 309-328.
    12. Cheung, Eric C.K. & Ni, Weihong & Oh, Rosy & Woo, Jae-Kyung, 2021. "Bayesian credibility under a bivariate prior on the frequency and the severity of claims," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 274-295.
    13. Safari-Katesari Hadi & Zaroudi Samira, 2020. "Count copula regression model using generalized beta distribution of the second kind," Statistics in Transition New Series, Polish Statistical Association, vol. 21(2), pages 1-12, June.
    14. Hadi Safari-Katesari & Samira Zaroudi, 2020. "Count copula regression model using generalized beta distribution of the second kind," Statistics in Transition New Series, Polish Statistical Association, vol. 21(2), pages 1-12, June.
    15. Linwei Hu & Jie Chen & Joel Vaughan & Soroush Aramideh & Hanyu Yang & Kelly Wang & Agus Sudjianto & Vijayan N. Nair, 2021. "Supervised Machine Learning Techniques: An Overview with Applications to Banking," International Statistical Review, International Statistical Institute, vol. 89(3), pages 573-604, December.
    16. Verschuren, Robert Matthijs, 2022. "Frequency-severity experience rating based on latent Markovian risk profiles," Insurance: Mathematics and Economics, Elsevier, vol. 107(C), pages 379-392.
    17. Emer Owens & Barry Sheehan & Martin Mullins & Martin Cunneen & Juliane Ressel & German Castignani, 2022. "Explainable Artificial Intelligence (XAI) in Insurance," Risks, MDPI, vol. 10(12), pages 1-50, December.
    18. Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
    19. Chenglong Ye & Lin Zhang & Mingxuan Han & Yanjia Yu & Bingxin Zhao & Yuhong Yang, 2022. "Combining Predictions of Auto Insurance Claims," Econometrics, MDPI, vol. 10(2), pages 1-15, April.
    20. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
