Printed from https://ideas.repec.org/p/arx/papers/2601.07637.html

Reinforcement Learning for Micro-Level Claims Reserving

Author

Listed:
  • Benjamin Avanzi
  • Ronald Richman
  • Bernard Wong
  • Mario Wuthrich
  • Yagebu Xie

Abstract

Outstanding claim liabilities are revised repeatedly as claims develop, yet most modern reserving models are trained as one-shot predictors and typically learn only from settled claims. We formulate individual claims reserving as a claim-level Markov decision process in which an agent sequentially updates outstanding claim liability (OCL) estimates over development, using continuous actions and a reward design that balances accuracy with stable reserve revisions. A key advantage of this reinforcement learning (RL) approach is that it can learn from all observed claim trajectories, including claims that remain open at valuation, thereby avoiding the reduced sample size and selection effects inherent in supervised methods trained on ultimate outcomes only. We also introduce practical components needed for actuarial use -- initialisation of new claims, temporally consistent tuning via a rolling-settlement scheme, and an importance-weighting mechanism to mitigate portfolio-level underestimation driven by the rarity of large claims. On CAS and SPLICE synthetic general insurance datasets, the proposed Soft Actor-Critic implementation delivers competitive claim-level accuracy and strong aggregate OCL performance, particularly for the immature claim segments that drive most of the liability.
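
As a rough illustration of the claim-level decision process described in the abstract, the Python sketch below rolls out one claim under a sequence of continuous actions and scores each step with a reward that trades accuracy against revision size. The abstract does not give the state, action, or reward definitions, so everything here is an assumption made for illustration: the names ClaimEpisode, reward, rollout and stability_weight, the multiplicative form of the action, and the absolute-error reward. The toy claim is fully settled so that its true OCL is known at every step; how open claims are rewarded in the paper is not covered by this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClaimEpisode:
    """Hypothetical container: payments[t] is the amount paid in development period t."""
    payments: List[float]

    def outstanding(self, t: int) -> float:
        """True outstanding liability after the payment made in period t."""
        return sum(self.payments[t + 1:])


def reward(ocl_new: float, ocl_old: float, true_ocl: float,
           stability_weight: float = 0.1) -> float:
    """Assumed reward: penalise estimation error plus a weighted revision size."""
    accuracy_penalty = abs(ocl_new - true_ocl)
    revision_penalty = abs(ocl_new - ocl_old)
    return -(accuracy_penalty + stability_weight * revision_penalty)


def rollout(claim: ClaimEpisode, initial_ocl: float, actions: List[float],
            stability_weight: float = 0.1) -> List[Tuple[float, float]]:
    """Apply one continuous action per period and record (OCL estimate, reward).

    Each action is read as a relative revision: new = old * (1 + action),
    floored at zero. This parameterisation is an assumption of the sketch.
    """
    ocl = initial_ocl
    trajectory = []
    for t, action in enumerate(actions):
        new_ocl = max(0.0, ocl * (1.0 + action))
        r = reward(new_ocl, ocl, claim.outstanding(t), stability_weight)
        trajectory.append((new_ocl, r))
        ocl = new_ocl
    return trajectory


if __name__ == "__main__":
    claim = ClaimEpisode(payments=[100.0, 50.0, 25.0, 0.0])
    for step, (estimate, r) in enumerate(rollout(claim, 100.0, [-0.25, -0.5, -1.0])):
        print(f"period {step}: OCL estimate {estimate:.2f}, reward {r:.2f}")
```

A larger stability_weight makes the same revision cost more, which is one simple way to express the preference for stable reserve revisions. An agent such as Soft Actor-Critic would choose the actions from the observed claim state instead of taking them from a fixed list as here.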

Suggested Citation

  • Benjamin Avanzi & Ronald Richman & Bernard Wong & Mario Wuthrich & Yagebu Xie, 2026. "Reinforcement Learning for Micro-Level Claims Reserving," Papers 2601.07637, arXiv.org.
  • Handle: RePEc:arx:papers:2601.07637

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2601.07637
    File Function: Latest version
    Download Restriction: no


    Citations



    Cited by:

    1. Ronald Richman & Mario V. Wuthrich, 2026. "From Chain-Ladder to Individual Claims Reserving," Papers 2602.15385, arXiv.org, revised Feb 2026.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Benjamin Avanzi & Matthew Lambrianidis & Greg Taylor & Bernard Wong, 2025. "On the use of case estimate and transactional payment data in neural networks for individual loss reserving," Papers 2601.05274, arXiv.org.
    2. Jan Janoušek & Michal Pešta, 2025. "Bagging and regression trees in individual claims reserving," Statistical Papers, Springer, vol. 66(4), pages 1-26, June.
    3. Łukasz Delong & Mario V. Wüthrich, 2020. "Neural Networks for the Joint Development of Individual Payments and Claim Incurred," Risks, MDPI, vol. 8(2), pages 1-34, April.
    4. Stephan M. Bischofberger, 2020. "In-Sample Hazard Forecasting Based on Survival Models with Operational Time," Risks, MDPI, vol. 8(1), pages 1-17, January.
    5. Maciak, Matúš & Okhrin, Ostap & Pešta, Michal, 2021. "Infinitely stochastic micro reserving," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 30-58.
    6. Avanzi, Benjamin & Taylor, Greg & Wang, Melantha & Wong, Bernard, 2021. "SynthETIC: An individual insurance claim simulator with feature control," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 296-308.
    7. Benjamin Avanzi & Yanfeng Li & Bernard Wong & Alan Xian, 2022. "Ensemble distributional forecasting for insurance loss reserving," Papers 2206.08541, arXiv.org, revised Jun 2024.
    8. Muhammed Taher Al-Mudafer & Benjamin Avanzi & Greg Taylor & Bernard Wong, 2021. "Stochastic loss reserving with mixture density neural networks," Papers 2108.07924, arXiv.org.
    9. Perla, Francesca & Scognamiglio, Salvatore & Spadaro, Andrea & Zanetti, Paolo, 2025. "Transformers-based least square Monte Carlo for solvency calculation in life insurance," Insurance: Mathematics and Economics, Elsevier, vol. 125(C).
    10. Emmanuel Jordy Menvouta & Jolien Ponnet & Robin Van Oirbeek & Tim Verdonck, 2022. "mCube: Multinomial Micro-level reserving Model," Papers 2212.00101, arXiv.org.
    11. Denuit, Michel & Trufin, Julien, 2022. "Autocalibration by balance correction in nonlife insurance pricing," LIDAM Discussion Papers ISBA 2022041, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    12. Francis Duval & Mathieu Pigeon, 2019. "Individual Loss Reserving Using a Gradient Boosting-Based Approach," Risks, MDPI, vol. 7(3), pages 1-18, July.
    13. Denuit, Michel & Trufin, Julien, 2022. "Tweedie dominance for autocalibrated predictors and Laplace transform order," LIDAM Discussion Papers ISBA 2022040, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    14. Mario V. Wuthrich & Johanna Ziegel, 2023. "Isotonic Recalibration under a Low Signal-to-Noise Ratio," Papers 2301.02692, arXiv.org.
    15. Hainaut, Donatien, 2025. "In-processing of actuarial and equity fairness constraints for Neural networks," LIDAM Discussion Papers ISBA 2025011, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    16. Matúš Maciak & Ostap Okhrin & Michal Pešta, 2019. "Infinitely Stochastic Micro Forecasting," Papers 1908.10636, arXiv.org, revised Sep 2019.
    17. Yaojun Zhang & Lanpeng Ji & Georgios Aivaliotis & Charles Taylor, 2023. "Bayesian CART models for insurance claims frequency," Papers 2303.01923, arXiv.org, revised Dec 2023.
    18. Jamotton, Charlotte & Hainaut, Donatien, 2025. "A multivariate energy-based fairness adjuster for premiums," LIDAM Discussion Papers ISBA 2025009, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    19. Matúš Maciak & Ostap Okhrin & Michal Pešta, 2018. "Dynamic and granular loss reserving with copulae," Papers 1801.01792, arXiv.org.
    20. Denuit, Michel & Huyghe, Julie & Trufin, Julien & Verdebout, Thomas, 2024. "Testing for auto-calibration with Lorenz and Concentration curves," Insurance: Mathematics and Economics, Elsevier, vol. 117(C), pages 130-139.
