Printed from https://ideas.repec.org/p/arx/papers/2505.04553.html

Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions

Authors
  • Shanyu Han
  • Yang Liu
  • Xiang Yu

Abstract

We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk, and mean-risk utility. To resolve the time-inconsistency issue, we consider an augmented state space and an auxiliary variable and recast the problem as a two-state optimization problem. We propose a customized Actor-Critic algorithm and establish some theoretical approximation guarantees. A key theoretical contribution is that our results do not require the Markov decision process to be continuous. Additionally, we propose an auxiliary variable sampling method inspired by the alternating minimization algorithm, which is convergent under certain conditions. We validate our approach in simulation experiments with a financial application in statistical arbitrage trading, demonstrating the effectiveness of the algorithm.
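As a rough illustration of what "risk objectives characterized by convex scoring functions" can mean, the sketch below checks numerically that two of the measures named in the abstract arise as the minimum, over a scalar auxiliary variable y, of an expected convex score: variance via the squared error, and Expected Shortfall via the Rockafellar-Uryasev representation. This is not the paper's Actor-Critic algorithm or its sampling method. The function names, the standard normal loss sample, the level alpha = 0.9, and the ternary-search minimizer are all assumptions made for this example.

```python
import random
import statistics


def variance_score(x, y):
    # Squared error: the minimum over y of E[(X - y)^2] is Var(X).
    return (x - y) ** 2


def make_es_score(alpha):
    # Rockafellar-Uryasev score for a loss x: the minimum over y of
    # E[y + (X - y)^+ / (1 - alpha)] is the Expected Shortfall at level alpha.
    def es_score(x, y):
        return y + max(x - y, 0.0) / (1.0 - alpha)
    return es_score


def expected_score(score, samples, y):
    # Monte Carlo estimate of E[score(X, y)] for a fixed auxiliary variable y.
    return sum(score(x, y) for x in samples) / len(samples)


def minimize_convex_1d(f, lo, hi, iters=100):
    # Ternary search; valid because f is convex in y on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical data: simulated losses (illustrative only, not from the paper).
rng = random.Random(0)
losses = [rng.gauss(0.0, 1.0) for _ in range(2000)]
alpha = 0.9
lo, hi = min(losses), max(losses)

# Variance as the minimal expected squared-error score.
y_var = minimize_convex_1d(lambda y: expected_score(variance_score, losses, y), lo, hi)
var_min = expected_score(variance_score, losses, y_var)

# Expected Shortfall as the minimal expected Rockafellar-Uryasev score.
es_score = make_es_score(alpha)
y_es = minimize_convex_1d(lambda y: expected_score(es_score, losses, y), lo, hi)
es_min = expected_score(es_score, losses, y_es)

# Direct empirical estimates for comparison.
var_direct = statistics.pvariance(losses)
tail = sorted(losses)[int(alpha * len(losses)):]
es_direct = sum(tail) / len(tail)

print(f"variance: min-score {var_min:.6f} vs direct {var_direct:.6f}")
print(f"ES:       min-score {es_min:.6f} vs direct {es_direct:.6f}")
```

In both cases the minimizing y is itself meaningful (the sample mean for variance, an empirical quantile for Expected Shortfall), which gives some intuition for why an auxiliary variable can be carried alongside the state, as the abstract describes.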

Suggested Citation

  • Shanyu Han & Yang Liu & Xiang Yu, 2025. "Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions," Papers 2505.04553, arXiv.org, revised May 2025.
  • Handle: RePEc:arx:papers:2505.04553

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2505.04553
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruodu Wang & Yunran Wei, 2020. "Risk functionals with convex level sets," Mathematical Finance, Wiley Blackwell, vol. 30(4), pages 1337-1367, October.
    2. Nicole Bäuerle & Anna Jaśkiewicz, 2024. "Markov decision processes with risk-sensitive criteria: an overview," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 99(1), pages 141-178, April.
    3. Tadese, Mekonnen & Drapeau, Samuel, 2020. "Relative bound and asymptotic comparison of expectile with respect to expected shortfall," Insurance: Mathematics and Economics, Elsevier, vol. 93(C), pages 387-399.
    4. Tobias Fissler & Fangda Liu & Ruodu Wang & Linxiao Wei, 2024. "Elicitability and identifiability of tail risk measures," Papers 2404.14136, arXiv.org, revised Jun 2024.
    5. Righi, Marcelo Brutti & Müller, Fernanda Maria & Moresco, Marlon Ruoso, 2025. "A risk measurement approach from risk-averse stochastic optimization of score functions," Insurance: Mathematics and Economics, Elsevier, vol. 120(C), pages 42-50.
    6. Paul Embrechts & Tiantian Mao & Qiuqi Wang & Ruodu Wang, 2021. "Bayes risk, elicitability, and the Expected Shortfall," Mathematical Finance, Wiley Blackwell, vol. 31(4), pages 1190-1217, October.
    7. Samuel Drapeau & Mekonnen Tadese, 2019. "Relative Bound and Asymptotic Comparison of Expectile with Respect to Expected Shortfall," Papers 1906.09729, arXiv.org, revised Jun 2020.
    8. Tobias Fissler & Yannick Hoga, 2024. "How to Compare Copula Forecasts?," Papers 2410.04165, arXiv.org.
    9. Bellini, Fabio & Klar, Bernhard & Müller, Alfred & Rosazza Gianin, Emanuela, 2014. "Generalized quantiles as risk measures," Insurance: Mathematics and Economics, Elsevier, vol. 54(C), pages 41-48.
    10. Xia Han & Liyuan Lin & Ruodu Wang, 2022. "Diversification quotients: Quantifying diversification via risk measures," Papers 2206.13679, arXiv.org, revised Jul 2024.
    11. Marie Kratz & Yen H Lok & Alexander J Mcneil, 2016. "Multinomial var backtests: A simple implicit approach to backtesting expected shortfall," Working Papers hal-01424279, HAL.
    12. Xia Han & Liyuan Lin & Ruodu Wang, 2023. "Diversification quotients based on VaR and ES," Papers 2301.03517, arXiv.org, revised May 2023.
    13. Han, Xia & Lin, Liyuan & Wang, Ruodu, 2023. "Diversification quotients based on VaR and ES," Insurance: Mathematics and Economics, Elsevier, vol. 113(C), pages 185-197.
    14. Edgars Jakobsons & Steven Vanduffel, 2015. "Dependence Uncertainty Bounds for the Expectile of a Portfolio," Risks, MDPI, vol. 3(4), pages 1-25, December.
    15. Mohammed Berkhouch & Fernanda Maria Müller & Ghizlane Lakhnati & Marcelo Brutti Righi, 2022. "Deviation-Based Model Risk Measures," Computational Economics, Springer;Society for Computational Economics, vol. 59(2), pages 527-547, February.
    16. Tobias Fissler & Jana Hlavinová & Birgit Rudloff, 2021. "Elicitability and identifiability of set-valued measures of systemic risk," Finance and Stochastics, Springer, vol. 25(1), pages 133-165, January.
    17. Werner Ehm & Tilmann Gneiting & Alexander Jordan & Fabian Krüger, 2016. "Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(3), pages 505-562, June.
    18. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    19. Silvana M. Pesenti & Steven Vanduffel, 2023. "Optimal Transport Divergences induced by Scoring Functions," Papers 2311.12183, arXiv.org, revised Apr 2024.
    20. Horikawa, Hiroaki & Nakagawa, Kei, 2024. "Relationship between deep hedging and delta hedging: Leveraging a statistical arbitrage strategy," Finance Research Letters, Elsevier, vol. 62(PA).
