Printed from https://ideas.repec.org/p/arx/papers/2108.10403.html

Robust Risk-Aware Reinforcement Learning

Authors

Listed:
  • Sebastian Jaimungal
  • Silvana Pesenti
  • Ye Sheng Wang
  • Hariom Tatsat

Abstract

We present a reinforcement learning (RL) approach for robust optimisation of risk-aware performance criteria. To allow agents to express a wide variety of risk-reward profiles, we assess the value of a policy using rank-dependent expected utility (RDEU). RDEU allows the agent to seek gains while simultaneously protecting itself against downside risk. To robustify optimal policies against model uncertainty, we assess a policy not by its distribution but by the worst possible distribution that lies within a Wasserstein ball around it. Thus, our problem formulation may be viewed as an actor/agent choosing a policy (the outer problem), and an adversary then acting to worsen the performance of that strategy (the inner problem). We develop explicit policy gradient formulae for the inner and outer problems, and show their efficacy on three prototypical financial problems: robust portfolio allocation, optimising a benchmark, and statistical arbitrage.
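As a rough illustration of the two ingredients named in the abstract, the sketch below estimates an RDEU from samples and measures the Wasserstein distance between two equal-size empirical samples on the real line. It is not taken from the paper: the function names `rdeu` and `wasserstein_1d`, the choice to distort the survival probability, and the example utility and distortion are all assumptions made here, and sign and distortion conventions for RDEU differ between references.

```python
from typing import Callable, Sequence


def rdeu(samples: Sequence[float],
         utility: Callable[[float], float],
         distortion: Callable[[float], float]) -> float:
    """Sample estimate of rank-dependent expected utility (assumed convention).

    The distortion g is applied to the empirical survival probability, so the
    i-th smallest outcome receives weight g(P(X >= x_i)) - g(P(X > x_i)).
    With g(p) = p this reduces to the plain sample mean of the utility.
    """
    xs = sorted(samples)  # ascending order statistics
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        weight = distortion((n - i) / n) - distortion((n - i - 1) / n)
        total += utility(x) * weight
    return total


def wasserstein_1d(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """p-Wasserstein distance between two equal-size empirical samples in 1-D.

    On the real line the optimal coupling pairs the sorted samples.
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    xs, ys = sorted(a), sorted(b)
    return (sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)) ** (1.0 / p)


# Hypothetical terminal P&L samples, linear utility.
pnl = [1.0, 2.0, 3.0, 4.0]
neutral = rdeu(pnl, lambda x: x, lambda q: q)           # plain expectation
pessimistic = rdeu(pnl, lambda x: x, lambda q: q ** 2)  # convex g overweights low outcomes

# A shifted sample is one candidate distribution at distance 0.5 from the original.
shifted = [x - 0.5 for x in pnl]
distance = wasserstein_1d(pnl, shifted)
```

Under these assumptions a convex distortion lowers the value relative to the plain expectation, which is one way of encoding protection against downside risk. The shifted sample is only a single point on the boundary of a Wasserstein ball of radius 0.5; finding the worst distribution inside the ball is the inner problem the abstract refers to and is not attempted here.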

Suggested Citation

  • Sebastian Jaimungal & Silvana Pesenti & Ye Sheng Wang & Hariom Tatsat, 2021. "Robust Risk-Aware Reinforcement Learning," Papers 2108.10403, arXiv.org, revised Dec 2021.
  • Handle: RePEc:arx:papers:2108.10403

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2108.10403
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Paul Milgrom & Ilya Segal, 2002. "Envelope Theorems for Arbitrary Choice Sets," Econometrica, Econometric Society, vol. 70(2), pages 583-601, March.
    2. Yaari, Menahem E, 1987. "The Dual Theory of Choice under Risk," Econometrica, Econometric Society, vol. 55(1), pages 95-115, January.
    3. Georg Pflug & David Wozabal, 2007. "Ambiguity in portfolio selection," Quantitative Finance, Taylor & Francis Journals, vol. 7(4), pages 435-442.

    Citations

    Cited by:

    1. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    2. Christa Cuchiero & Guido Gazzani & Irene Klein, 2022. "Risk measures under model uncertainty: a Bayesian viewpoint," Papers 2204.07115, arXiv.org.
    3. Sebastian Jaimungal, 2022. "Reinforcement learning and stochastic optimisation," Finance and Stochastics, Springer, vol. 26(1), pages 103-129, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chatterjee, Kalyan & Vijay Krishna, R., 2011. "A nonsmooth approach to nonexpected utility theory under risk," Mathematical Social Sciences, Elsevier, vol. 62(3), pages 166-175.
    2. Silvana Pesenti & Sebastian Jaimungal, 2020. "Portfolio Optimisation within a Wasserstein Ball," Papers 2012.04500, arXiv.org, revised Jun 2022.
    3. Xia Han & Ruodu Wang & Xun Yu Zhou, 2022. "Choquet regularization for reinforcement learning," Papers 2208.08497, arXiv.org.
    4. Viet Anh Nguyen & Soroosh Shafiee & Damir Filipović & Daniel Kuhn, 2021. "Mean-Covariance Robust Risk Measurement," Papers 2112.09959, arXiv.org, revised Nov 2023.
    5. Kaido, Hiroaki, 2017. "Asymptotically Efficient Estimation Of Weighted Average Derivatives With An Interval Censored Variable," Econometric Theory, Cambridge University Press, vol. 33(5), pages 1218-1241, October.
    6. Król, Michał, 2012. "Product differentiation decisions under ambiguous consumer demand and pessimistic expectations," International Journal of Industrial Organization, Elsevier, vol. 30(6), pages 593-604.
    7. Loebbing, Jonas, 2018. "An Elementary Theory of Endogenous Technical Change and Wage Inequality," VfS Annual Conference 2018 (Freiburg, Breisgau): Digital Economy 181603, Verein für Socialpolitik / German Economic Association.
    8. Zhi Chen & Melvyn Sim & Huan Xu, 2019. "Distributionally Robust Optimization with Infinitely Constrained Ambiguity Sets," Operations Research, INFORMS, vol. 67(5), pages 1328-1344, September.
    9. Stefanie Stantcheva, 2020. "Dynamic Taxation," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 801-831, August.
    10. Drouhin, Nicolas, 2015. "A rank-dependent utility model of uncertain lifetime," Journal of Economic Dynamics and Control, Elsevier, vol. 53(C), pages 208-224.
    11. Alain Chateauneuf & Patrick Moyes, 2005. "Lorenz non-consistent welfare and inequality measurement," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 2(2), pages 61-87, January.
    12. Koessler, Frédéric & Skreta, Vasiliki, 2016. "Informed seller with taste heterogeneity," Journal of Economic Theory, Elsevier, vol. 165(C), pages 456-471.
    13. Epstein, Larry G. & Zin, Stanley E., 2001. "The independence axiom and asset returns," Journal of Empirical Finance, Elsevier, vol. 8(5), pages 537-572, December.
    14. Samuel Daudin, 2022. "Optimal Control of Diffusion Processes with Terminal Constraint in Law," Journal of Optimization Theory and Applications, Springer, vol. 195(1), pages 1-41, October.
    15. Itzhak Gilboa & Andrew Postlewaite & Larry Samuelson & David Schmeidler, 2019. "What are axiomatizations good for?," Theory and Decision, Springer, vol. 86(3), pages 339-359, May.
    16. Filiz-Ozbay, Emel & Guryan, Jonathan & Hyndman, Kyle & Kearney, Melissa & Ozbay, Erkut Y., 2015. "Do lottery payments induce savings behavior? Evidence from the lab," Journal of Public Economics, Elsevier, vol. 126(C), pages 1-24.
    17. Cerreia-Vioglio, Simone & Maccheroni, Fabio & Marinacci, Massimo & Montrucchio, Luigi, 2012. "Probabilistic sophistication, second order stochastic dominance and uncertainty aversion," Journal of Mathematical Economics, Elsevier, vol. 48(5), pages 271-283.
    18. Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," Working Papers hal-03583827, HAL.
    19. ,, 2014. "Second order beliefs models of choice under imprecise risk: non-additive second order beliefs vs. nonlinear second order utility," Theoretical Economics, Econometric Society, vol. 9(3), September.
    20. Stephen P. Jenkins & Philippe Van Kerm, 2016. "Assessing Individual Income Growth," Economica, London School of Economics and Political Science, vol. 83(332), pages 679-703, October.
