Robust Risk-Aware Reinforcement Learning

Author

Listed:

Sebastian Jaimungal
Silvana Pesenti
Ye Sheng Wang
Hariom Tatsat

Abstract

We present a reinforcement learning (RL) approach for robust optimisation of risk-aware performance criteria. To allow agents to express a wide variety of risk-reward profiles, we assess the value of a policy using rank-dependent expected utility (RDEU). RDEU allows the agent to seek gains while simultaneously protecting itself against downside risk. To robustify optimal policies against model uncertainty, we assess a policy not by its distribution but by the worst possible distribution that lies within a Wasserstein ball around it. Thus, our problem formulation may be viewed as an actor/agent choosing a policy (the outer problem), and an adversary then acting to worsen the performance of that strategy (the inner problem). We develop explicit policy gradient formulae for the inner and outer problems, and show their efficacy on three prototypical financial problems: robust portfolio allocation, optimising a benchmark, and statistical arbitrage.
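
To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes one common discrete (Quiggin-style) form of RDEU, in which a distortion function is applied to the decumulative probabilities of sorted samples, and it assumes equal-size one-dimensional samples for the Wasserstein distance. The function names `rdeu` and `wasserstein_1d`, and the example distortion `p**2`, are hypothetical choices for illustration; the paper's sign conventions and estimators may differ.

```python
import numpy as np


def rdeu(samples, utility, distortion):
    """Empirical rank-dependent expected utility (illustrative form).

    Outcomes are sorted from worst to best; outcome i gets weight
    g(P(X >= x_i)) - g(P(X > x_i)) under the empirical distribution,
    where g is the distortion with g(0) = 0 and g(1) = 1.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    upper = distortion(np.arange(n, 0, -1) / n)       # g(n/n), ..., g(1/n)
    lower = distortion(np.arange(n - 1, -1, -1) / n)  # g((n-1)/n), ..., g(0)
    weights = upper - lower
    return float(np.sum(weights * utility(x)))


def wasserstein_1d(a, b, p=2):
    """p-Wasserstein distance between two equal-size 1-D empirical samples.

    In one dimension the optimal coupling matches sorted samples.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.size != b.size:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b) ** p) ** (1.0 / p))


if __name__ == "__main__":
    outcomes = [1.0, 2.0, 3.0, 4.0]
    # Identity distortion recovers plain expected utility (here the mean, 2.5).
    print(rdeu(outcomes, lambda x: x, lambda q: q))
    # A convex distortion overweights bad outcomes (about 1.875 here).
    print(rdeu(outcomes, lambda x: x, lambda q: q ** 2))
    # Distance between a sample and a shifted copy of it (1.0 here).
    print(wasserstein_1d([0.0, 1.0], [1.0, 2.0]))
```

In this reading, the adversary's inner problem would search over candidate samples whose `wasserstein_1d` distance from the policy's outcome sample stays within the ball radius, choosing the one with the worst `rdeu`; the sketch only evaluates the two quantities and does not implement that search.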

Suggested Citation

Sebastian Jaimungal & Silvana Pesenti & Ye Sheng Wang & Hariom Tatsat, 2021. "Robust Risk-Aware Reinforcement Learning," Papers 2108.10403, arXiv.org, revised Dec 2021.

Handle: RePEc:arx:papers:2108.10403

References listed on IDEAS

Paul Milgrom & Ilya Segal, 2002. "Envelope Theorems for Arbitrary Choice Sets," Econometrica, Econometric Society, vol. 70(2), pages 583-601, March.
Yaari, Menahem E, 1987. "The Dual Theory of Choice under Risk," Econometrica, Econometric Society, vol. 55(1), pages 95-115, January.
Georg Pflug & David Wozabal, 2007. "Ambiguity in portfolio selection," Quantitative Finance, Taylor & Francis Journals, vol. 7(4), pages 435-442.

Citations

Citations are extracted by the CitEc Project.

Cited by:

Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
Christa Cuchiero & Guido Gazzani & Irene Klein, 2022. "Risk measures under model uncertainty: a Bayesian viewpoint," Papers 2204.07115, arXiv.org.
Sebastian Jaimungal, 2022. "Reinforcement learning and stochastic optimisation," Finance and Stochastics, Springer, vol. 26(1), pages 103-129, January.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Volij, Oscar, 2025. "Payoff equivalence in sealed bid auctions and the dual theory of choice under risk: A correction," Economics Letters, Elsevier, vol. 247(C).
Chatterjee, Kalyan & Vijay Krishna, R., 2011. "A nonsmooth approach to nonexpected utility theory under risk," Mathematical Social Sciences, Elsevier, vol. 62(3), pages 166-175.
Silvana Pesenti & Sebastian Jaimungal, 2020. "Portfolio Optimisation within a Wasserstein Ball," Papers 2012.04500, arXiv.org, revised Jun 2022.
Xia Han & Ruodu Wang & Xun Yu Zhou, 2022. "Choquet regularization for reinforcement learning," Papers 2208.08497, arXiv.org.
Maria Andraos & Mario Ghossoub, 2026. "Incentive Pareto Efficiency in Monopoly Insurance Markets with Adverse Selection," Papers 2602.09967, arXiv.org, revised May 2026.
Viet Anh Nguyen & Soroosh Shafiee & Damir Filipović & Daniel Kuhn, 2021. "Mean-Covariance Robust Risk Measurement," Papers 2112.09959, arXiv.org, revised Oct 2025.
Kaido, Hiroaki, 2017. "Asymptotically Efficient Estimation Of Weighted Average Derivatives With An Interval Censored Variable," Econometric Theory, Cambridge University Press, vol. 33(5), pages 1218-1241, October.
- Hiroaki Kaido, 2013. "Asymptotically Efficient Estimation of Weighted Average Derivatives with an Interval Censored Variable," Boston University - Department of Economics - Working Papers Series 2013-022, Boston University - Department of Economics.
- Hiroaki Kaido, 2014. "Asymptotically efficient estimation of weighted average derivatives with an interval censored variable," CeMMAP working papers 03/14, Institute for Fiscal Studies.
- Hiroaki Kaido, 2014. "Asymptotically efficient estimation of weighted average derivatives with an interval censored variable," CeMMAP working papers CWP03/14, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Król, Michał, 2012. "Product differentiation decisions under ambiguous consumer demand and pessimistic expectations," International Journal of Industrial Organization, Elsevier, vol. 30(6), pages 593-604.
- Michal Król, 2011. "Product differentiation decisions under ambiguous consumer demand and pessimistic expectations," Economics Discussion Paper Series 1103, Economics, The University of Manchester.
Loebbing, Jonas, 2018. "An Elementary Theory of Endogenous Technical Change and Wage Inequality," VfS Annual Conference 2018 (Freiburg, Breisgau): Digital Economy 181603, Verein für Socialpolitik / German Economic Association.
Zhi Chen & Melvyn Sim & Huan Xu, 2019. "Distributionally Robust Optimization with Infinitely Constrained Ambiguity Sets," Operations Research, INFORMS, vol. 67(5), pages 1328-1344, September.
Castaño-Martínez, A. & Pigueiras, G. & Ramos, C.D. & Sordo, M.A., 2025. "Ordering higher risks in Yaari's dual theory," Insurance: Mathematics and Economics, Elsevier, vol. 125(C).
Yingkai Li & Boli Xu, 2024. "Learning and Communication Towards Unanimous Consent," Papers 2405.18521, arXiv.org, revised Feb 2026.
Samuel Daudin, 2022. "Optimal Control of Diffusion Processes with Terminal Constraint in Law," Journal of Optimization Theory and Applications, Springer, vol. 195(1), pages 1-41, October.
Cerreia-Vioglio, Simone & Maccheroni, Fabio & Marinacci, Massimo & Montrucchio, Luigi, 2012. "Probabilistic sophistication, second order stochastic dominance and uncertainty aversion," Journal of Mathematical Economics, Elsevier, vol. 48(5), pages 271-283.
- Simone Cerreia-Vioglio & Fabio Maccheroni & Massimo Marinacci & Luigi Montrucchio, 2010. "Probabilistic Sophistication, Second Order Stochastic Dominance, and Uncertainty Aversion," Carlo Alberto Notebooks 174, Collegio Carlo Alberto.
Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," Working Papers hal-03583827, HAL.
- Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," SciencePo Working papers hal-03583827, HAL.
- Eduardo Perez-Richet & Delphine Prady, 2012. "Complicating to Persuade?," Working Papers hal-00675135, HAL.
- Eduardo Perez & Delphine Prady, 2012. "Complicating to Persuade?," Sciences Po publications info:hdl:2441/5mao0mthj59, Sciences Po.
Giraud, Raphaël, 2014. "Second order beliefs models of choice under imprecise risk: non-additive second order beliefs vs. nonlinear second order utility," Theoretical Economics, Econometric Society, vol. 9(3), September.
- Raphaël Giraud, 2014. "Second order beliefs models of choice under imprecise risk: non-additive second order beliefs vs. nonlinear second order utility," Post-Print hal-02878112, HAL.
Stephen P. Jenkins & Philippe Van Kerm, 2016. "Assessing Individual Income Growth," Economica, London School of Economics and Political Science, vol. 83(332), pages 679-703, October.
- Jenkins, Stephen P. & van Kerm, Philippe, 2016. "Assessing individual income growth," LSE Research Online Documents on Economics 66995, London School of Economics and Political Science, LSE Library.
H Zank, 2004. "Deriving Rank-Dependent Expected Utility Through Probabilistic Consistency," Economics Discussion Paper Series 0409, Economics, The University of Manchester.
Goovaerts, M. J. & Dhaene, J., 1999. "Supermodular ordering and stochastic annuities," Insurance: Mathematics and Economics, Elsevier, vol. 24(3), pages 281-290, May.
Ferdinand M. Vieider, 2024. "Decisions Under Uncertainty as Bayesian Inference on Choice Options," Management Science, INFORMS, vol. 70(12), pages 9014-9030, December.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2021-08-30 (Computational Economics)
NEP-ISF-2021-08-30 (Islamic Finance)
NEP-RMG-2021-08-30 (Risk Management)
NEP-UPT-2021-08-30 (Utility Models and Prospect Theory)
