Printed from https://ideas.repec.org/p/arx/papers/2012.11715.html

Off-Policy Optimization of Portfolio Allocation Policies under Constraints

Authors

  • Nymisha Bandi
  • Theja Tulabandhula

Abstract

The dynamic portfolio optimization problem in finance frequently requires learning policies that adhere to various constraints, driven by investor preferences and risk. We motivate this problem of finding an allocation policy within a sequential decision making framework and study the effects of: (a) using data collected under previously employed policies, which may be sub-optimal and constraint-violating, and (b) imposing desired constraints while computing near-optimal policies with this data. Our framework relies on solving a minimax objective, where one player evaluates policies via off-policy estimators, and the opponent uses an online learning strategy to control constraint violations. We extensively investigate various choices for off-policy estimation and their corresponding optimization sub-routines, and quantify their impact on computing constraint-aware allocation policies. Our study shows promising results for constructing such policies when back-tested on historical equities data, under various regimes of operation, dimensionality and constraints.
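The minimax idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a one-step (bandit-style) setting, a plain importance-sampling off-policy estimator, a finite set of candidate allocation policies, and a projected gradient update for a single Lagrange multiplier. All names (`is_estimate`, `solve_constrained`, `toy_logged_data`, `cost_limit`) and the toy data are hypothetical.

```python
def is_estimate(logged, policy, key):
    """Importance-sampling estimate of the mean of `key` under `policy`.

    Each logged record stores the action taken, the probability with which
    the (assumed known) behavior policy chose it, and the observed values.
    """
    total = 0.0
    for rec in logged:
        weight = policy[rec["action"]] / rec["behavior_prob"]
        total += weight * rec[key]
    return total / len(logged)


def solve_constrained(logged, candidates, cost_limit,
                      rounds=200, step=0.5, lam_max=10.0):
    """Play a two-player game on the Lagrangian reward - lam * (cost - limit).

    The policy player best-responds over `candidates`; the opponent raises
    the multiplier `lam` when the estimated constraint is violated and
    lowers it otherwise. Returns the mixture over candidates and final lam.
    """
    rewards = [is_estimate(logged, p, "reward") for p in candidates]
    costs = [is_estimate(logged, p, "cost") for p in candidates]
    lam = 0.0
    counts = [0] * len(candidates)
    for _ in range(rounds):
        # Policy player: maximize the Lagrangian for the current multiplier.
        scores = [r - lam * (c - cost_limit) for r, c in zip(rewards, costs)]
        best = max(range(len(candidates)), key=scores.__getitem__)
        counts[best] += 1
        # Opponent: projected gradient step on the constraint violation.
        violation = costs[best] - cost_limit
        lam = min(lam_max, max(0.0, lam + step * violation))
    mixture = [c / rounds for c in counts]
    return mixture, lam


def toy_logged_data(n=100):
    """Hypothetical log: action 0 is 'safe', action 1 is 'risky'.

    The behavior policy picks each action with probability 0.5.
    """
    data = []
    for i in range(n):
        action = i % 2
        data.append({
            "action": action,
            "behavior_prob": 0.5,
            "reward": 1.0 if action == 1 else 0.2,
            "cost": 1.0 if action == 1 else 0.0,
        })
    return data


logged = toy_logged_data()
candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # P(safe), P(risky)
mixture, lam = solve_constrained(logged, candidates, cost_limit=0.5)
print("mixture over candidates:", mixture, "final multiplier:", lam)
```

In this toy setup the unconstrained best response is the all-risky policy, so the multiplier grows until the policy player starts alternating between candidates. The returned mixture, rather than the last iterate, is what approximately respects the cost limit.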

Suggested Citation

  • Nymisha Bandi & Theja Tulabandhula, 2020. "Off-Policy Optimization of Portfolio Allocation Policies under Constraints," Papers 2012.11715, arXiv.org.
  • Handle: RePEc:arx:papers:2012.11715

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2012.11715
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Sarah Perrin & Thierry Roncalli, 2019. "Machine Learning Optimization Algorithms & Portfolio Allocation," Papers 1909.10233, arXiv.org.
    2. Zhengyao Jiang & Dixing Xu & Jinjun Liang, 2017. "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem," Papers 1706.10059, arXiv.org, revised Jul 2017.
    3. XingYu Fu & JinHong Du & YiFeng Guo & MingWen Liu & Tao Dong & XiuWen Duan, 2018. "A Machine Learning Framework for Stock Selection," Papers 1806.01743, arXiv.org, revised Aug 2018.
    4. Freund, Yoav & Schapire, Robert E., 1999. "Adaptive Game Playing Using Multiplicative Weights," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 79-103, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahmet Murat Ozbayoglu & Mehmet Ugur Gudelek & Omer Berat Sezer, 2020. "Deep Learning for Financial Applications : A Survey," Papers 2002.05786, arXiv.org.
    2. Jiahua Xu & Daniel Perez & Yebo Feng & Benjamin Livshits, 2023. "Auto.gov: Learning-based On-chain Governance for Decentralized Finance (DeFi)," Papers 2302.09551, arXiv.org, revised May 2023.
    3. Amir Mosavi & Pedram Ghamisi & Yaser Faghan & Puhong Duan, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Papers 2004.01509, arXiv.org.
    4. Reza Bradrania & Davood Pirayesh Neghab & Mojtaba Shafizadeh, 2022. "State-dependent stock selection in index tracking: a machine learning approach," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 36(1), pages 1-28, March.
    5. Karl Schlag & Andriy Zapechelnyuk, 2009. "Decision Making in Uncertain and Changing Environments," Discussion Papers 19, Kyiv School of Economics.
    6. Alexandre Carbonneau & Frédéric Godin, 2021. "Deep equal risk pricing of financial derivatives with non-translation invariant risk measures," Papers 2107.11340, arXiv.org.
    7. Fischer, Thomas G., 2018. "Reinforcement learning in financial markets - a survey," FAU Discussion Papers in Economics 12/2018, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    8. Greg Lewis & Vasilis Syrgkanis, 2018. "Adversarial Generalized Method of Moments," Papers 1803.07164, arXiv.org, revised Apr 2018.
    9. Charl Maree & Christian W. Omlin, 2022. "Balancing Profit, Risk, and Sustainability for Portfolio Management," Papers 2207.02134, arXiv.org.
    10. Márton Gosztonyi & Csákné Filep Judit, 2022. "Profiling (Non-)Nascent Entrepreneurs in Hungary Based on Machine Learning Approaches," Sustainability, MDPI, vol. 14(6), pages 1-20, March.
    11. Emerson Melo, 2021. "Learning in Random Utility Models Via Online Decision Problems," Papers 2112.10993, arXiv.org, revised Aug 2022.
    12. Mei-Li Shen & Cheng-Feng Lee & Hsiou-Hsiang Liu & Po-Yin Chang & Cheng-Hong Yang, 2021. "An Effective Hybrid Approach for Forecasting Currency Exchange Rates," Sustainability, MDPI, vol. 13(5), pages 1-29, March.
    13. Simina Brânzei, 2019. "Tit-for-Tat Dynamics and Market Volatility," Papers 1911.03629, arXiv.org, revised Jan 2024.
    14. Guillaume Coqueret & Tony Guida, 2020. "Training trees on tails with applications to portfolio choice," Post-Print hal-04144665, HAL.
    15. Martino Banchio & Giacomo Mantegazza, 2022. "Artificial Intelligence and Spontaneous Collusion," Papers 2202.05946, arXiv.org, revised Sep 2023.
    16. Miquel Noguer i Alonso & Sonam Srivastava, 2020. "Deep Reinforcement Learning for Asset Allocation in US Equities," Papers 2010.04404, arXiv.org.
    17. Mengying Zhu & Xiaolin Zheng & Yan Wang & Yuyuan Li & Qianqiao Liang, 2019. "Adaptive Portfolio by Solving Multi-armed Bandit via Thompson Sampling," Papers 1911.05309, arXiv.org, revised Nov 2019.
    18. Mannor, Shie & Shimkin, Nahum, 2008. "Regret minimization in repeated matrix games with variable stage duration," Games and Economic Behavior, Elsevier, vol. 63(1), pages 227-258, May.
    19. Hanyu Li & Wenhan Huang & Zhijian Duan & David Henry Mguni & Kun Shao & Jun Wang & Xiaotie Deng, 2023. "A survey on algorithms for Nash equilibria in finite normal-form games," Papers 2312.11063, arXiv.org.
    20. Josef Hofbauer & Sylvain Sorin & Yannick Viossat, 2009. "Time Average Replicator and Best Reply Dynamics," Post-Print hal-00360767, HAL.
