Printed from https://ideas.repec.org/p/arx/papers/2207.11152.html

Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution

Author

Listed:
  • Feiyang Pan
  • Tongzhe Zhang
  • Ling Luo
  • Jia He
  • Shuoling Liu

Abstract

Optimal execution is a sequential decision-making problem for cost-saving in algorithmic trading. Studies have found that reinforcement learning (RL) can help decide the order-splitting sizes. However, one problem remains unsolved: how to place limit orders at appropriate limit prices. The key challenge lies in the "continuous-discrete duality" of the action space. On the one hand, a continuous action space based on percentage changes in prices is preferred for generalization. On the other hand, the trader eventually needs to choose limit prices discretely because of the tick size, which requires specialization for every single stock with different characteristics (e.g., the liquidity and the price range). We therefore need continuous control for generalization and discrete control for specialization. To this end, we propose a hybrid RL method that combines the advantages of both. We first use a continuous control agent to scope an action subset, then deploy a fine-grained agent to choose a specific limit price. Extensive experiments show that our method has higher sample efficiency and better training stability than existing RL algorithms and significantly outperforms previous learning-based methods for order execution.
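
The two-stage selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the networks, the state features, or how the action subset is constructed. The function names, the `half_width` parameter, and the two stand-in policies below are all hypothetical, chosen only to show how a continuous percentage action can be narrowed to a few tick-aligned prices before a discrete choice is made.

```python
def candidate_prices(ref_price, pct_change, tick_size, half_width):
    """Turn a continuous action into a small set of tick-aligned limit prices.

    Assumption: the subset is a symmetric window of ticks around the price
    implied by the percentage change. The paper may scope it differently.
    """
    target = ref_price * (1.0 + pct_change)
    center_tick = round(target / tick_size)  # nearest valid tick index
    ticks = range(center_tick - half_width, center_tick + half_width + 1)
    # Rounding only removes floating-point noise from the tick multiples.
    return [round(t * tick_size, 10) for t in ticks if t > 0]


def choose_limit_price(ref_price, tick_size, coarse_policy, fine_policy,
                       half_width=2):
    """Stage 1 scopes the subset (continuous); stage 2 picks one price (discrete)."""
    pct_change = coarse_policy(ref_price)
    subset = candidate_prices(ref_price, pct_change, tick_size, half_width)
    index = fine_policy(subset)
    return subset[index]


# Hypothetical stand-ins for trained agents, used only to make the sketch run.
def coarse_policy(ref_price):
    return -0.001  # aim 0.1% below the reference price


def fine_policy(subset):
    return len(subset) // 2  # pick the middle candidate


price = choose_limit_price(10.00, 0.01, coarse_policy, fine_policy)
print(price)
```

Because the continuous stage works in percentage terms, the same coarse policy could in principle be shared across stocks with different price levels, while only the discrete stage has to deal with each stock's tick grid. Production code would more likely use integer tick counts or the `decimal` module rather than floats.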

Suggested Citation

  • Feiyang Pan & Tongzhe Zhang & Ling Luo & Jia He & Shuoling Liu, 2022. "Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution," Papers 2207.11152, arXiv.org.
  • Handle: RePEc:arx:papers:2207.11152

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2207.11152
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
    2. Dieter Hendricks & Diane Wilcox, 2014. "A reinforcement learning extension to the Almgren-Chriss model for optimal trade execution," Papers 1403.2229, arXiv.org.
    3. Bertsimas, Dimitris & Lo, Andrew W., 1998. "Optimal control of execution costs," Journal of Financial Markets, Elsevier, vol. 1(1), pages 1-50, April.

    Citations

    Cited by:

    1. Dapeng Li & Feiyang Pan & Jia He & Zhiwei Xu & Dandan Tu & Guoliang Fan, 2023. "Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning," Papers 2303.11716, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    2. Xiaodong Li & Pangjing Wu & Chenxin Zou & Qing Li, 2022. "Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization," Papers 2212.14670, arXiv.org.
    3. Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.
    4. Schnaubelt, Matthias, 2022. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," European Journal of Operational Research, Elsevier, vol. 296(3), pages 993-1006.
    5. Dieter Hendricks, 2016. "Using real-time cluster configurations of streaming asynchronous features as online state descriptors in financial markets," Papers 1603.06805, arXiv.org, revised May 2017.
    6. Söhnke M. Bartram & Jürgen Branke & Mehrshad Motahari, 2020. "Artificial intelligence in asset management," Working Papers 20202001, Cambridge Judge Business School, University of Cambridge.
    7. Soohan Kim & Jimyeong Kim & Hong Kee Sul & Youngjoon Hong, 2023. "An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution," Papers 2307.10649, arXiv.org.
    8. Schnaubelt, Matthias, 2020. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," FAU Discussion Papers in Economics 05/2020, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    9. Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
    10. Gianbiagio Curato & Jim Gatheral & Fabrizio Lillo, 2014. "Optimal execution with nonlinear transient market impact," Papers 1412.4839, arXiv.org.
    11. Curatola, Giuliano, 2022. "Price impact, strategic interaction and portfolio choice," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    12. Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
    13. Hong, Harrison & Rady, Sven, 2002. "Strategic trading and learning about liquidity," Journal of Financial Markets, Elsevier, vol. 5(4), pages 419-450, October.
    14. Olivier Guéant & Charles-Albert Lehalle, 2015. "General Intensity Shapes In Optimal Liquidation," Mathematical Finance, Wiley Blackwell, vol. 25(3), pages 457-495, July.
    15. Gniadkowska-Szymańska Agata, 2017. "The impact of trading liquidity on the rate of return on emerging markets: the example of Poland and the Baltic countries," Financial Internet Quarterly (formerly e-Finanse), Sciendo, vol. 13(4), pages 136-148, December.
    16. Konishi, Hizuru, 2002. "Optimal slice of a VWAP trade," Journal of Financial Markets, Elsevier, vol. 5(2), pages 197-221, April.
    17. Claudio Bellani & Damiano Brigo, 2021. "Mechanics of good trade execution in the framework of linear temporary market impact," Quantitative Finance, Taylor & Francis Journals, vol. 21(1), pages 143-163, January.
    18. Schoeneborn, Torsten & Schied, Alexander, 2007. "Liquidation in the Face of Adversity: Stealth Vs. Sunshine Trading, Predatory Trading Vs. Liquidity Provision," MPRA Paper 5548, University Library of Munich, Germany.
    19. Fengpei Li & Vitalii Ihnatiuk & Ryan Kinnear & Anderson Schneider & Yuriy Nevmyvaka, 2022. "Do price trajectory data increase the efficiency of market impact estimation?," Papers 2205.13423, arXiv.org, revised Mar 2023.
    20. Samuel N. Cohen & Lukasz Szpruch, 2011. "A limit order book model for latency arbitrage," Papers 1110.4811, arXiv.org.
