Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution

Author

Listed:

Feiyang Pan
Tongzhe Zhang
Ling Luo
Jia He
Shuoling Liu

Abstract

Optimal execution is a sequential decision-making problem for cost-saving in algorithmic trading. Studies have found that reinforcement learning (RL) can help decide the order-splitting sizes. However, a problem remains unsolved: how to place limit orders at appropriate limit prices? The key challenge lies in the "continuous-discrete duality" of the action space. On the one hand, the continuous action space using percentage changes in prices is preferred for generalization. On the other hand, the trader eventually needs to choose limit prices discretely due to the existence of the tick size, which requires specialization for every single stock with different characteristics (e.g., the liquidity and the price range). So we need continuous control for generalization and discrete control for specialization. To this end, we propose a hybrid RL method to combine the advantages of both of them. We first use a continuous control agent to scope an action subset, then deploy a fine-grained agent to choose a specific limit price. Extensive experiments show that our method has higher sample efficiency and better training stability than existing RL algorithms and significantly outperforms previous learning-based methods for order execution.
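The two-stage idea in the abstract (a continuous agent scopes an action subset, then a fine-grained agent picks one limit price) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `width` parameter, and the hand-written scores are hypothetical stand-ins for learned policy outputs, and the window of neighbouring ticks is only one assumed way to form the subset.

```python
from decimal import Decimal, ROUND_HALF_UP


def candidate_prices(ref_price, pct_offset, tick_size, width=2):
    """Stage 1 (continuous): turn a percentage offset into a tick-aligned subset.

    ref_price  -- reference price, e.g. the current mid price (assumption)
    pct_offset -- continuous action, a fractional change such as -0.002
    tick_size  -- minimum price increment of the stock
    width      -- hypothetical number of ticks kept on each side of the target
    """
    ref = Decimal(str(ref_price))
    tick = Decimal(str(tick_size))
    target = ref * (Decimal("1") + Decimal(str(pct_offset)))
    # Index of the tick nearest to the continuous target price.
    center = (target / tick).to_integral_value(rounding=ROUND_HALF_UP)
    # Keep only strictly positive prices inside the window.
    return [float((center + k) * tick)
            for k in range(-width, width + 1) if center + k > 0]


def choose_limit_price(candidates, scores):
    """Stage 2 (discrete): pick the candidate with the highest score.

    In a trained system the scores would come from a learned value or
    policy network; here they are supplied by the caller.
    """
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


if __name__ == "__main__":
    subset = candidate_prices(10.00, -0.002, 0.01, width=1)
    print(subset)
    print(choose_limit_price(subset, [0.1, 0.7, 0.2]))
```

Because the continuous action is expressed as a percentage, the same first stage can be reused across stocks with different price ranges, while the second stage only ever sees a handful of tick-valid prices for the stock at hand.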

Suggested Citation

Feiyang Pan & Tongzhe Zhang & Ling Luo & Jia He & Shuoling Liu, 2022. "Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution," Papers 2207.11152, arXiv.org.

Handle: RePEc:arx:papers:2207.11152

References listed on IDEAS

Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
Dieter Hendricks & Diane Wilcox, 2014. "A reinforcement learning extension to the Almgren-Chriss model for optimal trade execution," Papers 1403.2229, arXiv.org.
Bertsimas, Dimitris & Lo, Andrew W., 1998. "Optimal control of execution costs," Journal of Financial Markets, Elsevier, vol. 1(1), pages 1-50, April.

Citations

Cited by:

Dapeng Li & Feiyang Pan & Jia He & Zhiwei Xu & Dandan Tu & Guoliang Fan, 2023. "Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning," Papers 2303.11716, arXiv.org.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
Xiaodong Li & Pangjing Wu & Chenxin Zou & Qing Li, 2022. "Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization," Papers 2212.14670, arXiv.org.
Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.
Schnaubelt, Matthias, 2022. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," European Journal of Operational Research, Elsevier, vol. 296(3), pages 993-1006.
Dieter Hendricks, 2016. "Using real-time cluster configurations of streaming asynchronous features as online state descriptors in financial markets," Papers 1603.06805, arXiv.org, revised May 2017.
Söhnke M. Bartram & Jürgen Branke & Mehrshad Motahari, 2020. "Artificial intelligence in asset management," Working Papers 20202001, Cambridge Judge Business School, University of Cambridge.
- Bartram, Söhnke & Branke, Jürgen & Motahari, Mehrshad, 2020. "Artificial Intelligence in Asset Management," CEPR Discussion Papers 14525, C.E.P.R. Discussion Papers.
Dicks, Matthew & Paskaramoorthy, Andrew & Gebbie, Tim, 2024. "A simple learning agent interacting with an agent-based market model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 633(C).
Soohan Kim & Jimyeong Kim & Hong Kee Sul & Youngjoon Hong, 2023. "An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution," Papers 2307.10649, arXiv.org.
Schnaubelt, Matthias, 2020. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," FAU Discussion Papers in Economics 05/2020, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
Bokai Cao & Saizhuo Wang & Xinyi Lin & Xiaojun Wu & Haohan Zhang & Lionel M. Ni & Jian Guo, 2025. "From Deep Learning to LLMs: A survey of AI in Quantitative Investment," Papers 2503.21422, arXiv.org.
Gianbiagio Curato & Jim Gatheral & Fabrizio Lillo, 2014. "Optimal execution with nonlinear transient market impact," Papers 1412.4839, arXiv.org.
Curatola, Giuliano, 2022. "Price impact, strategic interaction and portfolio choice," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
Hong, Harrison & Rady, Sven, 2002. "Strategic trading and learning about liquidity," Journal of Financial Markets, Elsevier, vol. 5(4), pages 419-450, October.
- Hong, Harrison & Rady, Sven, 2000. "Strategic trading and learning about liquidity," LSE Research Online Documents on Economics 119102, London School of Economics and Political Science, LSE Library.
- Rady, Sven & Hong, Harrison G, 2000. "Strategic Trading And Learning About Liquidity," CEPR Discussion Papers 2416, C.E.P.R. Discussion Papers.
- Harrison Hong & Sven Rady, 2000. "Strategic Trading and Learning about Liquidity," Econometric Society World Congress 2000 Contributed Papers 1351, Econometric Society.
- Hong, Harrison & Rady, Sven, 2001. "Strategic Trading and Learning about Liquidity," Discussion Papers in Economics 15, University of Munich, Department of Economics.
- Harrison Hong & Sven Rady, 2000. "Strategic Trading and Learning About Liquidity," FMG Discussion Papers dp356, Financial Markets Group.
Olivier Guéant & Charles-Albert Lehalle, 2015. "General Intensity Shapes In Optimal Liquidation," Mathematical Finance, Wiley Blackwell, vol. 25(3), pages 457-495, July.
- Olivier Guéant & Charles-Albert Lehalle, 2012. "General Intensity Shapes in Optimal Liquidation," Papers 1204.0148, arXiv.org, revised Jun 2013.
Gniadkowska-Szymańska Agata, 2017. "The impact of trading liquidity on the rate of return on emerging markets: the example of Poland and the Baltic countries," Financial Internet Quarterly (formerly e-Finanse), Sciendo, vol. 13(4), pages 136-148, December.
Konishi, Hizuru, 2002. "Optimal slice of a VWAP trade," Journal of Financial Markets, Elsevier, vol. 5(2), pages 197-221, April.
Claudio Bellani & Damiano Brigo, 2021. "Mechanics of good trade execution in the framework of linear temporary market impact," Quantitative Finance, Taylor & Francis Journals, vol. 21(1), pages 143-163, January.
- Claudio Bellani & Damiano Brigo, 2019. "Mechanics of good trade execution in the framework of linear temporary market impact," Papers 1909.10464, arXiv.org, revised Jul 2020.
Schoeneborn, Torsten & Schied, Alexander, 2007. "Liquidation in the Face of Adversity: Stealth Vs. Sunshine Trading, Predatory Trading Vs. Liquidity Provision," MPRA Paper 5548, University Library of Munich, Germany.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2022-08-29 (Computational Economics)
