An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution

Author

Soohan Kim
Jimyeong Kim
Hong Kee Sul
Youngjoon Hong

Abstract

This research aims to devise a strategy that closely tracks the daily cumulative volume-weighted average price (VWAP) using reinforcement learning. Previous studies often implement their models over a relatively short trading horizon, which makes it difficult to track the daily cumulative VWAP accurately because financial data vary little within such a horizon. We propose a method that leverages the U-shaped pattern of intraday stock trade volumes and uses Proximal Policy Optimization (PPO) as the learning algorithm, with the goal of minimizing the deviation from the daily cumulative VWAP. Our method follows a dual-level approach: a Transformer model captures the overall (global) U-shaped distribution of daily volumes, and an LSTM model handles the distribution of orders within smaller (local) time intervals. Our experimental results suggest that this dual-level architecture approximates the cumulative VWAP more accurately than previous reinforcement learning-based models.
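To make the tracking target concrete, the following minimal sketch computes a running cumulative VWAP and measures how far an execution schedule's average price deviates from the market VWAP, in basis points. It uses only the standard VWAP definition, sum(price × volume) / sum(volume). The abstract gives no formulas, so the function names, the parabolic U-shaped profile, and the `edge_boost` parameter are illustrative assumptions. This is not the authors' PPO, Transformer, or LSTM model.

```python
from typing import List, Sequence


def cumulative_vwap(prices: Sequence[float], volumes: Sequence[float]) -> List[float]:
    """Running VWAP after each interval: sum(p_i * v_i) / sum(v_i)."""
    out: List[float] = []
    notional, total = 0.0, 0.0
    for p, v in zip(prices, volumes):
        notional += p * v
        total += v
        # VWAP is undefined until some volume has traded.
        out.append(notional / total if total > 0 else float("nan"))
    return out


def u_shaped_weights(n: int, edge_boost: float = 2.0) -> List[float]:
    """Hypothetical U-shaped allocation: a parabola peaking at open and close.

    Assumption for illustration only; the paper learns this shape from data.
    """
    if n < 2:
        raise ValueError("need at least two intervals")
    mid = (n - 1) / 2.0
    raw = [1.0 + edge_boost * ((i - mid) / mid) ** 2 for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def vwap_deviation_bps(
    prices: Sequence[float],
    market_volumes: Sequence[float],
    weights: Sequence[float],
    order_size: float = 1.0,
) -> float:
    """Deviation of the schedule's average fill price from market VWAP, in bps."""
    market_vwap = cumulative_vwap(prices, market_volumes)[-1]
    fills = [order_size * w for w in weights]
    exec_vwap = cumulative_vwap(prices, fills)[-1]
    return 1e4 * (exec_vwap - market_vwap) / market_vwap


# Usage: five intraday intervals with heavy volume at the open and close.
prices = [100.0, 101.0, 102.0, 101.0, 100.0]
market_volumes = [300.0, 150.0, 100.0, 150.0, 300.0]
weights = u_shaped_weights(len(prices))
print(vwap_deviation_bps(prices, market_volumes, weights))
```

A schedule whose weights are proportional to the realized market volumes tracks the VWAP exactly under this simplified model (no market impact, fills at the interval price). This is why predicting the intraday volume distribution is central to VWAP tracking.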

Suggested Citation

Soohan Kim & Jimyeong Kim & Hong Kee Sul & Youngjoon Hong, 2023. "An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution," Papers 2307.10649, arXiv.org.

Handle: RePEc:arx:papers:2307.10649

References listed on IDEAS

Bialkowski, Jedrzej & Darolles, Serge & Le Fol, Gaëlle, 2008. "Improving VWAP strategies: A dynamic volume approach," Journal of Banking & Finance, Elsevier, vol. 32(9), pages 1709-1722, September.
- Jedrzej Białkowski & Serge Darolles & Gaëlle Le Fol, 2006. "Improving VWAP strategies: A dynamical volume approach," Documents de recherche 06-08, Centre d'Études des Politiques Économiques (EPEE), Université d'Evry Val d'Essonne.
- Jedrzej Bialkowski & Serge Darolles & Gaëlle Le Fol, 2008. "Improving VWAP strategies: A dynamic volume approach," Post-Print halshs-00676946, HAL.
- Jędrzej Białkowski & Serge Darolles & Gaëlle Le Fol, 2008. "Improving VWAP strategies: A dynamic volume approach," Post-Print hal-02877984, HAL.
Brian Ning & Franco Ho Ting Lin & Sebastian Jaimungal, 2021. "Double Deep Q-Learning for Optimal Execution," Applied Mathematical Finance, Taylor & Francis Journals, vol. 28(4), pages 361-380, July.
Jain, Prem C. & Joh, Gun-Ho, 1988. "The Dependence between Hourly Prices and Trading Volume," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 23(3), pages 269-283, September.
Dieter Hendricks & Diane Wilcox, 2014. "A reinforcement learning extension to the Almgren-Chriss model for optimal trade execution," Papers 1403.2229, arXiv.org.
Bertsimas, Dimitris & Lo, Andrew W., 1998. "Optimal control of execution costs," Journal of Financial Markets, Elsevier, vol. 1(1), pages 1-50, April.
Berkowitz, Stephen A & Logue, Dennis E & Noser, Eugene A, Jr, 1988. "The Total Cost of Transactions on the NYSE," Journal of Finance, American Finance Association, vol. 43(1), pages 97-112, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.
Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
Xiaodong Li & Pangjing Wu & Chenxin Zou & Qing Li, 2022. "Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization," Papers 2212.14670, arXiv.org.
Konishi, Hizuru, 2002. "Optimal slice of a VWAP trade," Journal of Financial Markets, Elsevier, vol. 5(2), pages 197-221, April.
Steven L. Heston & Robert A. Korajczyk & Ronnie Sadka, 2010. "Intraday Patterns in the Cross‐section of Stock Returns," Journal of Finance, American Finance Association, vol. 65(4), pages 1369-1407, August.
- Steven L. Heston & Robert A. Korajczyk & Ronnie Sadka, 2010. "Intraday Patterns in the Cross-section of Stock Returns," Papers 1005.3535, arXiv.org.
Wei Cui & Anthony Brabazon & Michael O'Neill, 2011. "Dynamic trade execution: a grammatical evolution approach," International Journal of Financial Markets and Derivatives, Inderscience Enterprises Ltd, vol. 2(1/2), pages 4-31.
Choi, Jin Hyuk & Larsen, Kasper & Seppi, Duane J., 2019. "Information and trading targets in a dynamic market equilibrium," Journal of Financial Economics, Elsevier, vol. 132(3), pages 22-49.
Christopher Kath & Florian Ziel, 2020. "Optimal Order Execution in Intraday Markets: Minimizing Costs in Trade Trajectories," Papers 2009.07892, arXiv.org, revised Oct 2020.
Schnaubelt, Matthias, 2022. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," European Journal of Operational Research, Elsevier, vol. 296(3), pages 993-1006.
Du, Bian & Zhu, Hongliang & Zhao, Jingdong, 2016. "Optimal execution in high-frequency trading with Bayesian learning," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 461(C), pages 767-777.
Michael J. O'Neill & Geoffrey J. Warren, 2019. "Evaluating fund capacity: issues and methods," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 59(S1), pages 773-800, April.
Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
Jedrzej Bialkowski & Serge Darolles & Gaëlle Le Fol, 2012. "Reducing the risk of VWAP orders execution - A new approach to modeling intra-day volume," Post-Print hal-01632822, HAL.
Qing-Qing Yang & Wai-Ki Ching & Jia-Wen Gu & Tak-Kuen Siu, 2016. "Generalized Optimal Liquidation Problems Across Multiple Trading Venues," Papers 1607.04553, arXiv.org, revised Aug 2017.
Dieter Hendricks, 2016. "Using real-time cluster configurations of streaming asynchronous features as online state descriptors in financial markets," Papers 1603.06805, arXiv.org, revised May 2017.
Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
Söhnke M. Bartram & Jürgen Branke & Mehrshad Motahari, 2020. "Artificial intelligence in asset management," Working Papers 20202001, Cambridge Judge Business School, University of Cambridge.
- Bartram, Söhnke & Branke, Jürgen & Motahari, Mehrshad, 2020. "Artificial Intelligence in Asset Management," CEPR Discussion Papers 14525, C.E.P.R. Discussion Papers.
Mooradian, Robert M., 2010. "Illiquidity and Stock Returns," Review of Applied Economics, Lincoln University, Department of Financial and Business Systems, vol. 6(1-2), pages 1-19, April.
Feiyang Pan & Tongzhe Zhang & Ling Luo & Jia He & Shuoling Liu, 2022. "Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution," Papers 2207.11152, arXiv.org.
Enzo Busseti & Stephen Boyd, 2015. "Volume Weighted Average Price Optimal Execution," Papers 1509.08503, arXiv.org.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2023-08-28 (Computational Economics)
