Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization

Author

Xiaodong Li
Pangjing Wu
Chenxin Zou
Qing Li

Abstract

Designing an intelligent volume-weighted average price (VWAP) strategy is a critical concern for brokers, since traditional rule-based strategies are relatively static and cannot achieve lower transaction costs in a dynamic market. Many studies have tried to minimize the cost via reinforcement learning, but improvements have hit bottlenecks, especially for long-duration strategies such as VWAP. To address this issue, we propose a joint deep learning and hierarchical reinforcement learning architecture, termed Macro-Meta-Micro Trader (M3T), that captures market patterns and executes orders at different temporal scales. The Macro Trader first allocates a parent order into tranches based on volume profiles, as the traditional VWAP strategy does, but uses a long short-term memory neural network to improve forecasting accuracy. The Meta Trader then selects, within each tranche, a short-term subgoal appropriate to the instantaneous liquidity, forming a mini-tranche. Finally, the Micro Trader extracts the instantaneous market state and fulfils the subgoal at the lowest transaction cost. Our experiments on stocks listed on the Shanghai Stock Exchange demonstrate that our approach outperforms the baselines in terms of VWAP slippage, with an average cost saving of 1.16 basis points compared with the best-performing baseline.
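
The two quantities the abstract relies on, a volume-profile allocation of the parent order and VWAP slippage in basis points, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (allocate_tranches, vwap, slippage_bps) are hypothetical, the forecast volume profile is simply passed in as a list rather than produced by an LSTM, and the sign convention (positive slippage means a cost) is an assumption.

```python
from typing import List, Sequence


def allocate_tranches(parent_qty: int, volume_profile: Sequence[float]) -> List[int]:
    """Split a parent order across intervals in proportion to a forecast volume profile.

    Hypothetical helper: the profile is assumed to be given (e.g. by any forecaster).
    """
    total = sum(volume_profile)
    if total <= 0:
        raise ValueError("volume profile must have positive total volume")
    raw = [parent_qty * v / total for v in volume_profile]
    tranches = [int(x) for x in raw]  # floor each proportional share
    # Hand the shares lost to flooring to the intervals with the largest remainders,
    # so the tranches always sum to the parent quantity.
    leftover = parent_qty - sum(tranches)
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - tranches[i], reverse=True)
    for i in by_remainder[:leftover]:
        tranches[i] += 1
    return tranches


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price: sum(p * v) / sum(v)."""
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


def slippage_bps(exec_vwap: float, market_vwap: float, side: str = "buy") -> float:
    """Slippage of the execution VWAP against the market VWAP, in basis points.

    Assumed convention: positive means the execution was worse than the market
    (paid more on a buy, received less on a sell).
    """
    sign = 1.0 if side == "buy" else -1.0
    return sign * (exec_vwap - market_vwap) / market_vwap * 1e4


if __name__ == "__main__":
    profile = [200, 500, 300]  # illustrative forecast volumes per interval
    print(allocate_tranches(1000, profile))
    print(vwap([10.0, 11.0], [100, 300]))
    print(slippage_bps(10.01, 10.0, side="buy"))
```

Under this convention, the reported saving of 1.16 basis points corresponds to a slippage that is lower by 1.16 than that of the best baseline, on average.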

Suggested Citation

Xiaodong Li & Pangjing Wu & Chenxin Zou & Qing Li, 2022. "Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization," Papers 2212.14670, arXiv.org.

Handle: RePEc:arx:papers:2212.14670

Citations

Cited by:

Remi Genet, 2025. "Deep Learning for VWAP Execution in Crypto Markets: Beyond the Volume Curve," Papers 2502.13722, arXiv.org, revised Apr 2025.
Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Christopher Kath & Florian Ziel, 2020. "Optimal Order Execution in Intraday Markets: Minimizing Costs in Trade Trajectories," Papers 2009.07892, arXiv.org, revised Oct 2020.
Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
Feiyang Pan & Tongzhe Zhang & Ling Luo & Jia He & Shuoling Liu, 2022. "Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution," Papers 2207.11152, arXiv.org.
Soohan Kim & Jimyeong Kim & Hong Kee Sul & Youngjoon Hong, 2023. "An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution," Papers 2307.10649, arXiv.org.
Enzo Busseti & Stephen Boyd, 2015. "Volume Weighted Average Price Optimal Execution," Papers 1509.08503, arXiv.org.
Yuchen Fang & Kan Ren & Weiqing Liu & Dong Zhou & Weinan Zhang & Jiang Bian & Yong Yu & Tie-Yan Liu, 2021. "Universal Trading for Order Execution with Oracle Policy Distillation," Papers 2103.10860, arXiv.org.
Curatola, Giuliano, 2022. "Price impact, strategic interaction and portfolio choice," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
Olivier Guéant & Charles-Albert Lehalle, 2015. "General Intensity Shapes In Optimal Liquidation," Mathematical Finance, Wiley Blackwell, vol. 25(3), pages 457-495, July.
- Olivier Guéant & Charles-Albert Lehalle, 2012. "General Intensity Shapes in Optimal Liquidation," Papers 1204.0148, arXiv.org, revised Jun 2013.
Samuel N. Cohen & Lukasz Szpruch, 2011. "A limit order book model for latency arbitrage," Papers 1110.4811, arXiv.org.
Somayeh Moazeni & Thomas F. Coleman & Yuying Li, 2016. "Smoothing and parametric rules for stochastic mean-CVaR optimal execution strategy," Annals of Operations Research, Springer, vol. 237(1), pages 99-120, February.
Woo Jae Byun & Bumkyu Choi & Seongmin Kim & Joohyun Jo, 2023. "Practical Application of Deep Reinforcement Learning to Optimal Trade Execution," FinTech, MDPI, vol. 2(3), pages 1-16, June.
Steven L. Heston & Robert A. Korajczyk & Ronnie Sadka, 2010. "Intraday Patterns in the Cross‐section of Stock Returns," Journal of Finance, American Finance Association, vol. 65(4), pages 1369-1407, August.
- Steven L. Heston & Robert A. Korajczyk & Ronnie Sadka, 2010. "Intraday Patterns in the Cross-section of Stock Returns," Papers 1005.3535, arXiv.org.
Oehmke, Martin, 2014. "Liquidating illiquid collateral," Journal of Economic Theory, Elsevier, vol. 149(C), pages 183-210.
Olivier Guéant & Guillaume Royer, 2013. "VWAP execution and guaranteed VWAP," Papers 1306.2832, arXiv.org, revised May 2014.
Griese, Knut & Kempf, Alexander, 2005. "Liquiditätsdynamik am deutschen Aktienmarkt," CFR Working Papers 05-12, University of Cologne, Centre for Financial Research (CFR).
Frey, Stefan & Sandås, Patrik, 2009. "The impact of iceberg orders in limit order books," CFR Working Papers 09-06, University of Cologne, Centre for Financial Research (CFR).
Takashi Kato, 2014. "VWAP Execution as an Optimal Strategy," Papers 1408.6118, arXiv.org, revised Jan 2017.
Ye Xunyu & Yan Rui & Li Handong, 2014. "Forecasting trading volume in the Chinese stock market based on the dynamic VWAP," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 18(2), pages 125-144, April.
Olivier Guéant & Royer Guillaume, 2014. "VWAP execution and guaranteed VWAP," Post-Print hal-01393121, HAL.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-BIG-2023-01-23 (Big Data)
NEP-CMP-2023-01-23 (Computational Economics)
