
Benchmark Dataset for Mid-Price Forecasting of Limit Order Book Data with Machine Learning Methods

Author

Listed:
  • Adamantios Ntakaris
  • Martin Magris
  • Juho Kanniainen
  • Moncef Gabbouj
  • Alexandros Iosifidis

Abstract

Predicting metrics in high-frequency financial markets is a challenging task. One efficient approach is to monitor the dynamics of a limit order book to identify an information edge. This paper describes the first publicly available benchmark dataset of high-frequency limit order markets for mid-price prediction. We extracted normalized data representations of time series data for five stocks from the NASDAQ Nordic stock market over a period of ten consecutive days, yielding a dataset of ~4,000,000 time series samples in total. A day-based anchored cross-validation experimental protocol is also provided, which can be used as a benchmark for comparing the performance of state-of-the-art methodologies. The performance of baseline approaches is also reported to facilitate experimental comparisons. We expect that such a large-scale dataset can serve as a testbed for devising novel expert-system solutions for high-frequency limit order book data analysis.
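
As a rough illustration of what a day-based anchored cross-validation protocol might look like, the sketch below builds folds whose training window always starts at day 1 and grows by one day, with the following day held out for testing. This is one common reading of "anchored" (expanding-window) validation and is an assumption here, not a detail confirmed by the abstract. The function names `anchored_day_folds` and `mid_price` are hypothetical, and the mid-price is taken in its usual sense as the average of the best bid and best ask.

```python
from typing import Iterator, List, Tuple


def anchored_day_folds(n_days: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_days, test_days) pairs with a fixed (anchored) start day.

    Assumption: fold k trains on days 1..k and tests on day k + 1.
    """
    if n_days < 2:
        raise ValueError("need at least two days to form one fold")
    for k in range(1, n_days):
        train_days = list(range(1, k + 1))  # always starts at day 1
        test_days = [k + 1]                 # the single following day
        yield train_days, test_days


def mid_price(best_bid: float, best_ask: float) -> float:
    """Usual mid-price definition: average of best bid and best ask."""
    return (best_bid + best_ask) / 2.0


# Ten days, as in the dataset described in the abstract.
folds = list(anchored_day_folds(10))
for train_days, test_days in folds:
    print(f"train on days {train_days}, test on day {test_days[0]}")
```

Splitting by whole days rather than by random samples keeps every test sample strictly later in time than the data used for training, which avoids look-ahead leakage in time series evaluation.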

Suggested Citation

  • Adamantios Ntakaris & Martin Magris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2017. "Benchmark Dataset for Mid-Price Forecasting of Limit Order Book Data with Machine Learning Methods," Papers 1705.03233, arXiv.org, revised Mar 2020.
  • Handle: RePEc:arx:papers:1705.03233

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1705.03233
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Seddon, Jonathan J.J.M. & Currie, Wendy L., 2017. "A model for unpacking big data analytics in high-frequency trading," Journal of Business Research, Elsevier, vol. 70(C), pages 300-307.
    2. Jonas Hallgren & Timo Koski, 2016. "Testing for Causality in Continuous Time Bayesian Network Models of High-Frequency Data," Papers 1601.06651, arXiv.org.
    3. Martin D. Gould & Mason A. Porter & Stacy Williams & Mark McDonald & Daniel J. Fenn & Sam D. Howison, 2010. "Limit Order Books," Papers 1012.0349, arXiv.org, revised Apr 2013.
    4. Carrion, Allen, 2013. "Very fast money: High-frequency trading on the NASDAQ," Journal of Financial Markets, Elsevier, vol. 16(4), pages 680-711.
    5. Alec N. Kercheval & Yuan Zhang, 2015. "Modelling high-frequency limit order book dynamics with support vector machines," Quantitative Finance, Taylor & Francis Journals, vol. 15(8), pages 1315-1329, August.
    6. Charles Cao & Oliver Hansch & Xiaoxin Wang, 2009. "The information content of an open limit‐order book," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 29(1), pages 16-41, January.
    7. Naes, Randi & Skjeltorp, Johannes A., 2006. "Order book characteristics and the volume-volatility relation: Empirical evidence from a limit order market," Journal of Financial Markets, Elsevier, vol. 9(4), pages 408-432, November.
    8. Martin D. Gould & Mason A. Porter & Stacy Williams & Mark McDonald & Daniel J. Fenn & Sam D. Howison, 2013. "Limit order books," Quantitative Finance, Taylor & Francis Journals, vol. 13(11), pages 1709-1742, November.
    9. Siikanen, Milla & Kanniainen, Juho & Valli, Jaakko, 2017. "Limit order books and liquidity around scheduled and non-scheduled announcements: Empirical evidence from NASDAQ Nordic," Finance Research Letters, Elsevier, vol. 21(C), pages 264-271.
    10. Cenesizoglu, Tolga & Dionne, Georges & Zhou, Xiaozhou, 2014. "Effects of the Limit Order Book on Price Dynamics," Working Papers 14-5, HEC Montreal, Canada Research Chair in Risk Management.
    11. Jonathan Brogaard & Terrence Hendershott & Ryan Riordan, 2014. "High-Frequency Trading and Price Discovery," The Review of Financial Studies, Society for Financial Studies, vol. 27(8), pages 2267-2306.
    12. Hasbrouck, Joel & Saar, Gideon, 2013. "Low-latency trading," Journal of Financial Markets, Elsevier, vol. 16(4), pages 646-679.
    13. Azeem Malik & Wing Lon Ng, 2014. "Intraday liquidity patterns in limit order books," Studies in Economics and Finance, Emerald Group Publishing Limited, vol. 31(1), pages 46-71, February.
    14. Mankad, Shawn & Michailidis, George, 2013. "Discovering the ecosystem of an electronic financial market with a dynamic machine-learning method," Algorithmic Finance, IOS Press, vol. 2(2), pages 151-165.
    15. Abhijit Sharang & Chetan Rao, 2015. "Using machine learning for medium frequency derivative portfolio trading," Papers 1512.06228, arXiv.org.
    16. Ban Zheng & Eric Moulines & Frédéric Abergel, 2012. "Price Jump Prediction in Limit Order Book," Papers 1204.1381, arXiv.org.
    17. Ranaldo, Angelo, 2004. "Order aggressiveness in limit order book markets," Journal of Financial Markets, Elsevier, vol. 7(1), pages 53-74, January.
    18. Pai, Ping-Feng & Lin, Chih-Sheng, 2005. "A hybrid ARIMA and support vector machines model in stock price forecasting," Omega, Elsevier, vol. 33(6), pages 497-505, December.
    19. Marco Avellaneda & Sasha Stoikov, 2008. "High-frequency trading in a limit order book," Quantitative Finance, Taylor & Francis Journals, vol. 8(3), pages 217-224.
    20. Siikanen, Milla & Kanniainen, Juho & Luoma, Arto, 2017. "What drives the sensitivity of limit order books to company announcement arrivals?," Economics Letters, Elsevier, vol. 159(C), pages 65-68.
    21. Jonathan J.J.M. Seddon & Wendy L. Currie, 2017. "A model for unpacking big data analytics in high-frequency trading," Post-Print hal-01404316, HAL.
    22. Nicholas T. Chan & Christian Shelton, 2001. "An Adaptive Electronic Market-Maker," Computing in Economics and Finance 2001 146, Society for Computational Economics.
    23. Justin Sirignano, 2016. "Deep Learning for Limit Order Books," Papers 1601.01987, arXiv.org, revised Jul 2016.
    24. O'Hara, Maureen & Ye, Mao, 2011. "Is market fragmentation harming market quality?," Journal of Financial Economics, Elsevier, vol. 100(3), pages 459-474, June.
    25. Sirio Aramonte & Samuel Rosen & John W. Schindler, 2017. "Assessing and Combining Financial Conditions Indexes," International Journal of Central Banking, International Journal of Central Banking, vol. 13(1), pages 1-52, February.
    26. Steve Y. Yang & Qifeng Qiao & Peter A. Beling & William T. Scherer & Andrei A. Kirilenko, 2015. "Gaussian process-based algorithmic trading strategy identification," Quantitative Finance, Taylor & Francis Journals, vol. 15(10), pages 1683-1703, October.

    Citations


    Cited by:

    1. Adamantios Ntakaris & Giorgio Mirone & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2019. "Feature Engineering for Mid-Price Prediction with Deep Learning," Papers 1904.05384, arXiv.org, revised Jun 2019.
    2. Martin Magris & Jiyeong Kim & Esa Rasanen & Juho Kanniainen, 2017. "Long-range Auto-correlations in Limit Order Book Markets: Inter- and Cross-event Analysis," Papers 1711.03534, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pankaj Kumar, 2021. "Deep Hawkes Process for High-Frequency Market Making," Papers 2109.15110, arXiv.org.
    2. Dat Thanh Tran & Martin Magris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2017. "Tensor Representation in High-Frequency Financial Data for Price Change Prediction," Papers 1709.01268, arXiv.org, revised Nov 2017.
    3. Manahov, Viktor & Hudson, Robert & Gebka, Bartosz, 2014. "Does high frequency trading affect technical analysis and market efficiency? And if so, how?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 28(C), pages 131-157.
    4. Li, Zhicheng & Chen, Xinyun & Xing, Haipeng, 2023. "A multifactor regime-switching model for inter-trade durations in the high-frequency limit order market," Economic Modelling, Elsevier, vol. 118(C).
    5. Ligot, Stephanie & Gillet, Roland & Veryzhenko, Iryna, 2021. "Intraday volatility smile: Effects of fragmentation and high frequency trading on price efficiency," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 75(C).
    6. Thomas Spooner & Rahul Savani, 2020. "Robust Market Making via Adversarial Reinforcement Learning," Papers 2003.01820, arXiv.org, revised Jul 2020.
    7. Ibikunle, Gbenga, 2018. "Trading places: Price leadership and the competition for order flow," Journal of Empirical Finance, Elsevier, vol. 49(C), pages 178-200.
    8. Zihao Zhang & Stefan Zohren & Stephen Roberts, 2018. "DeepLOB: Deep Convolutional Neural Networks for Limit Order Books," Papers 1808.03668, arXiv.org, revised Jan 2020.
    9. Breedon, Francis & Chen, Louisa & Ranaldo, Angelo & Vause, Nicholas, 2023. "Judgment day: Algorithmic trading around the Swiss franc cap removal," Journal of International Economics, Elsevier, vol. 140(C).
    10. Schnaubelt, Matthias, 2020. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," FAU Discussion Papers in Economics 05/2020, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    11. Ningyuan Chen & Steven Kou & Chun Wang, 2018. "A Partitioning Algorithm for Markov Decision Processes with Applications to Market Microstructure," Management Science, INFORMS, vol. 64(2), pages 784-803, February.
    12. Schnaubelt, Matthias, 2022. "Deep reinforcement learning for the optimal placement of cryptocurrency limit orders," European Journal of Operational Research, Elsevier, vol. 296(3), pages 993-1006.
    13. Abbas Haider & Hui Wang & Bryan Scotney & Glenn Hawe, 2022. "Predictive Market Making via Machine Learning," SN Operations Research Forum, Springer, vol. 3(1), pages 1-21, March.
    14. Claudio Altafini, 2016. "The Geometric Phase of Stock Trading," PLOS ONE, Public Library of Science, vol. 11(8), pages 1-13, August.
    15. Benjamin Clapham & Martin Haferkorn & Kai Zimmermann, 2023. "The Impact of High-Frequency Trading on Modern Securities Markets," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 65(1), pages 7-24, February.
    16. Thomas Spooner & John Fearnley & Rahul Savani & Andreas Koukorinis, 2018. "Market Making via Reinforcement Learning," Papers 1804.04216, arXiv.org.
    17. Cox, Justin & Woods, Donovan, 2023. "COVID-19 and market structure dynamics," Journal of Banking & Finance, Elsevier, vol. 147(C).
    18. Adamantios Ntakaris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2019. "Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators," Papers 1907.09452, arXiv.org.
    19. Dodd, Olga & Frijns, Bart & Indriawan, Ivan & Pascual, Roberto, 2023. "US cross-listing and domestic high-frequency trading: Evidence from Canadian stocks," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 301-320.
    20. Zhicheng Li & Haipeng Xing & Xinyun Chen, 2019. "A multifactor regime-switching model for inter-trade durations in the limit order market," Papers 1912.00764, arXiv.org.
