Feature Engineering for Mid-Price Prediction with Deep Learning

Authors

  • Adamantios Ntakaris
  • Giorgio Mirone
  • Juho Kanniainen
  • Moncef Gabbouj
  • Alexandros Iosifidis

Abstract

Predicting mid-price movements from limit order book (LOB) data is challenging because of the complexity and dynamics of the LOB. So far, there have been very few attempts to extract relevant features from LOB data. In this paper, we address this problem by designing a new set of handcrafted features and performing an extensive experimental evaluation on both liquid and illiquid stocks. More specifically, we implement a new set of econometric features that capture statistical properties of the underlying securities for the task of mid-price prediction. Moreover, we develop a new experimental protocol for online learning that treats the task as a multi-objective optimization problem and predicts (i) the direction of the next price movement and (ii) the number of order book events that occur until the change takes place. To predict the mid-price movement, the features are fed into nine different deep learning models based on multi-layer perceptrons (MLP), convolutional neural networks (CNN), and long short-term memory (LSTM) neural networks. The performance of the proposed method is then evaluated on liquid and illiquid stocks, drawn from TotalView-ITCH US and Nordic stocks, respectively. For some stocks, the results suggest that the right choice of feature set and model can lead to successful prediction of how long it takes for the stock price to move.
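
The two prediction targets named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' protocol: it assumes the mid-price is the average of the best bid and best ask at each order book event, and the function names `mid_prices` and `next_move_labels` are hypothetical. For every event it returns the direction of the next mid-price change (+1 up, -1 down) and the number of events until that change occurs.

```python
from typing import List, Optional, Tuple


def mid_prices(best_bid: List[float], best_ask: List[float]) -> List[float]:
    """Mid-price per order book event: average of best bid and best ask."""
    return [(b + a) / 2.0 for b, a in zip(best_bid, best_ask)]


def next_move_labels(mid: List[float]) -> List[Optional[Tuple[int, int]]]:
    """For each event t, return (direction, events_until_change).

    direction is +1 if the next mid-price change is upward, -1 if downward.
    events_until_change counts events from t to the first event whose
    mid-price differs from mid[t]. Events with no later change get None.
    """
    n = len(mid)
    labels: List[Optional[Tuple[int, int]]] = [None] * n
    next_dir: Optional[int] = None   # direction of the nearest future change
    next_idx: Optional[int] = None   # event index at which that change occurs
    # Scan backwards so each event can reuse the answer of its successor.
    for t in range(n - 2, -1, -1):
        if mid[t + 1] != mid[t]:
            next_dir = 1 if mid[t + 1] > mid[t] else -1
            next_idx = t + 1
        # Otherwise mid[t] == mid[t + 1], so the next change is the same one.
        if next_dir is not None and next_idx is not None:
            labels[t] = (next_dir, next_idx - t)
    return labels


# Toy data (hypothetical prices, one entry per order book event).
bid = [100, 100, 101, 101, 100]
ask = [102, 102, 103, 103, 102]
mid = mid_prices(bid, ask)
labels = next_move_labels(mid)
print(mid)
print(labels)
```

In a setup like the one the abstract describes, the first element of each pair would serve as a classification target and the second as a regression target, learned jointly.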

Suggested Citation

  • Adamantios Ntakaris & Giorgio Mirone & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2019. "Feature Engineering for Mid-Price Prediction with Deep Learning," Papers 1904.05384, arXiv.org, revised Jun 2019.
  • Handle: RePEc:arx:papers:1904.05384
    Download full text from publisher

    File URL: http://arxiv.org/pdf/1904.05384
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Jacod, Jean & Li, Yingying & Mykland, Per A. & Podolskij, Mark & Vetter, Mathias, 2009. "Microstructure noise in the continuous case: The pre-averaging approach," Stochastic Processes and their Applications, Elsevier, vol. 119(7), pages 2249-2276, July.
    2. Dat Thanh Tran & Martin Magris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2017. "Tensor Representation in High-Frequency Financial Data for Price Change Prediction," Papers 1709.01268, arXiv.org, revised Nov 2017.
    3. Christensen, Kim & Oomen, Roel C.A. & Podolskij, Mark, 2014. "Fact or friction: Jumps at ultra high frequency," Journal of Financial Economics, Elsevier, vol. 114(3), pages 576-599.
    4. Ole E. Barndorff-Nielsen & Neil Shephard, 2006. "Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation," Journal of Financial Econometrics, Oxford University Press, vol. 4(1), pages 1-30.
    5. Ole E. Barndorff-Nielsen, 2004. "Power and Bipower Variation with Stochastic Volatility and Jumps," Journal of Financial Econometrics, Oxford University Press, vol. 2(1), pages 1-37.
    6. Avraam Tsantekidis & Nikolaos Passalis & Anastasios Tefas & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2018. "Using Deep Learning for price prediction by exploiting stationary limit order book features," Papers 1810.09965, arXiv.org.
    7. Martin Lettau & Sydney Ludvigson, 2001. "Consumption, Aggregate Wealth, and Expected Stock Returns," Journal of Finance, American Finance Association, vol. 56(3), pages 815-849, June.
    8. Ole E. Barndorff‐Nielsen & Neil Shephard, 2002. "Econometric analysis of realized volatility and its use in estimating stochastic volatility models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(2), pages 253-280, May.
    9. Justin Sirignano & Rama Cont, 2018. "Universal features of price formation in financial markets: perspectives from Deep Learning," Papers 1803.06917, arXiv.org.
    10. Alec N. Kercheval & Yuan Zhang, 2015. "Modelling high-frequency limit order book dynamics with support vector machines," Quantitative Finance, Taylor & Francis Journals, vol. 15(8), pages 1315-1329, August.
    11. Guo, Hui, 2004. "Limited Stock Market Participation and Asset Prices in a Dynamic Economy," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 39(3), pages 495-516, September.
    12. Brownlees, C.T. & Gallo, G.M., 2006. "Financial econometric analysis at ultra-high frequency: Data handling concerns," Computational Statistics & Data Analysis, Elsevier, vol. 51(4), pages 2232-2245, December.
    13. Ban Zheng & Eric Moulines & Frédéric Abergel, 2012. "Price Jump Prediction in Limit Order Book," Papers 1204.1381, arXiv.org.
    14. Zhang, Lan & Mykland, Per A. & Ait-Sahalia, Yacine, 2005. "A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 1394-1411, December.
    15. Sima Siami-Namini & Akbar Siami Namin, 2018. "Forecasting Economics and Financial Time Series: ARIMA vs. LSTM," Papers 1803.06386, arXiv.org.
    16. Chung, Kee H. & Chuwonganant, Chairat, 2018. "Market volatility and stock returns: The role of liquidity providers," Journal of Financial Markets, Elsevier, vol. 37(C), pages 17-34.
    17. Adamantios Ntakaris & Martin Magris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2017. "Benchmark Dataset for Mid-Price Forecasting of Limit Order Book Data with Machine Learning Methods," Papers 1705.03233, arXiv.org, revised Mar 2020.
    18. Justin Sirignano & Rama Cont, 2018. "Universal features of price formation in financial markets: perspectives from Deep Learning," Working Papers hal-01754054, HAL.
    19. Ole E. Barndorff-Nielsen & Peter Reinhard Hansen & Asger Lunde & Neil Shephard, 2008. "Designing Realized Kernels to Measure the ex post Variation of Equity Prices in the Presence of Noise," Econometrica, Econometric Society, vol. 76(6), pages 1481-1536, November.
    20. O. E. Barndorff-Nielsen & P. Reinhard Hansen & A. Lunde & N. Shephard, 2009. "Realized kernels in practice: trades and quotes," Econometrics Journal, Royal Economic Society, vol. 12(3), pages 1-32, November.
    21. Justin Sirignano, 2016. "Deep Learning for Limit Order Books," Papers 1601.01987, arXiv.org, revised Jul 2016.
    22. Andersen, Torben G & Bollerslev, Tim, 1998. "Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 39(4), pages 885-905, November.
    23. Gençay, Ramazan & Dacorogna, Michel & Muller, Ulrich A. & Pictet, Olivier & Olsen, Richard, 2001. "An Introduction to High-Frequency Finance," Elsevier Monographs, Elsevier, edition 1, number 9780122796715.
    24. Oomen, Roel C.A., 2006. "Properties of Realized Variance Under Alternative Sampling Schemes," Journal of Business & Economic Statistics, American Statistical Association, vol. 24, pages 219-237, April.
    25. Adamantios Ntakaris & Martin Magris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2018. "Benchmark dataset for mid‐price forecasting of limit order book data with machine learning methods," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 37(8), pages 852-866, December.
    26. O. B. Sezer & M. Ozbayoglu & E. Dogdu, 2017. "An Artificial Neural Network-based Stock Trading System Using Technical Analysis and Big Data Framework," Papers 1712.09592, arXiv.org.

    Cited by:

    1. Adamantios Ntakaris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2020. "Mid-price prediction based on machine learning methods with technical and quantitative indicators," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-39, June.
    2. Dat Thanh Tran & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2020. "Data Normalization for Bilinear Structures in High-Frequency Financial Time-series," Papers 2003.00598, arXiv.org, revised Jul 2020.
    3. Adamantios Ntakaris & Juho Kanniainen & Moncef Gabbouj & Alexandros Iosifidis, 2019. "Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators," Papers 1907.09452, arXiv.org.
    4. Abbasimehr, Hossein & Paki, Reza, 2021. "Prediction of COVID-19 confirmed cases combining deep learning methods and Bayesian optimization," Chaos, Solitons & Fractals, Elsevier, vol. 142(C).
    5. Michael Poli & Jinkyoo Park & Ilija Ilievski, 2019. "WATTNet: Learning to Trade FX via Hierarchical Spatio-Temporal Representation of Highly Multivariate Time Series," Papers 1909.10801, arXiv.org.
    6. Parisa Golbayani & Dan Wang & Ionut Florescu, 2020. "Application of Deep Neural Networks to assess corporate Credit Rating," Papers 2003.02334, arXiv.org.
    7. Xuekui Zhang & Yuying Huang & Ke Xu & Li Xing, 2023. "Novel modelling strategies for high-frequency stock trading data," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 9(1), pages 1-25, December.
    8. Adamantios Ntakaris & Moncef Gabbouj & Juho Kanniainen, 2023. "Optimum Output Long Short-Term Memory Cell for High-Frequency Trading Forecasting," Papers 2304.09840, arXiv.org, revised May 2023.
    9. Martin Magris & Mostafa Shabani & Alexandros Iosifidis, 2022. "Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets," Papers 2203.03613, arXiv.org, revised Jan 2023.
    10. Hong Guo & Jianwu Lin & Fanlin Huang, 2023. "Market Making with Deep Reinforcement Learning from Limit Order Books," Papers 2305.15821, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christensen, K. & Podolskij, M. & Thamrongrat, N. & Veliyev, B., 2017. "Inference from high-frequency data: A subsampling approach," Journal of Econometrics, Elsevier, vol. 197(2), pages 245-272.
    2. Liu, Lily Y. & Patton, Andrew J. & Sheppard, Kevin, 2015. "Does anything beat 5-minute RV? A comparison of realized measures across multiple asset classes," Journal of Econometrics, Elsevier, vol. 187(1), pages 293-311.
    3. Bu, Ruijun & Hizmeri, Rodrigo & Izzeldin, Marwan & Murphy, Anthony & Tsionas, Mike, 2023. "The contribution of jump signs and activity to forecasting stock price volatility," Journal of Empirical Finance, Elsevier, vol. 70(C), pages 144-164.
    4. Juho Kanniainen & Ye Yue, 2019. "The Arrival of News and Return Jumps in Stock Markets: A Nonparametric Approach," Papers 1901.02691, arXiv.org.
    5. Bollerslev, Tim & Patton, Andrew J. & Quaedvlieg, Rogier, 2016. "Exploiting the errors: A simple approach for improved volatility forecasting," Journal of Econometrics, Elsevier, vol. 192(1), pages 1-18.
    6. Christensen, Kim & Oomen, Roel & Podolskij, Mark, 2010. "Realised quantile-based estimation of the integrated variance," Journal of Econometrics, Elsevier, vol. 159(1), pages 74-98, November.
    7. Gerlach, Richard & Naimoli, Antonio & Storti, Giuseppe, 2018. "Time Varying Heteroskedastic Realized GARCH models for tracking measurement error bias in volatility forecasting," MPRA Paper 83893, University Library of Munich, Germany.
    8. Kim, Jihyun & Meddahi, Nour, 2020. "Volatility regressions with fat tails," Journal of Econometrics, Elsevier, vol. 218(2), pages 690-713.
    9. Andersen, Torben G. & Bollerslev, Tim & Christoffersen, Peter F. & Diebold, Francis X., 2013. "Financial Risk Measurement for Financial Risk Management," Handbook of the Economics of Finance, in: G.M. Constantinides & M. Harris & R. M. Stulz (ed.), Handbook of the Economics of Finance, volume 2, chapter 0, pages 1127-1220, Elsevier.
    10. Nielsen, Morten Ørregaard & Frederiksen, Per, 2008. "Finite sample accuracy and choice of sampling frequency in integrated volatility estimation," Journal of Empirical Finance, Elsevier, vol. 15(2), pages 265-286, March.
    11. Elena Ivona Dumitrescu & Georgiana-Denisa Banulescu, 2019. "Do High-frequency-based Measures Improve Conditional Covariance Forecasts?," Post-Print hal-03331122, HAL.
    12. Andersen, Torben G. & Bollerslev, Tim & Huang, Xin, 2011. "A reduced form framework for modeling volatility of speculative prices based on realized variation measures," Journal of Econometrics, Elsevier, vol. 160(1), pages 176-189, January.
    13. Torben G. Andersen & Tim Bollerslev & Per Frederiksen & Morten Ørregaard Nielsen, 2010. "Continuous-time models, realized volatilities, and testable distributional implications for daily stock returns," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 25(2), pages 233-261.
    14. Chaboud, Alain P. & Chiquoine, Benjamin & Hjalmarsson, Erik & Loretan, Mico, 2010. "Frequency of observation and the estimation of integrated volatility in deep and liquid financial markets," Journal of Empirical Finance, Elsevier, vol. 17(2), pages 212-240, March.
    15. Ysusi Carla, 2007. "Multipower Variation Under Market Microstructure Effects," Working Papers 2007-13, Banco de México.
    16. Mykland, Per A. & Zhang, Lan, 2016. "Between data cleaning and inference: Pre-averaging and robust estimators of the efficient price," Journal of Econometrics, Elsevier, vol. 194(2), pages 242-262.
    17. Giorgio Mirone, 2017. "Inference from the futures: ranking the noise cancelling accuracy of realized measures," CREATES Research Papers 2017-24, Department of Economics and Business Economics, Aarhus University.
    18. Patton, Andrew J., 2011. "Data-based ranking of realised volatility estimators," Journal of Econometrics, Elsevier, vol. 161(2), pages 284-303, April.
    19. Hansen, Peter R. & Lunde, Asger, 2006. "Realized Variance and Market Microstructure Noise," Journal of Business & Economic Statistics, American Statistical Association, vol. 24, pages 127-161, April.
    20. Timo Dimitriadis & Roxana Halbleib & Jeannine Polivka & Jasper Rennspies & Sina Streicher & Axel Friedrich Wolter, 2022. "Efficient Sampling for Realized Variance Estimation in Time-Changed Diffusion Models," Papers 2212.11833, arXiv.org, revised Dec 2023.
