Printed from https://ideas.repec.org/p/zbw/iwqwdp/032016.html

Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500

Author

Listed:
  • Krauss, Christopher
  • Do, Xuan Anh
  • Huck, Nicolas

Abstract

In recent years, machine learning research has gained momentum: New developments in the field of deep learning allow for multiple levels of abstraction and are starting to supersede well-known and powerful tree-based techniques mainly operating on the original feature space. All these methods can be applied to various fields, including finance. This article implements and analyses the effectiveness of deep neural networks (DNN), gradient-boosted trees (GBT), random forests (RAF), and a combination (ENS) of these methods in the context of statistical arbitrage. Each model is trained on lagged returns of all stocks in the S&P 500, after elimination of survivor bias. From 1992 to 2015, daily one-day-ahead trading signals are generated based on the forecast probability that a stock outperforms the general market. The highest k probabilities are converted into long and the lowest k probabilities into short positions, thus censoring the less certain middle part of the ranking. Empirical findings are promising. A simple ensemble consisting of one deep neural network, one gradient-boosted tree, and one random forest produces out-of-sample returns exceeding 0.45 percent per day for k = 10, prior to transaction costs. Although profits have declined in recent years, our findings pose a severe challenge to the semi-strong form of market efficiency.
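
The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the equal weighting of the three model probabilities, the alphabetical tie-breaking, and the equal-weighted long-short return are all assumptions made for illustration, and the tickers and numbers are toy values.

```python
from statistics import mean


def ensemble_probability(p_dnn, p_gbt, p_raf):
    """Average three per-stock probability forecasts.

    Equal weighting is an assumption; the abstract only says "simple ensemble".
    """
    return {s: (p_dnn[s] + p_gbt[s] + p_raf[s]) / 3.0 for s in p_dnn}


def select_positions(probs, k):
    """Return (longs, shorts): the k highest and k lowest probability stocks.

    Stocks in the middle of the ranking are ignored ("censored").
    """
    if k < 1 or 2 * k > len(probs):
        raise ValueError("k must satisfy 1 <= k <= len(probs) / 2")
    # Sort by descending probability; break ties by ticker (assumed rule).
    ranked = sorted(probs, key=lambda s: (-probs[s], s))
    return ranked[:k], ranked[-k:]


def long_short_return(returns, longs, shorts):
    """Equal-weighted long leg minus equal-weighted short leg (assumed weighting)."""
    return mean(returns[s] for s in longs) - mean(returns[s] for s in shorts)


if __name__ == "__main__":
    # Toy probabilities that each stock outperforms the market tomorrow.
    p_dnn = {"AAA": 0.70, "BBB": 0.55, "CCC": 0.30, "DDD": 0.45}
    p_gbt = {"AAA": 0.65, "BBB": 0.50, "CCC": 0.35, "DDD": 0.40}
    p_raf = {"AAA": 0.60, "BBB": 0.52, "CCC": 0.25, "DDD": 0.48}
    ens = ensemble_probability(p_dnn, p_gbt, p_raf)
    longs, shorts = select_positions(ens, k=1)
    print("long:", longs, "short:", shorts)
```

With k = 10 on roughly 500 index constituents, this kind of selection trades only 20 stocks per day and leaves the remaining forecasts unused.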

Suggested Citation

  • Krauss, Christopher & Do, Xuan Anh & Huck, Nicolas, 2016. "Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500," FAU Discussion Papers in Economics 03/2016, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
  • Handle: RePEc:zbw:iwqwdp:032016

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/130166/1/856307327.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Leung, Mark T. & Daouk, Hazem & Chen, An-Sing, 2000. "Forecasting stock indices: a comparison of classification and level estimation models," International Journal of Forecasting, Elsevier, vol. 16(2), pages 173-190.
    2. Fernandes, Marcelo & Medeiros, Marcelo C. & Scharth, Marcel, 2014. "Modeling and predicting the CBOE market volatility index," Journal of Banking & Finance, Elsevier, vol. 40(C), pages 1-10.
    3. François Longin & Bruno Solnik, 2001. "Extreme Correlation of International Equity Markets," Journal of Finance, American Finance Association, vol. 56(2), pages 649-676, April.
    4. Jacobs, Heiko, 2015. "What explains the dynamics of 100 anomalies?," Journal of Banking & Finance, Elsevier, vol. 57(C), pages 65-85.
    5. Khandani, Amir E. & Lo, Andrew W., 2011. "What happened to the quants in August 2007? Evidence from factors and transactions data," Journal of Financial Markets, Elsevier, vol. 14(1), pages 1-46, February.
    6. Fama, Eugene F. & French, Kenneth R., 2015. "A five-factor asset pricing model," Journal of Financial Economics, Elsevier, vol. 116(1), pages 1-22.
    7. Evan Gatev & William N. Goetzmann & K. Geert Rouwenhorst, 2006. "Pairs Trading: Performance of a Relative-Value Arbitrage Rule," Review of Financial Studies, Society for Financial Studies, vol. 19(3), pages 797-827.
    8. Pesaran, M Hashem & Timmermann, Allan, 1992. "A Simple Nonparametric Test of Predictive Performance," Journal of Business & Economic Statistics, American Statistical Association, vol. 10(4), pages 561-565, October.
    9. Nicolas Huck, 2015. "Pairs trading: does volatility timing matter?," Applied Economics, Taylor & Francis Journals, vol. 47(57), pages 6239-6256, December.
    10. Sermpinis, Georgios & Theofilatos, Konstantinos & Karathanasopoulos, Andreas & Georgopoulos, Efstratios F. & Dunis, Christian, 2013. "Forecasting foreign exchange rates with adaptive neural networks using radial-basis functions and Particle Swarm Optimization," European Journal of Operational Research, Elsevier, vol. 225(3), pages 528-540.
    11. Zhang, Guoqiang & Patuwo, B. Eddy & Hu, Michael Y., 1998. "Forecasting with artificial neural networks: The state of the art," International Journal of Forecasting, Elsevier, vol. 14(1), pages 35-62, March.
    12. Sadka, Ronnie, 2010. "Liquidity risk and the cross-section of hedge-fund returns," Journal of Financial Economics, Elsevier, vol. 98(1), pages 54-71, October.
    13. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    14. Huck, Nicolas, 2009. "Pairs selection and outranking: An application to the S&P 100 index," European Journal of Operational Research, Elsevier, vol. 196(2), pages 819-825, July.
    15. Jacobs, Heiko & Weber, Martin, 2015. "On the determinants of pairs trading profitability," Journal of Financial Markets, Elsevier, vol. 23(C), pages 75-97.
    16. Carhart, Mark M, 1997. "On Persistence in Mutual Fund Performance," Journal of Finance, American Finance Association, vol. 52(1), pages 57-82, March.
    17. Huck, Nicolas, 2010. "Pairs trading and outranking: The multi-step-ahead forecasting case," European Journal of Operational Research, Elsevier, vol. 207(3), pages 1702-1716, December.
    18. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    19. Marco Avellaneda & Jeong-Hyun Lee, 2010. "Statistical arbitrage in the US equities market," Quantitative Finance, Taylor & Francis Journals, vol. 10(7), pages 761-782.
    20. Timofei Bogomolov, 2013. "Pairs trading based on statistical variability of the spread process," Quantitative Finance, Taylor & Francis Journals, vol. 13(9), pages 1411-1430, September.
    21. Aiolfi, Marco & Timmermann, Allan, 2006. "Persistence in forecasting performance and conditional combination strategies," Journal of Econometrics, Elsevier, vol. 135(1-2), pages 31-53.
    22. Fama, Eugene F & French, Kenneth R, 1996. "Multifactor Explanations of Asset Pricing Anomalies," Journal of Finance, American Finance Association, vol. 51(1), pages 55-84, March.
    23. Genre, Véronique & Kenny, Geoff & Meyler, Aidan & Timmermann, Allan, 2013. "Combining expert forecasts: Can anything beat the simple average?," International Journal of Forecasting, Elsevier, vol. 29(1), pages 108-121.
    24. Mark W. Watson & James H. Stock, 2004. "Combination forecasts of output growth in a seven-country data set," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 23(6), pages 405-430.

    Citations


    Cited by:

    1. Chariton Chalvatzis & Dimitrios Hristu-Varsakelis, 2019. "High-performance stock index trading: making effective use of a deep LSTM neural network," Papers 1902.03125, arXiv.org, revised May 2019.
    2. repec:taf:quantf:v:18:y:2018:i:1:p:121-138 is not listed on IDEAS
    3. repec:eee:ejores:v:278:y:2019:i:1:p:330-342 is not listed on IDEAS
    4. Lukas Ryll & Sebastian Seidens, 2019. "Evaluating the Performance of Machine Learning Algorithms in Financial Market Forecasting: A Comprehensive Survey," Papers 1906.07786, arXiv.org, revised Jul 2019.
    5. Schnaubelt, Matthias & Fischer, Thomas G. & Krauss, Christopher, 2018. "Separating the signal from the noise - financial machine learning for Twitter," FAU Discussion Papers in Economics 14/2018, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    6. repec:eco:journ1:2017-04-76 is not listed on IDEAS
    7. Knoll, Julian & Stübinger, Johannes & Grottke, Michael, 2017. "Exploiting social media with higher-order Factorization Machines: Statistical arbitrage on high-frequency data of the S&P 500," FAU Discussion Papers in Economics 13/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    8. Stübinger, Johannes & Endres, Sylvia, 2017. "Pairs trading with a mean-reverting jump-diffusion model on high-frequency data," FAU Discussion Papers in Economics 10/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    9. repec:eee:ejores:v:270:y:2018:i:2:p:654-669 is not listed on IDEAS
    10. Shanka Subhra Mondal & Sharada Prasanna Mohanty & Benjamin Harlander & Mehmet Koseoglu & Lance Rane & Kirill Romanov & Wei-Kai Liu & Pranoot Hatwar & Marcel Salathe & Joe Byrum, 2019. "Investment Ranking Challenge: Identifying the best performing stocks based on their semi-annual returns," Papers 1906.08636, arXiv.org.
    11. repec:spr:fininn:v:5:y:2019:i:1:d:10.1186_s40854-019-0125-5 is not listed on IDEAS
    12. Clegg, Matthew & Krauss, Christopher, 2016. "Pairs trading with partial cointegration," FAU Discussion Papers in Economics 05/2016, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    13. repec:taf:quantf:v:18:y:2018:i:10:p:1735-1751 is not listed on IDEAS
    14. repec:eee:intfor:v:35:y:2019:i:1:p:390-407 is not listed on IDEAS
    15. repec:eee:pacfin:v:53:y:2019:i:c:p:186-207 is not listed on IDEAS
    16. Masaya Abe & Hideki Nakayama, 2018. "Deep Learning for Forecasting Stock Returns in the Cross-Section," Papers 1801.01777, arXiv.org, revised Jun 2018.
    17. repec:gam:jjrfmx:v:12:y:2019:i:1:p:31-:d:205554 is not listed on IDEAS
    18. Fischer, Thomas & Krauss, Christopher, 2017. "Deep learning with long short-term memory networks for financial market predictions," FAU Discussion Papers in Economics 11/2017, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    19. Sang Il Lee & Seong Joon Yoo, 2017. "Threshold-Based Portfolio: The Role of the Threshold and Its Applications," Papers 1709.09822, arXiv.org, revised Aug 2018.
    20. repec:eee:ejores:v:277:y:2019:i:1:p:351-365 is not listed on IDEAS
    21. Adriano Soares Koshiyama & Nick Firoozye & Philip Treleaven, 2018. "A Machine Learning-based Recommendation System for Swaptions Strategies," Papers 1810.02125, arXiv.org.
    22. Fischer, Thomas G., 2018. "Reinforcement learning in financial markets - a survey," FAU Discussion Papers in Economics 12/2018, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.

    More about this item

    Keywords

    statistical arbitrage; deep learning; gradient-boosting; random forests; ensemble learning
