Spurious Predictability in Financial Machine Learning

Author

  • Sotirios D. Nikolopoulos

Abstract

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.
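The first claim of the abstract, that searching over specifications yields significant-looking backtests even when returns are unpredictable, can be illustrated with a small simulation. The sketch below is not the paper's audit procedure. It assumes independent candidate strategies whose returns are i.i.d. zero-mean Gaussian noise (one simple martingale-difference sequence), and the names `t_stat` and `selection_gap` are hypothetical. It picks the strategy with the largest in-sample t-statistic and re-evaluates that same strategy on a disjoint window.

```python
import math
import random
import statistics


def t_stat(xs):
    """t-statistic of the sample mean against zero."""
    n = len(xs)
    return statistics.fmean(xs) / (statistics.stdev(xs) / math.sqrt(n))


def selection_gap(n_specs=200, n_in=250, n_out=250, seed=0):
    """Select the best of n_specs pure-noise strategies in sample, then
    score the selected strategy on a disjoint out-of-sample window.

    Assumption: specifications are independent. The abstract covers
    correlated searches, which this sketch does not model.
    """
    rng = random.Random(seed)
    best_in, best_out = -math.inf, None
    for _ in range(n_specs):
        # Zero-mean i.i.d. returns: no predictability by construction.
        r_in = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        r_out = [rng.gauss(0.0, 1.0) for _ in range(n_out)]
        t_in = t_stat(r_in)
        if t_in > best_in:
            best_in, best_out = t_in, t_stat(r_out)
    return best_in, best_out


if __name__ == "__main__":
    t_in, t_out = selection_gap()
    print(f"selected in-sample t = {t_in:.2f}, out-of-sample t = {t_out:.2f}")
    # Rough extreme-value benchmark for the maximum of K independent
    # standard normal draws, here with K = 200.
    print(f"sqrt(2 ln K) = {math.sqrt(2 * math.log(200)):.2f}")
```

Under these assumptions the selected in-sample t-statistic is typically well above conventional significance thresholds, while the out-of-sample t-statistic of the same strategy scatters around zero. The difference between the two is one crude way to picture selection-induced inflation. With correlated specifications the effective number of independent trials is smaller than the raw count, which is presumably what the abstract's adjustment for effective multiplicity addresses.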

Suggested Citation

  • Sotirios D. Nikolopoulos, 2026. "Spurious Predictability in Financial Machine Learning," Papers 2604.15531, arXiv.org.
  • Handle: RePEc:arx:papers:2604.15531

    Download full text from publisher

    File URL: https://arxiv.org/pdf/2604.15531
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    2. Blanco, Ivan & De Jesus, Miguel & Remesal, Alvaro, 2023. "Overlapping momentum portfolios," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 1-22.
    3. Beckmeyer, Heiner & Wiedemann, Timo, 2025. "All Days Are Not Created Equal: Understanding Momentum by Learning to Weight Past Returns," Journal of Banking & Finance, Elsevier, vol. 181(C).
    4. Fausch, Jürg & Frigg, Moreno & Ruenzi, Stefan & Weigert, Florian, 2026. "Machine learning mutual fund flows," CFR Working Papers 26-03, University of Cologne, Centre for Financial Research (CFR).
    5. Changeun Kim & Younwoo Jeong & Bong-Gyu Jang, 2025. "Interpretable Deep Learning for Stock Returns: A Consensus-Bottleneck Asset Pricing Model," Papers 2512.16251, arXiv.org, revised Apr 2026.
    6. Obaid, Khaled & Pukthuanthong, Kuntara, 2022. "A picture is worth a thousand words: Measuring investor sentiment by combining machine learning and photos from news," Journal of Financial Economics, Elsevier, vol. 144(1), pages 273-297.
    7. Paul Handro & Bogdan Dima, 2024. "Analyzing Financial Markets Efficiency: Insights from a Bibliometric and Content Review," Journal of Financial Studies, Institute of Financial Studies, vol. 16(9), pages 119-175, May.
    8. DeMiguel, Victor & Gil-Bazo, Javier & Nogales, Francisco J. & Santos, André A.P., 2023. "Machine learning and fund characteristics help to select mutual funds with positive alpha," Journal of Financial Economics, Elsevier, vol. 150(3).
    9. Jiaju Miao & Pawel Polak, 2023. "Online Ensemble Learning for Sector Rotation: A Gradient-Free Framework," Papers 2304.09947, arXiv.org, revised Nov 2025.
    10. Doron Avramov & Si Cheng & Lior Metzker, 2023. "Machine Learning vs. Economic Restrictions: Evidence from Stock Return Predictability," Management Science, INFORMS, vol. 69(5), pages 2587-2619, May.
    11. Allen Yikuan Huang & Zheqi Fan, 2026. "Beyond Prompting: An Autonomous Framework for Systematic Factor Investing via Agentic AI," Papers 2603.14288, arXiv.org, revised Apr 2026.
    12. Lioui, Abraham & Tarelli, Andrea, 2022. "Chasing the ESG factor," Journal of Banking & Finance, Elsevier, vol. 139(C).
    13. Min, Byoung-Kyu & Roh, Tai-Yong, 2025. "Can machine learning uncover abnormal returns in uncharted financial territories?," Pacific-Basin Finance Journal, Elsevier, vol. 94(C).
    14. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    15. Guillaume Chevalier & Guillaume Coqueret & Thomas Raffinot, 2022. "Supervised portfolios," Post-Print hal-04144588, HAL.
    16. Tian Ma & Cunfei Liao & Fuwei Jiang, 2023. "Timing the factor zoo via deep learning: Evidence from China," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 63(1), pages 485-505, March.
    17. Cakici, Nusret & Zaremba, Adam, 2021. "Liquidity and the cross-section of international stock returns," Journal of Banking & Finance, Elsevier, vol. 127(C).
    18. Hai Lin & Pengfei Liu & Cheng Zhang, 2023. "The trend premium around the world: Evidence from the stock market," International Review of Finance, International Review of Finance Ltd., vol. 23(2), pages 317-358, June.
    19. Cakici, Nusret & Fieberg, Christian & Metko, Daniel & Zaremba, Adam, 2023. "Machine learning goes global: Cross-sectional return predictability in international stock markets," Journal of Economic Dynamics and Control, Elsevier, vol. 155(C).
    20. Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can machine learning help to select portfolios of mutual funds?," Economics Working Papers 1772, Department of Economics and Business, Universitat Pompeu Fabra.
