Printed from https://ideas.repec.org/p/arx/papers/2506.03780.html

High-Dimensional Learning in Finance

Author

Listed:
  • Hasan Fallahgoul

Abstract

Recent advances in machine learning have shown promising results for financial prediction using large, over-parameterized models. This paper provides theoretical foundations and empirical validation for understanding when and how these methods achieve predictive success. I examine two key aspects of high-dimensional learning in finance. First, I prove that within-sample standardization in Random Fourier Features implementations fundamentally alters the underlying Gaussian kernel approximation, replacing shift-invariant kernels with training-set dependent alternatives. Second, I establish information-theoretic lower bounds that identify when reliable learning is impossible no matter how sophisticated the estimator. A detailed quantitative calibration of the polynomial lower bound shows that with typical parameter choices, e.g., 12,000 features, 12 monthly observations, and an R-squared of 2-3%, the required sample size to escape the bound exceeds 25-30 years of data, well beyond any rolling window actually used. Thus, observed out-of-sample success must originate from lower-complexity artefacts rather than from the intended high-dimensional mechanism.
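The first claim in the abstract can be illustrated with a minimal sketch; it is not the paper's code. It assumes paired cosine/sine Random Fourier Features with Gaussian frequencies, and it assumes that "within-sample standardization" means dividing each feature by its standard deviation over the training window (the paper's exact procedure may differ). The function names, the bandwidth parameter gamma, and the two 12-point training sets are illustrative choices.

```python
import math
import random


def draw_frequencies(num_freqs, dim, gamma, rng):
    """Draw w_i ~ N(0, gamma^2 I), targeting k(x, y) = exp(-gamma^2 ||x - y||^2 / 2)."""
    return [[rng.gauss(0.0, gamma) for _ in range(dim)] for _ in range(num_freqs)]


def rff(x, freqs):
    """Paired cos/sin features; rff(x) . rff(y) is the average of cos(w_i . (x - y))."""
    scale = 1.0 / math.sqrt(len(freqs))
    proj = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in freqs]
    return [scale * math.cos(p) for p in proj] + [scale * math.sin(p) for p in proj]


def column_stds(train, freqs):
    """Per-feature standard deviation over the training sample (assumed standardization)."""
    feats = [rff(x, freqs) for x in train]
    n = len(feats)
    stds = []
    for col in zip(*feats):
        mean = sum(col) / n
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / n))
    return stds


def raw_kernel(x, y, freqs):
    """Implied kernel of the unstandardized features: depends only on x - y."""
    return sum(a * b for a, b in zip(rff(x, freqs), rff(y, freqs)))


def standardized_kernel(x, y, freqs, stds):
    """Implied kernel after dividing each feature by its training-sample std."""
    return sum(a * b / (s * s) for a, b, s in zip(rff(x, freqs), rff(y, freqs), stds))


rng = random.Random(0)
dim, gamma = 3, 1.0
freqs = draw_frequencies(2000, dim, gamma, rng)

# Two hypothetical 12-observation training windows.
train_a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(12)]
train_b = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(12)]
stds_a = column_stds(train_a, freqs)
stds_b = column_stds(train_b, freqs)

x = [0.3, -0.2, 0.5]
y = [-0.1, 0.4, 0.2]
x_shift = [v + 1.0 for v in x]  # translate both points by the same vector
y_shift = [v + 1.0 for v in y]

sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
k_exact = math.exp(-0.5 * gamma ** 2 * sq_dist)

k_raw = raw_kernel(x, y, freqs)
k_raw_shifted = raw_kernel(x_shift, y_shift, freqs)
k_std_a = standardized_kernel(x, y, freqs, stds_a)
k_std_a_shifted = standardized_kernel(x_shift, y_shift, freqs, stds_a)
k_std_b = standardized_kernel(x, y, freqs, stds_b)

print("Gaussian kernel (exact):        ", k_exact)
print("raw RFF kernel:                 ", k_raw)
print("raw RFF kernel, shifted inputs: ", k_raw_shifted)
print("standardized kernel, window A:  ", k_std_a)
print("standardized, shifted inputs:   ", k_std_a_shifted)
print("standardized kernel, window B:  ", k_std_b)
```

In this sketch the raw kernel is unchanged when both inputs are translated by the same vector and tracks the Gaussian kernel, whereas the standardized kernel changes under the same translation and also changes when the training window is swapped, which is the training-set dependence the abstract describes.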

Suggested Citation

  • Hasan Fallahgoul, 2025. "High-Dimensional Learning in Finance," Papers 2506.03780, arXiv.org, revised Jul 2025.
  • Handle: RePEc:arx:papers:2506.03780

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2506.03780
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Changeun Kim & Younwoo Jeong & Bong-Gyu Jang, 2025. "Interpretable Deep Learning for Stock Returns: A Consensus-Bottleneck Asset Pricing Model," Papers 2512.16251, arXiv.org, revised Apr 2026.
    2. Cong, Lin William & Feng, Guanhao & He, Jingyu & He, Xin, 2025. "Growing the efficient frontier on panel trees," Journal of Financial Economics, Elsevier, vol. 167(C).
    3. Cheng, Tingting & Jiang, Shan & Zhao, Albert Bo & Zhao, Junyi, 2025. "Is machine learning a necessity? A regression-based approach for stock return prediction," Journal of Empirical Finance, Elsevier, vol. 81(C).
    4. Chai, Bailin & Jiang, Fuwei & Lin, Yihao & You, Tian, 2025. "Predicting bond risk premiums with machine learning: Evidence from China," Pacific-Basin Finance Journal, Elsevier, vol. 93(C).
    5. Lin William Cong & Guanhao Feng & Jingyu He & Xin He, 2022. "Growing the Efficient Frontier on Panel Trees," NBER Working Papers 30805, National Bureau of Economic Research, Inc.
    6. Jinbo Cai & Wenze Li & Wenjie Wang, 2025. "Electricity Market Predictability: Virtues of Machine Learning and Links to the Macroeconomy," Papers 2507.07477, arXiv.org.
    7. Bui, Dien Giau & Kong, De-Rong & Lin, Chih-Yung & Lin, Tse-Chun, 2023. "Momentum in machine learning: Evidence from the Taiwan stock market," Pacific-Basin Finance Journal, Elsevier, vol. 82(C).
    8. Faria, Gonçalo & Verona, Fabio, 2024. "Enhancing forecast accuracy through frequency-domain combination: Applications to financial and economic indicators," Bank of Finland Research Discussion Papers 14/2024, Bank of Finland.
    9. Bhaskar Goswami & Ajim Uddin, 2026. "Significance of predictors: revisiting stock return predictions using explainable AI," Annals of Operations Research, Springer, vol. 357(1), pages 223-257, February.
    10. Fallahgoul, Hasan & Franstianto, Vincentius & Lin, Xin, 2024. "Asset pricing with neural networks: Significance tests," Journal of Econometrics, Elsevier, vol. 238(1).
    11. Branco, Rafael R. & Rubesam, Alexandre & Zevallos, Mauricio, 2024. "Forecasting realized volatility: Does anything beat linear models?," Journal of Empirical Finance, Elsevier, vol. 78(C).
    12. Wu, Haoran & Gao, Zhiwei & Nie, Boyang & Zhao, Binru, 2025. "Can machines learn Chinese mutual funds?," Pacific-Basin Finance Journal, Elsevier, vol. 94(C).
    13. Hoang, Daniel & Wiegratz, Kevin, 2022. "Machine learning methods in finance: Recent applications and prospects," Working Paper Series in Economics 158, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    14. Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," Papers 2009.03394, arXiv.org, revised Feb 2026.
    15. Li, Bin & Rossi, Alberto G. & Yan, Xuemin (Sterling) & Zheng, Lingling, 2025. "Machine learning from a “Universe” of signals: The role of feature engineering," Journal of Financial Economics, Elsevier, vol. 172(C).
    16. Feng, Guanhao & He, Xin & Wang, Yanchu & Wu, Chunchi, 2025. "Predicting individual corporate bond returns," Journal of Banking & Finance, Elsevier, vol. 171(C).
    17. Eghbal Rahimikia & Stefan Zohren & Ser-Huang Poon, 2021. "Realised Volatility Forecasting: Machine Learning via Financial Word Embedding," Papers 2108.00480, arXiv.org, revised Apr 2026.
    18. Rad, Hossein & Low, Rand Kwong Yew & Miffre, Joëlle & Faff, Robert, 2023. "The commodity risk premium and neural networks," Journal of Empirical Finance, Elsevier, vol. 74(C).
    19. Shunyao Wang & Ming Cheng & Christina Dan Wang, 2025. "NewsNet-SDF: Stochastic Discount Factor Estimation with Pretrained Language Model News Embeddings via Adversarial Networks," Papers 2505.06864, arXiv.org.
    20. Borup, Daniel & Christensen, Bent Jesper & Mühlbach, Nicolaj Søndergaard & Nielsen, Mikkel Slot, 2023. "Targeting predictors in random forest regression," International Journal of Forecasting, Elsevier, vol. 39(2), pages 841-868.
