High-Dimensional Learning in Finance

Author

Hasan Fallahgoul

Abstract

Recent advances in machine learning have shown promising results for financial prediction using large, over-parameterized models. This paper provides theoretical foundations and empirical validation for understanding when and how these methods achieve predictive success. I examine two key aspects of high-dimensional learning in finance. First, I prove that within-sample standardization in Random Fourier Features implementations fundamentally alters the underlying Gaussian kernel approximation, replacing shift-invariant kernels with training-set dependent alternatives. Second, I establish information-theoretic lower bounds that identify when reliable learning is impossible no matter how sophisticated the estimator. A detailed quantitative calibration of the polynomial lower bound shows that with typical parameter choices, e.g., 12,000 features, 12 monthly observations, and an R-squared of 2-3%, the required sample size to escape the bound exceeds 25-30 years of data, well beyond any rolling window actually used. Thus, observed out-of-sample success must originate from lower-complexity artefacts rather than from the intended high-dimensional mechanism.
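
The first claim in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's code: it assumes that "within-sample standardization" means dividing each feature column by its standard deviation over the training window (no demeaning), and all names and parameter values (rff, implied_kernel, train_std, d, P, T, gamma) are hypothetical choices made for illustration. It compares the kernel implied by plain Random Fourier Features with the one implied after standardizing on two different 12-observation training windows.

```python
import numpy as np

def rff(X, W, b):
    """Unstandardized features sqrt(2) * cos(W x + b), one row per observation."""
    return np.sqrt(2.0) * np.cos(X @ W.T + b)

def implied_kernel(x, y, W, b, scale=None):
    """Average over features of phi_i(x) * phi_i(y); `scale` divides each feature."""
    fx = rff(x[None, :], W, b)[0]
    fy = rff(y[None, :], W, b)[0]
    if scale is not None:
        fx, fy = fx / scale, fy / scale
    return float(np.mean(fx * fy))

def train_std(X_train, W, b):
    """Per-feature standard deviation computed on the training window only."""
    return rff(X_train, W, b).std(axis=0)

rng = np.random.default_rng(0)
d, P, T, gamma = 3, 20_000, 12, 1.0          # illustrative values (assumptions)
W = rng.normal(scale=gamma, size=(P, d))     # frequencies w_i ~ N(0, gamma^2 I)
b = rng.uniform(0.0, 2.0 * np.pi, size=P)    # phases b_i ~ U[0, 2*pi]

# Two fixed evaluation points and the Gaussian kernel the raw features target.
x, y = rng.normal(size=d), rng.normal(size=d)
k_exact = float(np.exp(-0.5 * gamma**2 * np.sum((x - y) ** 2)))
k_raw = implied_kernel(x, y, W, b)           # does not use any training data

# Same points, same features, two different training windows of T observations.
train_a = rng.normal(size=(T, d))
train_b = rng.normal(size=(T, d))
k_std_a = implied_kernel(x, y, W, b, scale=train_std(train_a, W, b))
k_std_b = implied_kernel(x, y, W, b, scale=train_std(train_b, W, b))

print("exact Gaussian kernel  :", k_exact)
print("raw RFF approximation  :", k_raw)
print("standardized, window A :", k_std_a)
print("standardized, window B :", k_std_b)
```

In this sketch the raw approximation depends only on the two evaluation points and the random draws of W and b, whereas the standardized value changes with the training window, which is the training-set dependence the abstract describes. The sketch does not reproduce the paper's proof or its empirical setup.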

Suggested Citation

Hasan Fallahgoul, 2025. "High-Dimensional Learning in Finance," Papers 2506.03780, arXiv.org, revised Jul 2025.

Handle: RePEc:arx:papers:2506.03780

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Cong, Lin William & Feng, Guanhao & He, Jingyu & He, Xin, 2025. "Growing the efficient frontier on panel trees," Journal of Financial Economics, Elsevier, vol. 167(C).
Lin William Cong & Guanhao Feng & Jingyu He & Xin He, 2022. "Growing the Efficient Frontier on Panel Trees," NBER Working Papers 30805, National Bureau of Economic Research, Inc.
- Lin William Cong & Guanhao Feng & Jingyu He & Xin He, 2025. "Growing the Efficient Frontier on Panel Trees," Papers 2501.16730, arXiv.org, revised Feb 2025.
Cheng, Tingting & Jiang, Shan & Zhao, Albert Bo & Zhao, Junyi, 2025. "Is machine learning a necessity? A regression-based approach for stock return prediction," Journal of Empirical Finance, Elsevier, vol. 81(C).
Eghbal Rahimikia & Stefan Zohren & Ser-Huang Poon, 2021. "Realised Volatility Forecasting: Machine Learning via Financial Word Embedding," Papers 2108.00480, arXiv.org, revised Nov 2024.
Jiajun Gu & Zichen Yang & Xintong Lin & Sixun Chen & YuTing Lu, 2024. "AI-Enhanced Factor Analysis for Predicting S&P 500 Stock Dynamics," Papers 2412.12438, arXiv.org.
Qian, Yihe & Zhang, Yang, 2025. "Long-term forecasting in asset pricing: Machine learning models’ sensitivity to macroeconomic shifts and firm-specific factors," The North American Journal of Economics and Finance, Elsevier, vol. 78(C).
Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can machine learning help to select portfolios of mutual funds?," Economics Working Papers 1772, Department of Economics and Business, Universitat Pompeu Fabra.
Jinbo Cai & Wenze Li & Wenjie Wang, 2025. "Electricity Market Predictability: Virtues of Machine Learning and Links to the Macroeconomy," Papers 2507.07477, arXiv.org.
Branco, Rafael R. & Rubesam, Alexandre & Zevallos, Mauricio, 2024. "Forecasting realized volatility: Does anything beat linear models?," Journal of Empirical Finance, Elsevier, vol. 78(C).
- Rafael Branco & Alexandre Rubesam & Mauricio Zevallos, 2024. "Forecasting realized volatility: Does anything beat linear models?," Post-Print hal-04835657, HAL.
Bui, Dien Giau & Kong, De-Rong & Lin, Chih-Yung & Lin, Tse-Chun, 2023. "Momentum in machine learning: Evidence from the Taiwan stock market," Pacific-Basin Finance Journal, Elsevier, vol. 82(C).
Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can Machine Learning Help to Select Portfolios of Mutual Funds?," Working Papers 1245, Barcelona School of Economics.
Matteo Bagnara, 2024. "Asset Pricing and Machine Learning: A critical review," Journal of Economic Surveys, Wiley Blackwell, vol. 38(1), pages 27-56, February.
Faria, Gonçalo & Verona, Fabio, 2024. "Enhancing forecast accuracy through frequency-domain combination: Applications to financial and economic indicators," Bank of Finland Research Discussion Papers 14/2024, Bank of Finland.
Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," CERGE-EI Working Papers wp677, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
- Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," Papers 2009.03394, arXiv.org, revised Jul 2021.
Kaniel, Ron & Lin, Zihan & Pelger, Markus & Van Nieuwerburgh, Stijn, 2023. "Machine-learning the skill of mutual fund managers," Journal of Financial Economics, Elsevier, vol. 150(1), pages 94-138.
- Ron Kaniel & Zihan Lin & Markus Pelger & Stijn Van Nieuwerburgh, 2022. "Machine-Learning the Skill of Mutual Fund Managers," NBER Working Papers 29723, National Bureau of Economic Research, Inc.
- Kaniel, Ron & Lin, Zihan & Pelger, Markus & Van Nieuwerburgh, Stijn, 2023. "Machine-Learning the Skill of Mutual Fund Managers," CEPR Discussion Papers 18129, C.E.P.R. Discussion Papers.
Fallahgoul, Hasan & Franstianto, Vincentius & Lin, Xin, 2024. "Asset pricing with neural networks: Significance tests," Journal of Econometrics, Elsevier, vol. 238(1).
Hoang, Daniel & Wiegratz, Kevin, 2022. "Machine learning methods in finance: Recent applications and prospects," Working Paper Series in Economics 158, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
Shanyan Lai, 2025. "Multilayer Perceptron Neural Network Models in Asset Pricing: An Empirical Study on Large-Cap US Stocks," Papers 2505.01921, arXiv.org, revised May 2025.
Jorge Guijarro-Ordonez & Markus Pelger & Greg Zanotti, 2021. "Deep Learning Statistical Arbitrage," Papers 2106.04028, arXiv.org, revised Oct 2022.
Jiang, Hao & Li, Sophia Zhengzi & Yuan, Peixuan, 2025. "Granular information and sectoral movements," Journal of Economic Dynamics and Control, Elsevier, vol. 171(C).

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-BIG-2025-06-23 (Big Data)
NEP-ECM-2025-06-23 (Econometrics)
NEP-FOR-2025-06-23 (Forecasting)
NEP-MAC-2025-06-23 (Macroeconomics)
