
Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure

Authors

  • Minshuo Chen
  • Renyuan Xu
  • Yumin Xu
  • Ruixun Zhang

Abstract

Financial scenario simulation is essential for risk management and portfolio optimization, yet it remains challenging, especially in the high-dimensional, small-data settings common in finance. We propose a diffusion factor model that integrates latent factor structure into generative diffusion processes, bridging econometrics with modern generative AI to address the curse of dimensionality and data scarcity in financial simulation. By exploiting the low-dimensional factor structure inherent in asset returns, we decompose the score function (a key component in diffusion models) using time-varying orthogonal projections, and we incorporate this decomposition into the design of the neural network architecture. We derive rigorous statistical guarantees, establishing nonasymptotic error bounds of O(d^{5/2} n^{-2/(k+5)}) for score estimation and O(d^{5/4} n^{-1/(2(k+5))}) for the generated distribution. These bounds are driven primarily by the intrinsic factor dimension k rather than the number of assets d, surpassing the dimension-dependent limits in the classical nonparametric statistics literature and making the framework viable for markets with thousands of assets. Numerical studies confirm superior performance in latent subspace recovery under small-data regimes. Empirical analysis demonstrates the economic significance of our framework in constructing mean-variance optimal portfolios and factor portfolios. This work presents the first theoretical integration of factor structure with diffusion models, offering a principled approach for high-dimensional financial simulation with limited data. Our code is available at https://github.com/xymmmm00/diffusion_factor_model.
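
The score decomposition described in the abstract can be illustrated in a simplified Gaussian setting. The sketch below is not the authors' implementation (their code is at the GitHub link above). It assumes returns R = B F + e with standard normal factors F, homoskedastic noise e of variance sigma2, and a variance-preserving forward process dR_t = -(1/2) R_t dt + dW_t; the function name `gaussian_factor_score` and all parameter names are hypothetical. Under these assumptions the score of the noised returns has a closed form and splits into a component inside the k-dimensional factor subspace span(B) and a component in its orthogonal complement. Because the noise is assumed homoskedastic, the projection here does not vary with time, unlike the time-varying projections the abstract refers to.

```python
import numpy as np

def gaussian_factor_score(r, B, sigma2, t):
    """Closed-form score of noised returns at time t, split into two parts.

    Assumed model (illustrative only): R_0 = B F + e, F ~ N(0, I_k),
    e ~ N(0, sigma2 * I_d), and R_t = exp(-t/2) R_0 + sqrt(1 - exp(-t)) Z.
    Then R_t ~ N(0, Sigma_t) with Sigma_t = alpha2 * B B^T + c * I_d.
    """
    alpha2 = np.exp(-t)              # squared signal scaling at time t
    c = alpha2 * sigma2 + (1.0 - alpha2)  # variance off the factor subspace
    # Orthonormal basis U of span(B); s**2 are the eigenvalues of B B^T on it.
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    coords = U.T @ r                 # coordinates of r in the factor subspace
    on_subspace = -U @ (coords / (alpha2 * s**2 + c))
    off_subspace = -(r - U @ coords) / c
    return on_subspace, off_subspace

# Small demo: the two parts sum to the direct score -Sigma_t^{-1} r.
rng = np.random.default_rng(0)
d, k, sigma2, t = 50, 3, 0.5, 0.3
B = rng.normal(size=(d, k))
r = rng.normal(size=d)
on, off = gaussian_factor_score(r, B, sigma2, t)
Sigma_t = np.exp(-t) * (B @ B.T + sigma2 * np.eye(d)) + (1 - np.exp(-t)) * np.eye(d)
print(np.allclose(on + off, -np.linalg.solve(Sigma_t, r)))
```

Only the subspace component involves the factor loadings, and it acts on k coordinates rather than d. This is the sense in which, in this toy case, the hard part of the score is low-dimensional; the complement component is a simple linear shrinkage.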

Suggested Citation

  • Minshuo Chen & Renyuan Xu & Yumin Xu & Ruixun Zhang, 2025. "Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure," Papers 2504.06566, arXiv.org, revised May 2025.
  • Handle: RePEc:arx:papers:2504.06566

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2504.06566
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Svetlana Bryzgalova & Jiantao Huang & Christian Julliard, 2023. "Bayesian Solutions for the Factor Zoo: We Just Ran Two Quadrillion Models," Journal of Finance, American Finance Association, vol. 78(1), pages 487-557, February.
    2. Bryzgalova, Svetlana & Huang, Jiantao & Julliard, Christian, 2023. "Bayesian solutions for the factor zoo: we just ran two quadrillion models," LSE Research Online Documents on Economics 126151, London School of Economics and Political Science, LSE Library.
    3. Francisco Peñaranda & Enrique Sentana, 2024. "Portfolio management with big data," Working Papers wp2024_2411, CEMFI.
    4. Matteo Bagnara, 2024. "Asset Pricing and Machine Learning: A critical review," Journal of Economic Surveys, Wiley Blackwell, vol. 38(1), pages 27-56, February.
    5. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2019. "A diagnostic criterion for approximate factor structure," Journal of Econometrics, Elsevier, vol. 212(2), pages 503-521.
    6. Stefano Giglio & Dacheng Xiu, 2017. "Inference on Risk Premia in the Presence of Omitted Factors," NBER Working Papers 23527, National Bureau of Economic Research, Inc.
    7. Keunbae Ahn, 2021. "Predictable Fluctuations in the Cross-Section and Time-Series of Asset Prices," PhD Thesis, Finance Discipline Group, UTS Business School, University of Technology, Sydney, number 1-2021, January-A.
    8. Clarke, Charles, 2022. "The level, slope, and curve factor model for stocks," Journal of Financial Economics, Elsevier, vol. 143(1), pages 159-187.
    9. Zura Kakushadze & Willie Yu, 2016. "Multifactor Risk Models and Heterotic CAPM," Papers 1602.04902, arXiv.org, revised Mar 2016.
    10. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2020. "Estimation of large dimensional conditional factor models in finance," Handbook of Econometrics, in: Steven N. Durlauf & Lars Peter Hansen & James J. Heckman & Rosa L. Matzkin (ed.), Handbook of Econometrics, edition 1, volume 7, chapter 0, pages 219-282, Elsevier.
    11. De Nard, Gianluca & Zhao, Zhao, 2023. "Using, taming or avoiding the factor zoo? A double-shrinkage estimator for covariance matrices," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 23-35.
    12. Guanhao Feng & Stefano Giglio & Dacheng Xiu, 2020. "Taming the Factor Zoo: A Test of New Factors," Journal of Finance, American Finance Association, vol. 75(3), pages 1327-1370, June.
    13. Fieberg, Christian & Liedtke, Gerrit & Zaremba, Adam & Cakici, Nusret, 2025. "A factor model for the cross-section of country equity risk premia," Journal of Banking & Finance, Elsevier, vol. 171(C).
    14. Zura Kakushadze & Willie Yu, 2016. "Statistical Risk Models," Papers 1602.08070, arXiv.org, revised Jan 2017.
    15. Weichuan Deng & Pawel Polak & Abolfazl Safikhani & Ronakdilip Shah, 2023. "A Unified Framework for Fast Large-Scale Portfolio Optimization," Papers 2303.12751, arXiv.org, revised Nov 2023.
    16. Maio, Paulo & Philip, Dennis, 2018. "Economic activity and momentum profits: Further evidence," Journal of Banking & Finance, Elsevier, vol. 88(C), pages 466-482.
    17. Morana, Claudio, 2014. "Insights on the global macro-finance interface: Structural sources of risk factor fluctuations and the cross-section of expected stock returns," Journal of Empirical Finance, Elsevier, vol. 29(C), pages 64-79.
    18. Alessi, Lucia & Ossola, Elisa & Panzica, Roberto, 2023. "When do investors go green? Evidence from a time-varying asset-pricing model," International Review of Financial Analysis, Elsevier, vol. 90(C).
    19. Ni, Xuanming & Zheng, Tiantian & Zhao, Huimin & Zhu, Shushang, 2023. "High-dimensional portfolio optimization based on tree-structured factor model," Pacific-Basin Finance Journal, Elsevier, vol. 81(C).
    20. Zhang, Xiang & Liu, Yangyi & Wu, Kun & Maillet, Bertrand, 2021. "Tradable or nontradable factors—what does the Hansen–Jagannathan distance tell us?," International Review of Economics & Finance, Elsevier, vol. 71(C), pages 853-879.

