Printed from https://ideas.repec.org/p/arx/papers/2512.21791.html

Synthetic Financial Data Generation for Enhanced Financial Modelling

Authors

  • Christophe D. Hounwanou
  • Yae Ulrich Gaba
  • Pierre Ntakirutimana

Abstract

Data scarcity and confidentiality in finance often impede model development and robust testing. This paper presents a unified multi-criteria evaluation framework for synthetic financial data and applies it to three representative generative paradigms: the statistical ARIMA-GARCH baseline, Variational Autoencoders (VAEs), and Time-series Generative Adversarial Networks (TimeGAN). Using historical S&P 500 daily data, we evaluate fidelity (Maximum Mean Discrepancy, MMD), temporal structure (autocorrelation and volatility clustering), and practical utility in downstream tasks, specifically mean-variance portfolio optimization and volatility forecasting. Empirical results indicate that ARIMA-GARCH captures linear trends and conditional volatility but fails to reproduce nonlinear dynamics; VAEs produce smooth trajectories that underestimate extreme events; and TimeGAN achieves the best trade-off between realism and temporal coherence (e.g., TimeGAN attained the lowest MMD: 1.84e-3, averaged over 5 seeds). Finally, we articulate practical guidelines for selecting generative models according to application needs and computational constraints. Our unified evaluation protocol and reproducible codebase aim to standardize benchmarking in synthetic financial data research.
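
To make two of the evaluation criteria named in the abstract concrete, the following is a minimal sketch of a squared-MMD estimate and a sample autocorrelation that can be applied to squared returns as a volatility-clustering check. It is not the authors' codebase: the Gaussian kernel, the bandwidth choice, the biased estimator, the toy GARCH(1,1) parameters, and the function names rbf_mmd2 and acf are all assumptions made for illustration.

```python
import numpy as np


def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD with a Gaussian kernel.

    x, y: arrays of shape (n,) or (n, d); each row is one sample.
    The kernel and bandwidth are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def kernel(a, b):
        # Pairwise squared Euclidean distances, clipped at 0 for round-off.
        sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def acf(series, lag):
    """Sample autocorrelation of a 1-D series at a positive integer lag."""
    s = np.asarray(series, dtype=float)
    s = s - s.mean()
    return float(np.dot(s[:-lag], s[lag:]) / np.dot(s, s))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000

    # Stand-in for "real" returns: a toy GARCH(1,1) path (hypothetical parameters).
    omega, alpha, beta = 1e-6, 0.10, 0.85
    real = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        real[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * real[t] ** 2 + beta * var

    # Stand-in for "synthetic" returns: i.i.d. Gaussian with matched scale.
    synth = rng.normal(0.0, real.std(), size=n)

    print("MMD^2 (real vs synthetic):", rbf_mmd2(real, synth, bandwidth=real.std()))
    print("ACF of squared returns, lag 1 (real):", acf(real**2, 1))
    print("ACF of squared returns, lag 1 (synthetic):", acf(synth**2, 1))
```

A lower MMD value indicates that the two samples are harder to tell apart under the chosen kernel, and a clearly positive autocorrelation of squared returns is the usual signature of volatility clustering. Both numbers depend on the bandwidth and lag chosen here, so they are not comparable to the figures reported in the paper.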

Suggested Citation

  • Christophe D. Hounwanou & Yae Ulrich Gaba & Pierre Ntakirutimana, 2025. "Synthetic Financial Data Generation for Enhanced Financial Modelling," Papers 2512.21791, arXiv.org.
  • Handle: RePEc:arx:papers:2512.21791

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.21791
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Weilong Fu & Ali Hirsa & Jorg Osterrieder, 2022. "Simulating financial time series using attention," Papers 2207.00493, arXiv.org.
    2. Yuki Tanaka & Ryuji Hashimoto & Takehiro Takayanagi & Zhe Piao & Yuri Murayama & Kiyoshi Izumi, 2025. "CoFinDiff: Controllable Financial Diffusion Model for Time Series Generation," Papers 2503.04164, arXiv.org.
    3. Xiaoyu Tan & Zili Zhang & Xuejun Zhao & Shuyi Wang, 2022. "DeepPricing: pricing convertible bonds based on financial time-series generative adversarial networks," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-38, December.
    4. Pieter Nel & Renee van Eyden, 2026. "From News to Noise: Does Media Sentiment Drive Stock Market Volatility?," Working Papers 202605, University of Pretoria, Department of Economics.
    5. Zou, Yongjie & Li, Honggang, 2014. "Time spans between price maxima and price minima in stock markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 395(C), pages 303-309.
    6. Yang, Xiaoqi & Vagnani, Gianluca & Dong, Yan & Ji, Xu, 2024. "Short selling and firms’ long-term stock return volatility: Evidence from Chinese concept stocks in Hong Kong," Finance Research Letters, Elsevier, vol. 70(C).
    7. Kei Nakagawa & Yusuke Uchiyama, 2020. "GO-GJRSK Model with Application to Higher Order Risk-Based Portfolio," Mathematics, MDPI, vol. 8(11), pages 1-12, November.
    8. Takaishi, Tetsuya, 2017. "Rational GARCH model: An empirical test for stock returns," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 473(C), pages 451-460.
    9. Vincenzo Candila, 2013. "A Comparison of the Forecasting Performances of Multivariate Volatility Models," Working Papers 3_228, Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno.
    10. Pierre J. Venter & Eben Maré, 2020. "GARCH Generated Volatility Indices of Bitcoin and CRIX," JRFM, MDPI, vol. 13(6), pages 1-15, June.
    11. Ñíguez, Trino-Manuel & Perote, Javier, 2017. "Moments expansion densities for quantifying financial risk," The North American Journal of Economics and Finance, Elsevier, vol. 42(C), pages 53-69.
    12. Takaishi, Tetsuya, 2025. "Multifractality and sample size influence on Bitcoin volatility patterns," Finance Research Letters, Elsevier, vol. 74(C).
    13. Krishnamurthy, Vikram & Leoff, Elisabeth & Sass, Jörn, 2018. "Filterbased stochastic volatility in continuous-time hidden Markov models," Econometrics and Statistics, Elsevier, vol. 6(C), pages 1-21.
    14. Kei Nakagawa & Masanori Hirano & Kentaro Minami & Takanobu Mizuta, 2024. "A Multi-agent Market Model Can Explain the Impact of AI Traders in Financial Markets -- A New Microfoundations of GARCH model," Papers 2409.12516, arXiv.org.
    15. Lux, Thomas & Morales-Arias, Leonardo & Sattarhoff, Cristina, 2011. "A Markov-switching multifractal approach to forecasting realized volatility," Kiel Working Papers 1737, Kiel Institute for the World Economy.
    16. Yudong Wang & Chongfeng Wu, 2013. "Efficiency of Crude Oil Futures Markets: New Evidence from Multifractal Detrending Moving Average Analysis," Computational Economics, Springer;Society for Computational Economics, vol. 42(4), pages 393-414, December.
    17. Paul Handro & Bogdan Dima, 2024. "Analyzing Financial Markets Efficiency: Insights from a Bibliometric and Content Review," Journal of Financial Studies, Institute of Financial Studies, vol. 16(9), pages 119-175, May.
    18. Edmond Lezmi & Jules Roche & Thierry Roncalli & Jiali Xu, 2020. "Improving the Robustness of Trading Strategy Backtesting with Boltzmann Machines and Generative Adversarial Networks," Papers 2007.04838, arXiv.org.
    19. Ardelean, Vlad & Pleier, Thomas, 2013. "Outliers & predicting time series: A comparative study," FAU Discussion Papers in Economics 05/2013, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    20. Mikhail Makushkin & Victor Lapshin, 2023. "Dynamic Nelson–Siegel model for market risk estimation of bonds: Practical implementation," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 69, pages 5-27.
