Printed from https://ideas.repec.org/p/ven/wpaper/202611.html

Machine Learning techniques for synthetic data generation in Energy and Financial Markets

Authors

  • Oleksandr Castello (Ca’ Foscari University of Venice)
  • Marco Corazza (Ca’ Foscari University of Venice)

Abstract

The availability of sufficiently large, reliable, and high-quality datasets represents a fundamental prerequisite for quantitative analysis and data-driven decision-making in economics and finance. In practice, however, financial data are often limited, noisy, or subject to restricted access, creating significant empirical constraints for both researchers and practitioners. Recent advances in Generative Machine Learning (GenML) provide promising tools to overcome these limitations by enabling the generation of synthetic data capable of preserving the main statistical features of original data. Despite the rapid diffusion of these techniques, most existing studies focus on replicating stylized facts of financial time series or producing forward-looking simulations, while less attention has been devoted to a systematic assessment of the generative fidelity and generalization capacity of alternative models across different distributional environments. Motivated by this gap, this study provides a comparative evaluation of several Deep Generative Machine Learning (Deep-GenML) families by assessing their ability to reproduce both theoretical statistical distributions and empirical financial and commodity market data. The analysis spans multiple Deep-GenML architectures, distributional settings and market regimes, while also examining model performance under alternative training configurations that reflect varying degrees of data availability. The empirical evidence indicates that deep generative models are capable of accurately reproducing complex distributional features—including heavy tails, asymmetry, and multimodality—across a wide range of scenarios. Overall, the results highlight the potential of deep generative approaches as flexible tools for synthetic data generation and distributional modeling in financial and energy market applications.
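As a rough illustration of what assessing "generative fidelity" can involve, the sketch below compares a reference sample with two candidate synthetic samples using a two-sample Kolmogorov-Smirnov statistic and sample skewness and excess kurtosis. It is not the authors' evaluation protocol: the abstract does not name the metrics used, so the choice of metrics, the helper names (`ks_statistic`, `skew_and_excess_kurtosis`, `student_t`), the sample sizes, and the Student-t and Gaussian stand-ins for real and generated data are all assumptions made for this example.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The gap between two step functions can only change at observed points.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def skew_and_excess_kurtosis(xs):
    """Moment-based sample skewness and excess kurtosis (zero for a Gaussian)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def student_t(rng, df):
    """Draw from Student's t with integer df as normal / sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)


rng = random.Random(0)
# Hypothetical stand-ins: heavy-tailed "observed" data and two "generators".
reference = [student_t(rng, 5) for _ in range(4000)]
faithful = [student_t(rng, 5) for _ in range(4000)]    # same law as the reference
too_thin = [rng.gauss(0.0, 1.0) for _ in range(4000)]  # misses the heavy tails

for name, sample in [("faithful", faithful), ("too_thin", too_thin)]:
    ks = ks_statistic(reference, sample)
    skew, kurt = skew_and_excess_kurtosis(sample)
    print(f"{name}: KS={ks:.4f} skew={skew:.3f} excess_kurtosis={kurt:.3f}")
```

A single summary statistic can hide tail mismatches, which is why the sketch reports the higher moments alongside the KS distance. A fuller comparison would likely add tail-quantile and multimodality checks.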

Suggested Citation

  • Oleksandr Castello & Marco Corazza, 2026. "Machine Learning techniques for synthetic data generation in Energy and Financial Markets," Working Papers 2026: 11, Department of Economics, University of Venice "Ca' Foscari".
  • Handle: RePEc:ven:wpaper:2026:11

    Download full text from publisher

    File URL: https://www.unive.it/web/fileadmin/user_upload/dipartimenti/DEC/doc/Pubblicazioni_scientifiche/working_papers/2026/WP_DSE_castello_corazza_11_26.pdf
    File Function: First version, anno
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Erica Espinosa & Alvaro Figueira, 2023. "On the Quality of Synthetic Generated Tabular Data," Mathematics, MDPI, vol. 11(15), pages 1-18, July.
    2. Miles V. Bimrose & Tianxiang Hu & Davis J. McGregor & Jiongxin Wang & Sameh Tawfick & Chenhui Shao & Zuozhu Liu & William P. King, 2025. "Detecting and classifying hidden defects in additively manufactured parts using deep learning and X-ray computed tomography," Journal of Intelligent Manufacturing, Springer, vol. 36(5), pages 3465-3479, June.
    3. Cui, Jinxin & Maghyereh, Aktham, 2025. "Examining perceived spillovers among climate risk, fossil fuel, renewable energy, and carbon markets: A higher-order moment and quantile analysis," Journal of Commodity Markets, Elsevier, vol. 38(C).
    4. Isack Farady & Chih-Yang Lin & Ming-Ching Chang, 2024. "PreAugNet: improve data augmentation for industrial defect classification with small-scale training data," Journal of Intelligent Manufacturing, Springer, vol. 35(3), pages 1233-1246, March.
    5. Ansari Saleh Ahmar & Eva Boj del Val, 2026. "HybridSutte Technology for Economic Policy: Innovation in Import Forecasting and Trade Management in Emerging Markets," SN Operations Research Forum, Springer, vol. 7(2), pages 1-27, June.
    6. Shouhong Chen & Zhentao Huang & Tao Wang & Xingna Hou & Jun Ma, 2025. "Wafer map defect recognition based on multi-scale feature fusion and attention spatial pyramid pooling," Journal of Intelligent Manufacturing, Springer, vol. 36(1), pages 271-284, January.
    7. Guo, Cong & Jiang, Yaoqin & Yang, Yitong & Yuan, Zhilu & Guo, Renzhong, 2025. "Bringing realism: Enhancing high-dimensional data for active behavior analysis in older adults," Journal of Transport Geography, Elsevier, vol. 129(C).
    8. Li Wei & Mahmud Iwan Solihin & Sarah ‘Atifah Saruchi & Winda Astuti & Lim Wei Hong & Ang Chun Kit, 2024. "Surface Defects Detection of Cylindrical High-Precision Industrial Parts Based on Deep Learning Algorithms: A Review," SN Operations Research Forum, Springer, vol. 5(3), pages 1-71, September.
    9. Wang, Jia & Cao, Yuan & Xiong, Xiong, 2025. "Multiscale dependence and risk contagion between European carbon market, energy, and financial markets," Energy, Elsevier, vol. 335(C).
    10. Chen, Zhiqiang & Li, Jianbin & Cheng, Long & Liu, Xiufeng, 2023. "Federated-WDCGAN: A federated smart meter data sharing framework for privacy preservation," Applied Energy, Elsevier, vol. 334(C).
    11. Hayrullah Urcan & Emine Cengil & Murat Canayaz, 2025. "Comparative analysis of TGAN and other GAN models for synthetic earthquake data: a case study with data from Türkiye," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer; International Society for the Prevention and Mitigation of Natural Hazards, vol. 121(16), pages 19239-19259, September.
    12. Yu Gong & Xiaoqiao Wang & Chichun Zhou & Maogen Ge & Conghu Liu & Xi Zhang, 2025. "Human–machine knowledge hybrid augmentation method for surface defect detection based few-data learning," Journal of Intelligent Manufacturing, Springer, vol. 36(3), pages 1723-1742, March.
    13. Songling Huang & Lisha Peng & Hongyu Sun & Shisong Li, 2023. "Deep Learning for Magnetic Flux Leakage Detection and Evaluation of Oil & Gas Pipelines: A Review," Energies, MDPI, vol. 16(3), pages 1-27, January.
    14. Ansari Saleh Ahmar, 2026. "Financial Innovation in Time Series Forecasting: HybridSutte’s Enhanced Predictive Performance for Indonesia’s Import Values," SN Operations Research Forum, Springer, vol. 7(1), pages 1-31, March.
    15. Haotian Zhang & Stuart Dereck Semujju & Zhicheng Wang & Xianwei Lv & Kang Xu & Liang Wu & Ye Jia & Jing Wu & Wensheng Liang & Ruiyan Zhuang & Zhuo Long & Ruijun Ma & Xiaoguang Ma, 2026. "Large scale foundation models for intelligent manufacturing applications: a survey," Journal of Intelligent Manufacturing, Springer, vol. 37(1), pages 119-170, January.
    16. Zhan, Lei & Li, Guannan & Xu, Chengliang & Ren, Haoshan & Sun, Yongjun, 2025. "Experience knowledge decomposition – Data generation: Enhanced multi-step short-term cooling load predictions in data centres with data shortage issues," Energy, Elsevier, vol. 328(C).
    17. Changyun Wei & Yuhang Bao & Chengwei Zheng & Ze Ji, 2026. "AMFNet: aggregated multi-level feature interaction fusion network for defect detection on steel surfaces," Journal of Intelligent Manufacturing, Springer, vol. 37(4), pages 1615-1632, April.
    18. Ruining Tang & Zhenyu Liu & Yiguo Song & Guifang Duan & Jianrong Tan, 2024. "Hierarchical multi-scale network for cross-scale visual defect detection," Journal of Intelligent Manufacturing, Springer, vol. 35(3), pages 1141-1157, March.

    More about this item

    JEL classification:

    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C46 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Specific Distributions
    • C58 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Financial Econometrics
    • C63 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Computational Techniques
