IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2309.01472.html

FinDiff: Diffusion Models for Financial Tabular Data Generation

Authors

  • Timur Sattarov
  • Marco Schreyer
  • Damian Borth

Abstract

The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These restrictions often hinder academics and practitioners from conducting collaborative research effectively. The emergence of generative models, particularly diffusion models, capable of synthesizing data that mimics the underlying distributions of real-world data presents a compelling solution. This work introduces 'FinDiff', a diffusion model designed to generate synthetic financial tabular data that resembles real-world data for a variety of regulatory downstream tasks, such as economic scenario modeling, stress tests, and fraud detection. The model uses embedding encodings to model mixed-modality financial data comprising both categorical and numeric attributes. FinDiff is evaluated against state-of-the-art baseline models on three real-world financial datasets (two publicly available and one proprietary). Empirical results demonstrate that FinDiff generates synthetic tabular financial data with high fidelity, privacy, and utility.
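
The idea of embedding encodings for mixed-modality rows can be illustrated with a minimal sketch. This is not the authors' implementation: the toy table, the embedding size, the linear noise schedule, and the helper names `encode`, `forward_diffuse`, and `decode_category` are all assumptions made for illustration. The sketch maps a categorical attribute to a continuous vector, concatenates it with the numeric attributes, and applies the standard Gaussian forward (noising) step of a diffusion model to the result.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy table: one categorical column (3 categories), two numeric columns.
cat_col = np.array([0, 2, 1, 2])              # integer-coded categories
num_cols = np.array([[0.1, -1.2],
                     [0.4,  0.3],
                     [-0.7, 1.5],
                     [1.0, -0.2]])            # assumed to be standardized already

# Embedding table: one continuous vector per category.
# Random here for illustration; in a trained model these would be learned.
emb_dim = 2
embedding = rng.normal(size=(3, emb_dim))

def encode(cat, num):
    """Concatenate category embeddings with numeric attributes (one row per record)."""
    return np.concatenate([embedding[cat], num], axis=1)

def forward_diffuse(x0, t, alpha_bar):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    noise = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

def decode_category(x_emb):
    """Assumed decoding rule: pick the category whose embedding is nearest."""
    dists = np.linalg.norm(x_emb[:, None, :] - embedding[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Assumed linear noise schedule with T steps.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

x0 = encode(cat_col, num_cols)                     # continuous representation of the rows
xt, eps = forward_diffuse(x0, T - 1, alpha_bar)    # noised rows at the last step
```

In a full model, a neural network would be trained to predict `eps` from `xt` and the step index, and sampling would run the learned reverse process before decoding the embedding part of each generated row back to a category.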

Suggested Citation

  • Timur Sattarov & Marco Schreyer & Damian Borth, 2023. "FinDiff: Diffusion Models for Financial Tabular Data Generation," Papers 2309.01472, arXiv.org.
  • Handle: RePEc:arx:papers:2309.01472

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2309.01472
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Junyi Li & Xintong Wang & Yaoyang Lin & Arunesh Sinha & Michael P. Wellman, 2020. "Generating Realistic Stock Market Order Streams," Papers 2006.04212, arXiv.org.
    2. Hans Buehler & Blanka Horvath & Terry Lyons & Imanol Perez Arribas & Ben Wood, 2020. "A Data-driven Market Simulator for Small Data Environments," Papers 2006.14498, arXiv.org.
    3. Rizzato, Matteo & Wallart, Julien & Geissler, Christophe & Morizet, Nicolas & Boumlaik, Noureddine, 2023. "Generative Adversarial Networks applied to synthetic financial scenarios generation," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 623(C).
    4. Magnus Wiese & Lianjun Bai & Ben Wood & Hans Buehler, 2019. "Deep Hedging: Learning to Simulate Equity Option Markets," Papers 1911.01700, arXiv.org.
    5. Ruslan Tepelyan & Achintya Gopal, 2023. "Generative Machine Learning for Multivariate Equity Returns," Papers 2311.14735, arXiv.org.
    6. Florian Eckerli & Joerg Osterrieder, 2021. "Generative Adversarial Networks in finance: an overview," Papers 2106.06364, arXiv.org, revised Jul 2021.
