
Applying generative adversarial networks to generate synthetic train trip data for train delay prediction

Authors

  • Hauck, Florian
  • Güth, Albrecht
  • Kliewer, Natalia
  • Rößler-von Saß, David

Abstract

This paper examines the possibilities of creating synthetic train trip data with Generative Adversarial Networks (GANs). A real data set from Deutsche Bahn is enhanced with synthetic data created by using a Conditional Wasserstein Generative Adversarial Network (CWGAN). The synthetic data is analyzed and compared with the original data using statistical methods as well as machine learning models. The results show that the synthetic data is very similar to the original data in terms of data structure and dependencies, but at the same time contains enough noise that it does not simply copy already existing instances. To analyze and measure the quality of the synthetic data, different supervised machine learning models are trained to predict the change of delay of trains at a specific station based on the arrival delays of other trains at that station. These models are then each trained once using the real data and once using the real data enhanced by synthetic data. All models are evaluated using a test set containing only real data that was not used to train the models. The results show that the R² value of delay predictions increases significantly when using the enhanced data set. In particular, neural network-based models can benefit from the larger amount of input data. The proposed approach of generating synthetic train trip data with a CWGAN can also be applied to various other railway data analysis projects that require a large amount of input data. In addition, the presented approach is particularly interesting because, unlike most GAN approaches discussed in current literature, the data basis contains numerical data and not image data.
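
The evaluation protocol described in the abstract (train each model once on real data and once on real data enhanced by synthetic data, then score both on a held-out test set of real data only) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy data generator, the single feature, the one-variable least-squares model, and all function names (`fit_line`, `r2_score`, `evaluate`, `toy_trips`) are assumptions made for the example. In particular, `toy_trips` is only a placeholder for both the Deutsche Bahn data and the CWGAN output, so the printed numbers say nothing about the paper's results.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for any supervised model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(train, test):
    """Fit on `train`, report R^2 on `test`; both are lists of (x, y) pairs."""
    a, b = fit_line([x for x, _ in train], [y for _, y in train])
    preds = [a * x + b for x, _ in test]
    return r2_score([y for _, y in test], preds)


def toy_trips(n, rng):
    """Hypothetical toy rows: x = arrival delay of other trains, y = change of delay."""
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 30)
        y = 0.4 * x + rng.gauss(0, 3)
        rows.append((x, y))
    return rows


rng = random.Random(0)
real = toy_trips(200, rng)
real_train, real_test = real[:150], real[150:]  # test set: real data only, never trained on
synthetic = toy_trips(600, rng)  # placeholder for CWGAN samples (assumption)

r2_real = evaluate(real_train, real_test)
r2_enhanced = evaluate(real_train + synthetic, real_test)
print(f"R2, real only:        {r2_real:.3f}")
print(f"R2, real + synthetic: {r2_enhanced:.3f}")
```

The key design point is that `real_test` is split off before any augmentation, so synthetic rows can only ever enter the training set and both scores are measured against the same unseen real data.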

Suggested Citation

  • Hauck, Florian & Güth, Albrecht & Kliewer, Natalia & Rößler-von Saß, David, 2026. "Applying generative adversarial networks to generate synthetic train trip data for train delay prediction," Discussion Papers 2026/7, Free University Berlin, School of Business & Economics.
  • Handle: RePEc:zbw:fubsbe:338080
    DOI: 10.17169/refubium-51427

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/338080/1/1963615352.pdf
    Download Restriction: no

