
Predicting match outcomes in football by an Ordered Forest estimator

In: A Modern Guide to Sports Economics

Authors

  • Daniel Goller
  • Michael C. Knaus
  • Michael Lechner
  • Gabriel Okasa

Abstract

Predicting the outcome of football (i.e. soccer) games based on past information is a non-standard predictive task, both because of the nature of the game outcome and because of the importance of uncertainty (luck and unobservables). The game outcome consists of the scores of the two teams, which are usually either collapsed into a goal difference or further aggregated to reflect whether the game ended as a win for the home team, a win for the away team, or a draw. From a statistical perspective, such outcomes have bounded support and, thus, standard linear modelling can be expected to perform poorly. The large amount of uncertainty in game outcomes, due to sheer luck or to game- or team-specific unobservables (e.g. hidden player injuries), makes it imperative to use prediction methods that fully exploit the available information and that also quantify the uncertainty of a match outcome. The latter is also relevant when interest lies not only in single games but also in the league table at the end of the season. To be useful guides on what to expect, such league tables should capture the uncertainty of the single games accumulated over a season. Recently, machine learning methods have shown their power in all sorts of prediction problems, in particular where the relation between the predictors and the prediction target, here the outcome of the game, is non-linear. However, so far there has been little development in gearing these methods explicitly towards estimating the probabilities of ordered outcomes, such as score differences and points, or just wins, draws, and losses. Lechner and Okasa (2019) propose adapting classical random forest estimation, which is known to have excellent predictive performance (e.g. Biau and Scornet (2016), Fernández-Delgado et al. (2014)), to the problem of predicting probabilities of ordered categorical outcomes, such as the win-draw-loss problem of a football game. In this chapter, we use their approach to predict game outcomes of the German Bundesliga 1 (BL1) based on more than ten years of data on game outcomes as well as extensive information about teams, their players, and their environment. These predictions are then used to obtain the final season rankings in a way that reflects and shows the magnitude of the inherent uncertainty of football games.
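
The abstract does not spell out the mechanics, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that ordered class probabilities are obtained by differencing estimates of the cumulative probabilities P(Y <= k) (which could come from any estimator, for example one regression forest per threshold), and that league-table uncertainty is explored by simulating a season from the per-match probabilities. The function names, the fixture format, and the random tie-break are hypothetical choices made for illustration.

```python
import random
from collections import Counter

def class_probs_from_cumulative(cum):
    """Turn estimates of P(Y <= k), k = 1..M-1, into M class probabilities.

    Assumption: classes are ordered (away win < draw < home win). The
    cumulative estimates may come from separate models and need not be
    monotone, so negative differences are clipped to zero and the result
    is renormalised.
    """
    bounds = [0.0] + list(cum) + [1.0]
    raw = [max(hi - lo, 0.0) for lo, hi in zip(bounds, bounds[1:])]
    total = sum(raw)  # at least 1, because clipping can only raise the sum
    return [p / total for p in raw]

def simulate_table(fixtures, n_sims=10_000, seed=0):
    """Simulate final rankings from per-match outcome probabilities.

    fixtures: list of (home, away, (p_away_win, p_draw, p_home_win)).
    Returns {team: {rank: share of simulations finishing at that rank}}.
    """
    rng = random.Random(seed)
    teams = sorted({t for home, away, _ in fixtures for t in (home, away)})
    rank_counts = {t: Counter() for t in teams}
    for _ in range(n_sims):
        points = dict.fromkeys(teams, 0)
        for home, away, probs in fixtures:
            outcome = rng.choices(("A", "D", "H"), weights=probs)[0]
            if outcome == "H":
                points[home] += 3      # three points for a win
            elif outcome == "A":
                points[away] += 3
            else:
                points[home] += 1      # one point each for a draw
                points[away] += 1
        # Simplification: ties on points are broken at random, whereas
        # real leagues use goal difference and further criteria.
        order = sorted(teams, key=lambda t: (points[t], rng.random()),
                       reverse=True)
        for rank, team in enumerate(order, start=1):
            rank_counts[team][rank] += 1
    return {t: {r: c / n_sims for r, c in sorted(rank_counts[t].items())}
            for t in teams}
```

In such a sketch, each team's simulated rank distribution, rather than a single predicted rank, is what would convey how much uncertainty accumulates over a season.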

Suggested Citation

  • Daniel Goller & Michael C. Knaus & Michael Lechner & Gabriel Okasa, 2021. "Predicting match outcomes in football by an Ordered Forest estimator," Chapters, in: Ruud H. Koning & Stefan Kesenne (ed.), A Modern Guide to Sports Economics, chapter 22, pages 335-355, Edward Elgar Publishing.
  • Handle: RePEc:elg:eechap:19238_22

    Download full text from publisher

    File URL: https://www.elgaronline.com/view/edcoll/9781789906523/9781789906523.00026.xml
    Download Restriction: no

    Cited by:

    1. (citing item is not listed on IDEAS)
    2. Michael Lechner & Gabriel Okasa, 2025. "Random Forest estimation of the ordered choice model," Empirical Economics, Springer, vol. 68(1), pages 1-106, January.
    3. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.

    More about this item

    JEL classification:

    • Z29 - Other Special Topics - - Sports Economics - - - Other
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
