
Predicting movie success with machine learning techniques: ways to improve accuracy


  • Kyuhan Lee

    (Seoul National University)

  • Jinsoo Park

    (Seoul National University)

  • Iljoo Kim

    (Saint Joseph’s University)

  • Youngseok Choi

    (Brunel University)


Abstract Previous studies on predicting the box-office performance of a movie using machine learning techniques have shown practical levels of predictive accuracy. These works are technically and methodologically oriented, focusing mainly on which algorithms are better at predicting movie performance. However, the accuracy of a prediction model can also be improved by taking other perspectives, such as introducing unexplored features that might be related to the outcomes. In this paper, we examine multiple approaches to improving the performance of the prediction model. First, we develop and add a new feature derived from the theory of transmedia storytelling. Such theory-driven feature selection not only increases forecast accuracy but also enhances the interpretability of a prediction model. Second, we use an ensemble approach, which has rarely been adopted in research on predicting box-office performance. As a result, the proposed model, the Cinema Ensemble Model (CEM), outperforms the prediction models from past studies that use machine learning algorithms. We suggest that CEM can be used extensively by industry experts as a powerful tool for improving the decision-making process.
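
The abstract does not say how CEM combines its base learners, so the following is only a minimal sketch of one common ensemble scheme, majority voting, and should not be read as the authors' method. The three rule-based classifiers, the feature names (budget_musd, star_power, is_transmedia), the thresholds, and the "hit"/"flop" labels are all hypothetical, chosen purely to illustrate how a transmedia-derived feature could sit alongside conventional ones.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# A movie is a mapping from (hypothetical) feature names to numeric values.
Movie = Dict[str, float]
Classifier = Callable[[Movie], str]


def budget_rule(movie: Movie) -> str:
    # Hypothetical base learner: large budgets predict a hit.
    return "hit" if movie["budget_musd"] >= 100 else "flop"


def star_rule(movie: Movie) -> str:
    # Hypothetical base learner: strong star power predicts a hit.
    return "hit" if movie["star_power"] >= 0.5 else "flop"


def transmedia_rule(movie: Movie) -> str:
    # Hypothetical base learner built on a transmedia indicator
    # (1 if the story also exists in other media, else 0).
    return "hit" if movie["is_transmedia"] == 1 else "flop"


def majority_vote(classifiers: Sequence[Classifier], movie: Movie) -> str:
    # Collect one vote per base learner and return the most common label.
    votes = Counter(clf(movie) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [budget_rule, star_rule, transmedia_rule]
example = {"budget_musd": 40.0, "star_power": 0.7, "is_transmedia": 1}
prediction = majority_vote(ensemble, example)
print(prediction)
```

Using an odd number of base learners with two classes avoids ties; in practice the base learners would be trained models rather than fixed rules, and weighted or stacked combinations are equally plausible alternatives.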

Suggested Citation

  • Kyuhan Lee & Jinsoo Park & Iljoo Kim & Youngseok Choi, 0. "Predicting movie success with machine learning techniques: ways to improve accuracy," Information Systems Frontiers, Springer, vol. 0, pages 1-12.
  • Handle: RePEc:spr:infosf:v::y::i::d:10.1007_s10796-016-9689-z
    DOI: 10.1007/s10796-016-9689-z



