
Are box office revenues equally unpredictable for all movies? Evidence from a Random forest-based model


  • Evgeny A. Antipov

    (National Research University Higher School of Economics)

  • Elena B. Pokryshevskaya

    (National Research University Higher School of Economics)


In this study we develop a model for forecasting early box office receipts that, in addition to traditionally used regressors, uses several inputs that have never been used before but proved to be very useful predictors according to our variable importance analysis. The new predictors account for the power of actors and directors, as well as for the intensity of competition at the time of a movie's release. Instead of the Motion Picture Association of America (MPAA) ratings commonly used in movie success prediction, textual information about the reasons for giving a movie its MPAA rating was formalized using word frequency and principal components analyses. The expert system is based on the Random forest algorithm, which outperformed a stepwise regression and a multilayer perceptron neural network. A regression tree-based diagnostic approach allowed us to detect heterogeneity in model accuracy across segments of the data and to assess the applicability of the model to different movie types.
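The idea behind a regression tree-based diagnostic can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the error measure, the tree settings, or the data, so the absolute percentage error (APE), the single-split "stump", the feature names `budget` and `screens`, and every number below are assumptions made up for illustration. The sketch searches for the one threshold that best separates movies whose forecasts were poor from movies whose forecasts were good.

```python
from statistics import mean

# Hypothetical hold-out forecasts. All values are invented for illustration;
# revenues and budgets are in $ millions.
movies = [
    {"budget": 5, "screens": 300, "actual": 2.0, "predicted": 4.0},
    {"budget": 8, "screens": 2500, "actual": 4.0, "predicted": 7.0},
    {"budget": 12, "screens": 800, "actual": 10.0, "predicted": 4.0},
    {"budget": 60, "screens": 600, "actual": 50.0, "predicted": 55.0},
    {"budget": 90, "screens": 3200, "actual": 80.0, "predicted": 72.0},
    {"budget": 150, "screens": 3800, "actual": 200.0, "predicted": 220.0},
]


def ape(movie):
    """Absolute percentage error of a single forecast."""
    return abs(movie["actual"] - movie["predicted"]) / movie["actual"]


def sse(errors):
    """Sum of squared deviations from the mean (regression-tree impurity)."""
    centre = mean(errors)
    return sum((e - centre) ** 2 for e in errors)


def best_split(rows, feature):
    """Return the best single threshold on `feature`, or None if it is constant."""
    values = sorted({row[feature] for row in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values,
    # so both segments are always non-empty.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        below = [ape(r) for r in rows if r[feature] <= threshold]
        above = [ape(r) for r in rows if r[feature] > threshold]
        total = sse(below) + sse(above)
        if best is None or total < best["sse"]:
            best = {
                "feature": feature,
                "threshold": threshold,
                "sse": total,
                "mean_ape_below": mean(below),
                "mean_ape_above": mean(above),
            }
    return best


def diagnose(rows, features):
    """Pick the feature and threshold that best explain differences in APE."""
    candidates = [best_split(rows, f) for f in features]
    return min((c for c in candidates if c is not None), key=lambda c: c["sse"])


result = diagnose(movies, ["budget", "screens"])
print(result)
```

On this toy data the split falls on `budget`, with a much higher mean APE below the threshold than above it. A full regression tree would apply the same search recursively within each segment, which is how such a diagnostic can flag the movie types for which a forecasting model is least reliable.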

Suggested Citation

  • Evgeny A. Antipov & Elena B. Pokryshevskaya, 2017. "Are box office revenues equally unpredictable for all movies? Evidence from a Random forest-based model," Journal of Revenue and Pricing Management, Palgrave Macmillan, vol. 16(3), pages 295-307, June.
  • Handle: RePEc:pal:jorapm:v:16:y:2017:i:3:d:10.1057_s41272-016-0072-y
    DOI: 10.1057/s41272-016-0072-y



