Printed from https://ideas.repec.org/p/bca/bocawp/23-53.html

Identifying Nascent High-Growth Firms Using Machine Learning

Authors

  • Stephanie Houle
  • Ryan Macdonald

Abstract

Predicting which firms will grow quickly and why has been the subject of research studies for many decades. Firms that grow rapidly have the potential to usher in new innovations, products or processes (Kogan et al. 2017), become superstar firms (Haltiwanger et al. 2013) and impact the aggregate labour share (Autor et al. 2020; De Loecker et al. 2020). We explore the use of supervised machine learning techniques to identify a population of nascent high-growth firms using Canadian administrative firm-level data. We apply a suite of supervised machine learning algorithms (elastic net model, random forest and neural net) to determine whether a large set of variables on Canadian firm tax filing financial and employment data, state variables (e.g., industry, geography) and indicators of firm complexity (e.g., multiple industrial activities, foreign ownership) can predict which firms will be high-growth firms over the next three years. The results suggest that the machine learning classifiers can select a sub-population of nascent high-growth firms that includes the majority of actual high-growth firms plus a group of firms that shared similar attributes but failed to attain high-growth status.
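
The selection result described in the abstract can be illustrated with a small sketch: a classifier assigns each firm a score, firms above a cut-off form the "nascent high-growth" sub-population, and that sub-population is compared with the firms that actually became high-growth. Everything here is hypothetical and for illustration only: the `Firm` record, the scores, the outcomes and the 0.5 threshold are assumptions, not data, parameters or code from the paper, and the scores stand in for the output of any of the classifiers named above.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    firm_id: str
    score: float        # hypothetical classifier score for "will be high-growth"
    high_growth: bool   # hypothetical realized outcome three years later


def select_nascent(firms, threshold):
    """Return the firms whose score meets or exceeds the threshold."""
    return [f for f in firms if f.score >= threshold]


def recall_and_precision(firms, threshold):
    """Share of actual high-growth firms captured, and share of selected firms that are high-growth."""
    selected = select_nascent(firms, threshold)
    actual = [f for f in firms if f.high_growth]
    hits = [f for f in selected if f.high_growth]
    recall = len(hits) / len(actual) if actual else 0.0
    precision = len(hits) / len(selected) if selected else 0.0
    return recall, precision


# Made-up example data (not from the paper).
firms = [
    Firm("A", 0.91, True),
    Firm("B", 0.74, True),
    Firm("C", 0.66, False),
    Firm("D", 0.58, True),
    Firm("E", 0.52, False),
    Firm("F", 0.31, True),
    Firm("G", 0.12, False),
    Firm("H", 0.05, False),
]

recall, precision = recall_and_precision(firms, threshold=0.5)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this toy data the selected group contains three of the four actual high-growth firms, plus two firms (C and E) that scored similarly but did not attain high-growth status, which mirrors the pattern the abstract reports: most actual high-growth firms are captured, alongside look-alike firms that did not grow.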

Suggested Citation

  • Stephanie Houle & Ryan Macdonald, 2023. "Identifying Nascent High-Growth Firms Using Machine Learning," Staff Working Papers 23-53, Bank of Canada.
  • Handle: RePEc:bca:bocawp:23-53

    Download full text from publisher

    File URL: https://www.bankofcanada.ca/wp-content/uploads/2023/10/swp2023-53.pdf
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    2. Aaron Chalfin & Oren Danieli & Andrew Hillis & Zubin Jelveh & Michael Luca & Jens Ludwig & Sendhil Mullainathan, 2016. "Productivity and Selection of Human Capital with Machine Learning," American Economic Review, American Economic Association, vol. 106(5), pages 124-127, May.
    3. MIYAKAWA Daisuke & MIYAUCHI Yuhei & Christian PEREZ, 2017. "Forecasting Firm Performance with Machine Learning: Evidence from Japanese firm-level data," Discussion papers 17068, Research Institute of Economy, Trade and Industry (RIETI).
    4. Leonid Kogan & Dimitris Papanikolaou & Amit Seru & Noah Stoffman, 2017. "Technological Innovation, Resource Allocation, and Growth," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 132(2), pages 665-712.
    5. Alex Coad & Stjepan Srhoj, 2020. "Catching Gazelles with a Lasso: Big data techniques for the prediction of high-growth firms," Small Business Economics, Springer, vol. 55(3), pages 541-565, October.
    6. Magnus Henrekson & Dan Johansson, 2010. "Gazelles as job creators: a survey and interpretation of the evidence," Small Business Economics, Springer, vol. 35(2), pages 227-244, September.
    7. Sven-Olov Daunfeldt & Daniel Halvarsson, 2015. "Are high-growth firms one-hit wonders? Evidence from Sweden," Small Business Economics, Springer, vol. 44(2), pages 361-383, February.
    8. Alex Coad & Sven-Olov Daunfeldt & Werner Hölzl & Dan Johansson & Paul Nightingale, 2014. "High-growth firms: introduction to the special section," Industrial and Corporate Change, Oxford University Press and the Associazione ICC, vol. 23(1), pages 91-112, February.
    9. Jan De Loecker & Jan Eeckhout & Gabriel Unger, 2020. "The Rise of Market Power and the Macroeconomic Implications," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 135(2), pages 561-644.
    10. Davis, Steven J & Haltiwanger, John & Schuh, Scott, 1996. "Small Business and Job Creation: Dissecting the Myth and Reassessing the Facts," Small Business Economics, Springer, vol. 8(4), pages 297-315, August.
    11. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).
    12. Orietta Marsili, 2001. "The Anatomy and Evolution of Industries," Books, Edward Elgar Publishing, number 2272.
    13. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    14. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    15. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Silviano Esteve-Pérez & Fabio Pieri & Diego Rodriguez, 2022. "One swallow does not make a summer: episodes and persistence in high growth," Small Business Economics, Springer, vol. 58(3), pages 1517-1544, March.
    2. Edward I. Altman & Marco Balzano & Alessandro Giannozzi & Stjepan Srhoj, 2023. "Revisiting SME default predictors: The Omega Score," Journal of Small Business Management, Taylor & Francis Journals, vol. 61(6), pages 2383-2417, November.
    3. Coad, Alex & Srhoj, Stjepan, 2023. "Entrepreneurial ecosystems and regional persistence of high growth firms: A ‘broken clock’ critique," Research Policy, Elsevier, vol. 52(6).
    4. Erhardt, Eva, 2017. "Who persistently creates jobs? Absolute versus relative high-growth firms," MPRA Paper 79307, University Library of Munich, Germany.
    5. Ari Hyytinen & Petri Rouvinen & Mika Pajarinen & Joosua Virtanen, 2023. "Ex Ante Predictability of Rapid Growth: A Design Science Approach," Entrepreneurship Theory and Practice, vol. 47(6), pages 2465-2493, November.
    6. Alex Coad & Stjepan Srhoj, 2020. "Catching Gazelles with a Lasso: Big data techniques for the prediction of high-growth firms," Small Business Economics, Springer, vol. 55(3), pages 541-565, October.
    7. Erhardt, Eva Christine, 2018. "Firm performance after high growth: A comparison of absolute and relative growth measures," MPRA Paper 88077, University Library of Munich, Germany.
    8. Alex Coad & Clemens Domnick & Florian Flachenecker & Peter Harasztosi & Mario Lorenzo Janiri & Rozalia Pal & Mercedes Teruel, 2022. "Capacity constraints as a trigger for high growth," Small Business Economics, Springer, vol. 59(3), pages 893-923, October.
    9. Falco J. Bargagli-Stoffi & Jan Niederreiter & Massimo Riccaboni, 2020. "Supervised learning for the prediction of firm dynamics," Papers 2009.06413, arXiv.org.
    10. Eva Christine Erhardt, 2022. "Prevalence and Persistence of High-Growth Entrepreneurship: Which Institutions Matter Most?," Journal of Industry, Competition and Trade, Springer, vol. 22(2), pages 297-332, June.
    11. Cristina Fernández & Roberta García & Paloma Lopez-Garcia & Benedicta Marzinotto & Roberta Serafini & Juuso Vanhala & Ladislav Wintr, 2017. "Firm growth in Europe: An overview based on the COMPNET labour module," BCL working papers 107, Central Bank of Luxembourg.
    12. Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
    13. Florian Léon, 2022. "The elusive quest for high-growth firms in Africa: when other metrics of performance say nothing," Small Business Economics, Springer, vol. 58(1), pages 225-246, January.
    14. Christopher Kath & Florian Ziel, 2018. "The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts," Papers 1811.08604, arXiv.org.
    15. Halko, Marja-Liisa & Lappalainen, Olli & Sääksvuori, Lauri, 2021. "Do non-choice data reveal economic preferences? Evidence from biometric data and compensation-scheme choice," Journal of Economic Behavior & Organization, Elsevier, vol. 188(C), pages 87-104.
    16. Suriyan Jomthanachai & Wai Peng Wong & Khai Wah Khaw, 2024. "An Application of Machine Learning to Logistics Performance Prediction: An Economics Attribute-Based of Collective Instance," Computational Economics, Springer;Society for Computational Economics, vol. 63(2), pages 741-792, February.
    17. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    18. Kath, Christopher & Ziel, Florian, 2018. "The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts," Energy Economics, Elsevier, vol. 76(C), pages 411-423.
    19. Daniele Moschella & Federico Tamagni & Xiaodan Yu, 2019. "Persistent high-growth firms in China’s manufacturing," Small Business Economics, Springer, vol. 52(3), pages 573-594, March.
    20. Diego F. Grijalva & Valeria Ayala & Paúl A. Ponce & Yelitza Pontón, 2018. "Does firm innovation lead to high growth? Evidence from Ecuadorian firms," Revista Cuadernos de Economia, Universidad Nacional de Colombia, FCE, CID, vol. 37(75), pages 697-726, May.

    More about this item

    Keywords

    Econometric and statistical methods; Firm dynamics

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • L25 - Industrial Organization - - Firm Objectives, Organization, and Behavior - - - Firm Performance
