Printed from https://ideas.repec.org/a/bla/stratm/v44y2023i7p1780-1802.html

Using supervised machine learning to scale human‐coded data: A method and dataset in the board leadership context

Author

Listed:
  • Joseph S. Harrison
  • Matthew A. Josefy
  • Matias Kalm
  • Ryan Krause

Abstract

Research Summary: Human coding of unstructured text can enable scholars to measure complex latent constructs for use in empirical analysis, but also requires substantial time and resources that limit the number and sample sizes of studies using this approach. We demonstrate how supervised machine learning (ML) can overcome these constraints by allowing scholars to scale human‐coded data. Using board leadership as an illustrative context, we apply this method to create a large‐scale dataset (N = 22,388) from smaller scale human codings of CEO duality and board chair orientations from company proxy statements. We further demonstrate the potential value of this approach by using the resulting dataset to examine the relationships among board leadership, firm performance, and CEO dismissal. The ML code and dataset are available at 10.5281/zenodo.7304697.

Managerial Summary: Manually converting unstructured text into usable data requires considerable time and resources. This article outlines a replicable process for applying supervised machine learning (ML) to overcome these constraints by scaling manually coded data. While ML is often used to identify patterns or predict relationships within a given dataset, we show how scholars and practitioners can build valuable custom algorithms at an earlier stage in the process, when first building a dataset. We illustrate this approach by training ML algorithms to replicate human codings of CEO duality and board chair control and collaboration orientations from over 22,000 company filings. We then show how this approach can support new knowledge development by using these data to explore the relationships among board leadership, company performance, and CEO dismissal.
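The general idea of scaling human-coded data can be illustrated with a minimal sketch. This is not the authors' code (theirs is at the Zenodo DOI above), and this record does not say which algorithms or text features the article uses. The sketch assumes a toy multinomial naive Bayes classifier written in plain Python. The text snippets, the labels (1 = CEO also chairs the board, 0 = roles separate), and the class name `NaiveBayesTextClassifier` are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical, deliberately simple tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Toy multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in the coded sample
                count = self.word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


# Step 1: a small human-coded sample (invented snippets, invented labels).
coded_texts = [
    "mr smith serves as both chairman and chief executive officer",
    "our chief executive officer also serves as chairman of the board",
    "the board has an independent chairman separate from the chief executive officer",
    "the roles of chairman and chief executive officer are held by separate individuals",
]
coded_labels = [1, 1, 0, 0]

# Step 2: train on the human codings.
model = NaiveBayesTextClassifier().fit(coded_texts, coded_labels)

# Step 3: apply the trained model to text that no human has coded.
uncoded_texts = [
    "ms jones serves as both chairman and chief executive officer",
    "an independent chairman leads the board separate from management",
]
predicted = model.predict(uncoded_texts)
print(predicted)
```

In practice, part of the human-coded sample would be held out to check how closely the model's predictions match the human codings before it is applied to the full corpus.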

Suggested Citation

  • Joseph S. Harrison & Matthew A. Josefy & Matias Kalm & Ryan Krause, 2023. "Using supervised machine learning to scale human‐coded data: A method and dataset in the board leadership context," Strategic Management Journal, Wiley Blackwell, vol. 44(7), pages 1780-1802, July.
  • Handle: RePEc:bla:stratm:v:44:y:2023:i:7:p:1780-1802
    DOI: 10.1002/smj.3480

    Download full text from publisher

    File URL: https://doi.org/10.1002/smj.3480
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anup Banerjee & Mattias Nordqvist & Karin Hellerstedt, 2020. "The role of the board chair—A literature review and suggestions for future research," Corporate Governance: An International Review, Wiley Blackwell, vol. 28(6), pages 372-405, November.
    2. Joseph Raffiee & Daniel Fehder & Florenta Teodoridis, 2022. "Revealing the revealed preferences of public firm CEOs and top executives: A new database from credit card spending," Strategic Management Journal, Wiley Blackwell, vol. 43(10), pages 2042-2065, October.
    3. Milan Miric & Nan Jia & Kenneth G. Huang, 2023. "Using supervised machine learning for large‐scale classification in management research: The case for identifying artificial intelligence patents," Strategic Management Journal, Wiley Blackwell, vol. 44(2), pages 491-519, February.
    4. Arslan-Ayaydin, Özgür & Bishara, Norman & Thewissen, James & Torsin, Wouter, 2020. "Managerial career concerns and the content of corporate disclosures: An analysis of the tone of earnings press releases," International Review of Financial Analysis, Elsevier, vol. 72(C).
    5. Li, Weiwen & Lu, Yuan & Makino, Shige & Lau, Chung-Ming, 2017. "National power distance, status incongruence, and CEO dismissal," Journal of World Business, Elsevier, vol. 52(6), pages 809-818.
    6. Constance E. Helfat & Aseem Kaul & David J. Ketchen & Jay B. Barney & Olivier Chatain & Harbir Singh, 2023. "Renewing the resource‐based view: New contexts, new concepts, and new methods," Strategic Management Journal, Wiley Blackwell, vol. 44(6), pages 1357-1390, June.
    7. Lu, Yun & Ntim, Collins G. & Zhang, Qingjing & Li, Pingli, 2022. "Board of directors’ attributes and corporate outcomes: A systematic literature review and future research agenda," International Review of Financial Analysis, Elsevier, vol. 84(C).
    8. Fang Shuai, 2019. "Homophily Exclusion or Homophily Preference? The Influence of the Executive Identity of Nonexecutive Directors on the Focal Firm Executive Pay and Ordinary Employee Pay," Journal of Systems Science and Information, De Gruyter, vol. 7(6), pages 550-567, December.
    9. Hongjin Zhu & Yue Pan & Jiaping Qiu & Jinli Xiao, 2022. "Hometown Ties and Favoritism in Chinese Corporations: Evidence from CEO Dismissals and Corporate Social Responsibility," Journal of Business Ethics, Springer, vol. 176(2), pages 283-310, March.
    10. Qian Wang & Duowen Wu & Lina Yan, 2021. "Effect of positive tone in MD&A disclosure on capital structure adjustment speed: evidence from China," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 61(4), pages 5809-5845, December.
    11. Liu, Pu & Nguyen, Hazel T., 2020. "CEO characteristics and tone at the top inconsistency," Journal of Economics and Business, Elsevier, vol. 108(C).
    12. Joseph Raffiee, 2017. "Employee Mobility and Interfirm Relationship Transfer: Evidence from the Mobility and Client Attachments of United States Federal Lobbyists, 1998–2014," Strategic Management Journal, Wiley Blackwell, vol. 38(10), pages 2019-2040, October.
    13. Zhang, Yameng & Sharma, Piyush & Xu, Yekun & Zhan, Wu, 2021. "Challenges in internationalization of R&D teams: Impact of foreign technocrats in top management teams on firm innovations," Journal of Business Research, Elsevier, vol. 128(C), pages 728-741.
    14. Jeong, Nara & Kim, Nari & Arthurs, Jonathan D., 2021. "The CEO’s tenure life cycle, corporate social responsibility and the moderating role of the CEO’s political orientation," Journal of Business Research, Elsevier, vol. 137(C), pages 464-474.
    15. Özgür Arslan‐Ayaydin & James Thewissen & Wouter Torsin, 2021. "Disclosure tone management and labor unions," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 48(1-2), pages 102-147, January.
    16. Yang, Guang & Huang, Ruixian & Shi, Yukun & Jia, Zhehao, 2021. "Does a CEO's private reputation impede corporate governance?," Economic Modelling, Elsevier, vol. 104(C).
    17. Steffen Nauhaus & Johannes Luger & Sebastian Raisch, 2021. "Strategic Decision Making in the Digital Age: Expert Sentiment and Corporate Capital Allocation," Journal of Management Studies, Wiley Blackwell, vol. 58(7), pages 1933-1961, November.
    18. Shuangyan Li & Guangrui Wang & Yongli Luo, 2022. "Tone of language, financial disclosure, and earnings management: a textual analysis of form 20-F," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-24, December.
    19. Majid Majzoubi & Eric Yanfei Zhao, 2023. "Going beyond optimal distinctiveness: Strategic positioning for gaining an audience composition premium," Strategic Management Journal, Wiley Blackwell, vol. 44(3), pages 737-777, March.
    20. Jia, Ming & Ruan, Hongfei & Zhang, Zhe, 2017. "How rumors fly," Journal of Business Research, Elsevier, vol. 72(C), pages 33-45.
