
A Multidisciplinary Perspective on Publicly Available Sports Data in the Era of Big Data: A Scoping Review of the Literature on Major League Baseball

Author

Listed:
  • Jyh-How Huang
  • Yu-Chia Hsu

Abstract

Sports big data has emerged as a research area in recent years. The purpose of this study was to identify the most frequent research topics, application areas, data sources, and data usage characteristics in the existing literature, in order to understand the development of data-driven baseball research and multidisciplinary participation in the big data era. A scoping review was conducted, focusing on the diverse uses of publicly available Major League Baseball data. Next, bibliometric co-occurrence analysis was used to present a knowledge map of the reviewed literature. Finally, we propose a comprehensive baseball data research domain framework to visualize the ecosystem of publicly available sports data applications, mapped to the four application domains in the big data maturity model. After a search and screening process covering the Web of Science, ScienceDirect, and SPORTDiscus databases, 48 relevant papers that clearly indicated their data sources and data fields were selected and reviewed in full for further analysis. The most prominent research hotspots for sports data are, in order, economics and finance, sports injury, and sports performance evaluation. Subjects studied included pitchers, position players, catchers, umpires, batters, free agents, and attendees. The most popular data sources are PITCHf/x, the Lahman Baseball Database, and baseball-reference.com. This review can serve as a valuable starting point for researchers to plan research strategies, to discover opportunities for cross-disciplinary research innovations, and to situate their work within the current state of research.
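
The co-occurrence analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure or data: the keyword lists and the `cooccurrence` helper below are hypothetical, chosen only to show how pairs of terms appearing in the same paper are counted before being drawn as a knowledge map.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per paper (illustrative only, not the review's data).
papers = [
    ["pitchf/x", "umpire", "strike zone"],
    ["pitchf/x", "pitcher", "injury"],
    ["salary", "free agent", "pitcher"],
    ["pitchf/x", "pitcher", "strike zone"],
]


def cooccurrence(keyword_lists):
    """Count how many papers mention each unordered pair of keywords."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so every pair has a single canonical order.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence(papers)

# List the pairs from most to least frequent.
for (a, b), n in pair_counts.most_common():
    print(f"{a} -- {b}: {n}")
```

In a mapping tool, each keyword would become a node and each pair count an edge weight; the counts here are the raw input to that step, under the assumptions stated above.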

Suggested Citation

  • Jyh-How Huang & Yu-Chia Hsu, 2021. "A Multidisciplinary Perspective on Publicly Available Sports Data in the Era of Big Data: A Scoping Review of the Literature on Major League Baseball," SAGE Open, vol. 11(4), pages 21582440211, November.
  • Handle: RePEc:sae:sagope:v:11:y:2021:i:4:p:21582440211061566
    DOI: 10.1177/21582440211061566

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/21582440211061566
    Download Restriction: no


    References listed on IDEAS

    1. Jyhhow Huang & Hwai-Jung Hsu, 2020. "Approximating strike zone size and shape for baseball umpires under different conditions," International Journal of Performance Analysis in Sport, Taylor & Francis Journals, vol. 20(2), pages 133-149, March.
    2. Chih-Cheng Chen & Yung-Tan Lee & Chung-Ming Tsai, 2014. "Professional Baseball Team Starting Pitcher Selection Using AHP and TOPSIS Methods," International Journal of Performance Analysis in Sport, Taylor & Francis Journals, vol. 14(2), pages 545-563, August.
    3. Giovanni Abramo & Ciriaco Andrea D’Angelo & Flavia Costa, 2018. "The effect of multidisciplinary collaborations on research diversification," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 423-433, July.
    4. Bodvarsson, Örn B. & Papps, Kerry L. & Sessions, John G., 2014. "Cross-assignment discrimination in pay: A test case of major league baseball," Labour Economics, Elsevier, vol. 28(C), pages 84-95.
    5. Ana Viseu, 2015. "Integration of social science into research is crucial," Nature, Nature, vol. 525(7569), pages 291-291, September.
    6. Jim Downey & Joseph McGarrity, 2019. "Pressure and the ability to randomize decision-making: The case of the pickoff play in Major League Baseball," Atlantic Economic Journal, Springer;International Atlantic Economic Society, vol. 47(3), pages 261-274, September.
    7. Mills, Brian M. & Salaga, Steven, 2018. "A natural experiment for efficient markets: Information quality and influential agents," Journal of Financial Markets, Elsevier, vol. 40(C), pages 23-39.
    8. Nees Jan van Eck & Ludo Waltman, 2010. "Software survey: VOSviewer, a computer program for bibliometric mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(2), pages 523-538, August.
    9. David Moher & Alessandro Liberati & Jennifer Tetzlaff & Douglas G Altman & The PRISMA Group, 2009. "Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement," PLOS Medicine, Public Library of Science, vol. 6(7), pages 1-6, July.
    10. Deshpande Sameer K. & Wyner Abraham, 2017. "A hierarchical Bayesian model of pitch framing," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 13(3), pages 95-112, September.
    11. Lewis, Herbert F. & Mallikarjun, Sreekanth & Sexton, Thomas R., 2013. "Unoriented two-stage DEA: The case of the oscillating intermediate products," European Journal of Operational Research, Elsevier, vol. 229(2), pages 529-539.
    12. Depken II, Craig A., 2000. "Wage disparity and team productivity: evidence from major league baseball," Economics Letters, Elsevier, vol. 67(1), pages 87-92, April.
    13. Kappe, Eelco & Stadler Blank, Ashley & DeSarbo, Wayne S., 2018. "A random coefficients mixture hidden Markov model for marketing research," International Journal of Research in Marketing, Elsevier, vol. 35(3), pages 415-431.
    14. Fan, Qingliang & Wang, Ting, 2018. "Game day effect on stock market: Evidence from four major sports leagues in US," Journal of Behavioral and Experimental Finance, Elsevier, vol. 20(C), pages 9-18.
    15. Garcia, Stephen M. & Arora, Poonam & Reese, Zachary A. & Shain, Michael J., 2020. "Free agency and organizational rankings: A social comparison perspective on signaling theory," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 89(C).
    16. Devansh Patel & Dhwanil Shah & Manan Shah, 2020. "The Intertwine of Brain and Body: A Quantitative Analysis on How Big Data Influences the System of Sports," Annals of Data Science, Springer, vol. 7(1), pages 1-16, March.
    17. César Soto-Valero & Mabel González-Castellanos & Irvin Pérez-Morales, 2017. "A predictive model for analysing the starting pitchers’ performance using time series classification methods," International Journal of Performance Analysis in Sport, Taylor & Francis Journals, vol. 17(4), pages 492-509, July.
    18. Baumer Benjamin S. & Jensen Shane T. & Matthews Gregory J., 2015. "openWAR: An open source system for evaluating overall player performance in major league baseball," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 11(2), pages 69-84, June.
    19. Lawrence M. Kahn, 2000. "The Sports Business as a Labor Market Laboratory," Journal of Economic Perspectives, American Economic Association, vol. 14(3), pages 75-94, Summer.
    20. Bradbury, John Charles, 2017. "Monopsony and competition: The impact of rival leagues on player salaries during the early days of baseball," Explorations in Economic History, Elsevier, vol. 65(C), pages 55-67.
    21. Vock David Michael & Vock Laura Frances Boehm, 2018. "Estimating the effect of plate discipline using a causal inference framework: an application of the G-computation algorithm," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 14(2), pages 37-56, June.
    Full references (including those not matched with items on IDEAS)

    Citations



    Cited by:

    1. Carlo Dindorf & Eva Bartaguiz & Freya Gassmann & Michael Fröhlich, 2022. "Conceptual Structure and Current Trends in Artificial Intelligence, Machine Learning, and Deep Learning Research in Sports: A Bibliometric Review," IJERPH, MDPI, vol. 20(1), pages 1-23, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Formolli, M. & Kleiven, T. & Lobaccaro, G., 2023. "Assessing solar energy accessibility at high latitudes: A systematic review of urban spatial domains, metrics, and parameters," Renewable and Sustainable Energy Reviews, Elsevier, vol. 177(C).
    2. Egon Franck & Stephan Nüesch, 2007. "Wage Dispersion and Team Performance - An Empirical Panel Analysis," Working Papers 0017, University of Zurich, Center for Research in Sports Administration (CRSA).
    3. Carlo Bellavite Pellegrini & Raul Caruso & Marco Di Domizio, 2021. "Relative wages, payroll structure and performance in soccer. Evidence from Italian Serie A (2007-2019)," DISCE - Quaderni del Dipartimento di Politica Economica dipe0015, Università Cattolica del Sacro Cuore, Dipartimenti e Istituti di Scienze Economiche (DISCE).
    4. Nirojan JASINTHA, 2023. "What Is Known And Unknown: A Bibliometric Analysis Of Organizational Politics," Management Research and Practice, Research Centre in Public Administration and Public Services, Bucharest, Romania, vol. 15(2), pages 5-16, June.
    5. Agnieszka Konys, 2019. "Green Supplier Selection Criteria: From a Literature Review to a Comprehensive Knowledge Base," Sustainability, MDPI, vol. 11(15), pages 1-41, August.
    6. Emilio Rossi & Erminia Attaianese, 2023. "Research Synergies between Sustainability and Human-Centered Design: A Systematic Literature Review," Sustainability, MDPI, vol. 15(17), pages 1-19, August.
    7. Habib Sadri & Ibrahim Yitmen & Lavinia Chiara Tagliabue & Florian Westphal & Algan Tezel & Afshin Taheri & Goran Sibenik, 2023. "Integration of Blockchain and Digital Twins in the Smart Built Environment Adopting Disruptive Technologies—A Systematic Review," Sustainability, MDPI, vol. 15(4), pages 1-46, February.
    8. Bar-Eli, Michael & Krumer, Alex & Morgulev, Elia, 2020. "Ask not what economics can do for sports - Ask what sports can do for economics," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 89(C).
    9. Dorsa Alipour & Hussein Dia, 2023. "A Systematic Review of the Role of Land Use, Transport, and Energy-Environment Integration in Shaping Sustainable Cities," Sustainability, MDPI, vol. 15(8), pages 1-29, April.
    10. Małgorzata Krzywonos & Zdzisława Romanowska-Duda & Przemysław Seruga & Beata Messyasz & Stanisław Mec, 2023. "The Use of Plants from the Lemnaceae Family for Biofuel Production—A Bibliometric and In-Depth Content Analysis," Energies, MDPI, vol. 16(4), pages 1-24, February.
    11. Alessandro Bucciol & Nicolai J Foss & Marco Piovesan, 2014. "Pay Dispersion and Performance in Teams," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-16, November.
    12. Agnieszka Konys, 2019. "Towards Sustainable Entrepreneurship Holistic Construct," Sustainability, MDPI, vol. 11(23), pages 1-33, November.
    13. Karel Janda & Eva Michalikova & Luiz Célio Souza Rocha & Paulo Rotella Junior & Barbora Schererova & David Zilberman, 2022. "Review of the Impact of Biofuels on U.S. Retail Gasoline Prices," Energies, MDPI, vol. 16(1), pages 1-21, December.
    14. Cheolbeom Park, 2023. "Optimal salary inequality for team performance: evidence from National Football League data," Applied Economics, Taylor & Francis Journals, vol. 55(24), pages 2773-2787, May.
    15. Qian Ma & Yandan Li & Yan Zhang, 2020. "Informetric Analysis of Highly Cited Papers in Environmental Sciences Based on Essential Science Indicators," IJERPH, MDPI, vol. 17(11), pages 1-14, May.
    16. Marco Di Domizio & Carlo Bellavite Pellegrini & Raul Caruso, 2022. "Payroll dispersion and performance in soccer: A seasonal perspective analysis for Italian Serie A (2007–2021)," Contemporary Economic Policy, Western Economic Association International, vol. 40(3), pages 513-525, July.
    17. Huixian Shen & Ivan Ka Wai Lai, 2022. "Souvenirs: A Systematic Literature Review (1981–2020) and Research Agenda," SAGE Open, vol. 12(2), pages 21582440221, June.
    18. Mathew Azarian & Hao Yu & Asmamaw Tadege Shiferaw & Tor Kristian Stevik, 2023. "Do We Perform Systematic Literature Review Right? A Scientific Mapping and Methodological Assessment," Logistics, MDPI, vol. 7(4), pages 1-32, November.
    19. Thomas Peeters & Steven Salaga & Matthew Juravich, 2015. "Matching and Winning? The Impact of Upper and Middle Managers on Team Performance in Major League Baseball," Tinbergen Institute Discussion Papers 15-115/VII, Tinbergen Institute, revised 03 Mar 2020.
    20. Vítor João Pereira Domingues Martinho, 2022. "Impacts of the COVID-19 Pandemic and the Russia–Ukraine Conflict on Land Use across the World," Land, MDPI, vol. 11(10), pages 1-14, September.
