Binary segmentation procedures using the bivariate binomial distribution for detecting streakiness in sports data

Authors

  • Seong W. Kim (Hanyang University)
  • Sabina Shahin (Karakoram International University)
  • Hon Keung Tony Ng (Southern Methodist University)
  • Jinheum Kim (University of Suwon)

Abstract

Streakiness is an important measure in sports data for individual players or teams whose success rate is not constant over time: successes (or failures) cluster in some periods and are rare or absent in others. In this paper we propose a Bayesian binary segmentation procedure that uses a bivariate binomial distribution to locate the changepoints and estimate the associated success rates. The proposed method consists of a series of nested hypothesis tests based on Bayes factors or posterior probabilities. At each stage, we compare three different changepoint models with the constant success rate model using the bivariate binary data. The proposed method is applied to real sports datasets on baseball and basketball players as an illustration. Extensive simulation studies are performed to demonstrate the usefulness of the proposed methodology.
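
The general idea of Bayesian binary segmentation can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single (univariate) sequence of 0/1 outcomes rather than bivariate binary data, compares only one changepoint model with the constant-rate model instead of three, and uses Beta(1, 1) priors on the success rates with a uniform prior on the changepoint position. The function names, the `min_len` guard, and the log Bayes factor threshold are hypothetical choices made for illustration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(successes, trials, a=1.0, b=1.0):
    """Log marginal likelihood of a 0/1 sequence with the given counts
    under a constant success rate with a Beta(a, b) prior (assumed prior)."""
    return log_beta(a + successes, b + trials - successes) - log_beta(a, b)


def log_bayes_factor(x, min_len=5):
    """Log Bayes factor of 'one changepoint' versus 'constant rate'.

    Candidate split positions k (left segment = x[:k]) get a uniform prior.
    Returns (log_bf, best_k); best_k is the split with the highest likelihood.
    """
    n = len(x)
    total = sum(x)
    log_m0 = log_marginal(total, n)

    prefix = 0
    terms = []  # (log marginal likelihood given a split at k, k)
    for k in range(1, n):
        prefix += x[k - 1]
        if min_len <= k <= n - min_len:
            lm = log_marginal(prefix, k) + log_marginal(total - prefix, n - k)
            terms.append((lm, k))
    if not terms:
        return float("-inf"), None

    # Average the split likelihoods on the log scale (log-mean-exp).
    peak = max(lm for lm, _ in terms)
    log_m1 = peak + math.log(sum(math.exp(lm - peak) for lm, _ in terms))
    log_m1 -= math.log(len(terms))

    best_k = max(terms)[1]
    return log_m1 - log_m0, best_k


def binary_segmentation(x, start=0, threshold=0.0, min_len=5):
    """Recursively split x wherever the log Bayes factor exceeds threshold.

    Returns changepoint indices relative to the full sequence, in order.
    """
    log_bf, k = log_bayes_factor(x, min_len)
    if k is None or log_bf <= threshold:
        return []
    left = binary_segmentation(x[:k], start, threshold, min_len)
    right = binary_segmentation(x[k:], start + k, threshold, min_len)
    return left + [start + k] + right


if __name__ == "__main__":
    cold_then_hot = [0] * 40 + [1] * 40  # synthetic data, not from the paper
    print(binary_segmentation(cold_then_hot))
```

Each recursive call is one stage of the nested testing: a segment is split only when the evidence for a change outweighs the constant-rate model, and the two halves are then tested in the same way. Raising `threshold` makes the sketch more conservative about declaring a streak.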

Suggested Citation

  • Seong W. Kim & Sabina Shahin & Hon Keung Tony Ng & Jinheum Kim, 2021. "Binary segmentation procedures using the bivariate binomial distribution for detecting streakiness in sports data," Computational Statistics, Springer, vol. 36(3), pages 1821-1843, September.
  • Handle: RePEc:spr:compst:v:36:y:2021:i:3:d:10.1007_s00180-020-00992-2
    DOI: 10.1007/s00180-020-00992-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-020-00992-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. McShane Blakeley B. & Braunstein Alexander & Piette James & Jensen Shane T., 2011. "A Hierarchical Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 7(4), pages 1-26, October.
    2. Albert Jim, 2013. "Looking at spacings to assess streakiness," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 9(2), pages 151-163, June.
    3. Santos-Fernandez Edgar & Wu Paul & Mengersen Kerrie L., 2019. "Bayesian statistics meets sports: a comprehensive review," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 15(4), pages 289-312, December.
    4. Gerber Eric A. E. & Craig Bruce A., 2021. "A mixed effects multinomial logistic-normal model for forecasting baseball performance," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 17(3), pages 221-239, September.
    5. Faicel Chamroukhi, 2016. "Piecewise Regression Mixture for Simultaneous Functional Data Clustering and Optimal Segmentation," Journal of Classification, Springer;The Classification Society, vol. 33(3), pages 374-411, October.
    6. Gil Aharoni & Oded H. Sarig, 2012. "Hot hands and equilibrium," Applied Economics, Taylor & Francis Journals, vol. 44(18), pages 2309-2320, June.
    7. Ruggieri, Eric & Antonellis, Marcus, 2016. "An exact approach to Bayesian sequential change point detection," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 71-86.
    8. Scott D. Grimshaw & Jeffrey S. Larson, 2021. "Effect of Star Power on NBA All-Star Game TV Audience," Journal of Sports Economics, , vol. 22(2), pages 139-163, February.
    9. Roger K. Loh & Mitch Warachka, 2012. "Streaks in Earnings Surprises and the Cross-Section of Stock Returns," Management Science, INFORMS, vol. 58(7), pages 1305-1321, July.
    10. Yukio Ohsawa, 2018. "Graph-Based Entropy for Detecting Explanatory Signs of Changes in Market," The Review of Socionetwork Strategies, Springer, vol. 12(2), pages 183-203, December.
    11. Piette James & Jensen Shane T., 2012. "Estimating Fielding Ability in Baseball Players Over Time," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 8(3), pages 1-36, October.
    12. Ardia, David & Dufays, Arnaud & Ordás Criado, Carlos, 2023. "Linking Frequentist and Bayesian Change-Point Methods," MPRA Paper 119486, University Library of Munich, Germany.
    13. Lu Shaochuan, 2020. "Bayesian multiple changepoints detection for Markov jump processes," Computational Statistics, Springer, vol. 35(3), pages 1501-1523, September.
    14. Michael D. Lee, 2018. "Bayesian methods for analyzing true-and-error models," Judgment and Decision Making, Society for Judgment and Decision Making, vol. 13(6), pages 622-635, November.
    15. Chen, Yudong & Wang, Tengyao & Samworth, Richard J., 2022. "High-dimensional, multiscale online changepoint detection," LSE Research Online Documents on Economics 113665, London School of Economics and Political Science, LSE Library.
    16. Christophe Andrieu & Arnaud Doucet & Roman Holenstein, 2010. "Particle Markov chain Monte Carlo methods," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(3), pages 269-342, June.
    17. Tian, Guo-Liang & Ng, Kai Wang & Li, Kai-Can & Tan, Ming, 2009. "Non-iterative sampling-based Bayesian methods for identifying changepoints in the sequence of cases of Haemolytic uraemic syndrome," Computational Statistics & Data Analysis, Elsevier, vol. 53(9), pages 3314-3323, July.
    18. Weinstein-Gould Jesse, 2009. "Keeping the Hitter Off Balance: Mixed Strategies in Baseball," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 5(2), pages 1-20, May.
    19. Inder Tecuapetla-Gómez & Axel Munk, 2017. "Autocovariance Estimation in Regression with a Discontinuous Signal and m-Dependent Errors: A Difference-Based Approach," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 44(2), pages 346-368, June.
    20. Ricardo C. Pedroso & Rosangela H. Loschi & Fernando Andrés Quintana, 2023. "Multipartition model for multiple change point identification," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 759-783, June.
