
Simply Better: Using Regression Models to Estimate Major League Batting Averages


  • Neal Dan

    (University of Florida)

  • Tan James

    (Buchholz High School)

  • Hao Feng

    (Tianjin University of Finance and Economics)

  • Wu Samuel S

    (University of Florida)


We consider the problem of estimating a Major League Baseball player's batting average in the second half of a season based on his performance in the first half. We fit two linear regression models to players' averages from each half of the 2004 season, use these models to predict batting averages in the latter half of 2005, and compare the results to those achieved by three Bayesian estimators considered by Brown (2008). The linear models consistently outperform the Bayesian estimators on all four measures of error. Since the regression models use data from 2004 as well as 2005, while Brown's estimators were based strictly on 2005 data, we also compare the performance of the linear models to that of the Bayesian estimators when the Bayesian estimators are based on the same amount of data. We find the linear models to be superior in this case as well. As a further test, we use the same methods to predict on-base percentages in the last half of the 2005 season, and we find that the linear models again do a better job. While we change the question proposed in Brown's original paper, our results are a valuable reminder of the power of linear regression.
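
The abstract does not specify the two regression models or the four error measures, so the following Python sketch is only an illustration of the general workflow it describes: fit a line to first-half and second-half averages from one season, apply it to first-half averages from a later season, and score the predictions. Everything here is an assumption made for illustration: the single-predictor least-squares model, the use of mean squared error as the error measure, the comparison against a naive "repeat the first half" predictor, the function names, and all of the numbers, which are invented and are not data from the paper.

```python
# Toy illustration only: every number below is invented, not taken from the paper.

def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to new first-half averages."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def mean_squared_error(predicted, actual):
    """Average squared difference (an assumed stand-in for the paper's error measures)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical "training" season: first-half and second-half batting averages.
train_first = [0.310, 0.250, 0.280, 0.225, 0.295, 0.265]
train_second = [0.290, 0.262, 0.275, 0.248, 0.281, 0.270]

# Hypothetical "test" season: predict the second half from the first half.
test_first = [0.320, 0.240, 0.275, 0.300]
test_second = [0.288, 0.259, 0.270, 0.285]

model = fit_simple_ols(train_first, train_second)
predictions = predict(model, test_first)

mse_regression = mean_squared_error(predictions, test_second)
mse_naive = mean_squared_error(test_first, test_second)  # naive: repeat first half

print("intercept, slope:", model)
print("regression MSE:", mse_regression)
print("naive MSE:", mse_naive)
```

On toy data constructed like this, the fitted slope falls between 0 and 1, so extreme first-half averages are pulled toward the middle of the distribution. That shrinkage is the usual reason a fitted line can do better than taking the first-half average at face value, but the sketch says nothing about how the paper's models compare with Brown's Bayesian estimators.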

Suggested Citation

  • Neal Dan & Tan James & Hao Feng & Wu Samuel S, 2010. "Simply Better: Using Regression Models to Estimate Major League Batting Averages," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 6(3), pages 1-14, July.
  • Handle: RePEc:bpj:jqsprt:v:6:y:2010:i:3:n:12


    Cited by:

    1. Albert Jim, 2016. "Improved component predictions of batting and pitching measures," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 12(2), pages 73-85, June.
