Boosted regression (boosting): An introductory tutorial and a Stata plugin
Boosting, or boosted regression, is a recent data-mining technique that has shown considerable success in predictive accuracy. This article gives an overview of boosting and introduces a new Stata command, boost, that implements the boosting algorithm described in Hastie, Tibshirani, and Friedman (2001, 322). The plugin is illustrated with a Gaussian and a logistic regression example. In the Gaussian regression example, the R² value computed on a test dataset is 21.3% for linear regression and 93.8% for boosting. In the logistic regression example, stepwise logistic regression correctly classifies 54.1% of the observations in a test dataset versus 76.0% for boosted logistic regression. Currently, boost accommodates Gaussian (normal), logistic, and Poisson boosted regression. boost is implemented as a Windows C++ plugin. Copyright 2005 by StataCorp LP.
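To make the idea concrete, the following is a minimal sketch of boosting for the Gaussian (squared-error) case, written in Python rather than Stata. It is not the boost plugin's implementation and does not reproduce the article's examples. It assumes a single predictor, uses one-split regression trees ("stumps") as base learners, and evaluates R² on held-out observations. The function names (fit_stump, boost_fit, boost_predict, r_squared), the shrinkage and iteration settings, and the synthetic sine data are all illustrative assumptions.

```python
import math


def fit_stump(x, r):
    """Find the single split on x that minimises squared error for targets r.

    Returns (threshold, left_mean, right_mean); observations with
    x <= threshold receive left_mean, the rest receive right_mean.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [r[i] for i in order]
    n = len(xs)
    total = sum(rs)
    best = None
    left_sum = 0.0
    for k in range(1, n):
        left_sum += rs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # cannot split between equal x values
        right_sum = total - left_sum
        # Minimising the squared error is equivalent to maximising this gain.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if best is None or gain > best[0]:
            threshold = (xs[k - 1] + xs[k]) / 2
            best = (gain, threshold, left_sum / k, right_sum / (n - k))
    if best is None:  # all x values identical: fall back to a constant
        mean = total / n
        return (math.inf, mean, mean)
    return best[1:]


def boost_fit(x, y, n_iter=300, shrinkage=0.1):
    """Gradient boosting for squared-error loss with stumps (illustrative)."""
    init = sum(y) / len(y)  # start from the mean of the response
    pred = [init] * len(y)
    stumps = []
    for _ in range(n_iter):
        # For squared-error loss the negative gradient is the residual.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        t, left, right = fit_stump(x, resid)
        stumps.append((t, left, right))
        # Add a shrunken copy of the new stump to the running prediction.
        pred = [p + shrinkage * (left if xi <= t else right)
                for p, xi in zip(pred, x)]
    return init, stumps, shrinkage


def boost_predict(model, x):
    """Sum the initial value and all shrunken stump contributions."""
    init, stumps, shrinkage = model
    return [init + shrinkage * sum(left if xi <= t else right
                                   for t, left, right in stumps)
            for xi in x]


def r_squared(y, pred):
    """R^2 = 1 - SS_res / SS_tot, computed on whatever data is passed in."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Synthetic, noise-free data (an assumption for illustration only).
xs_all = [2 * math.pi * i / 100 for i in range(100)]
ys_all = [math.sin(v) for v in xs_all]
train_x, train_y = xs_all[0::2], ys_all[0::2]  # even rows: training set
test_x, test_y = xs_all[1::2], ys_all[1::2]    # odd rows: test set

model = boost_fit(train_x, train_y, n_iter=300, shrinkage=0.1)
test_r2 = r_squared(test_y, boost_predict(model, test_x))
print(f"test R^2 = {test_r2:.3f}")
```

Evaluating R² on observations that were not used for fitting mirrors the test-dataset comparison described in the abstract. A small shrinkage value with many iterations is a common choice in boosting, but the settings here were picked for illustration and are not taken from the article.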
Volume (Year): 5 (2005)
Issue (Month): 3 (September)
Contact details of provider: Web page: http://www.stata-journal.com/
Order information: Web: http://www.stata-journal.com/subscription.html
When requesting a correction, please mention this item's handle: RePEc:tsj:stataj:v:5:y:2005:i:3:p:330-354. See general information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact Christopher F. Baum or Lisa Gilmore.