
Early stopping in L2Boosting

Authors
  • Ivan Chang, Yuan-Chin
  • Huang, Yufen
  • Huang, Yu-Pai

Abstract

It is well known that boosting-like algorithms, such as AdaBoost and many of its modifications, may over-fit the training data when the number of boosting iterations becomes large. How to stop a boosting algorithm at an appropriate iteration has therefore been a longstanding problem over the past decade (see Meir and Rätsch, 2003). Bühlmann and Yu (2005) applied model selection criteria to estimate the stopping iteration for L2Boosting, but their approach still requires computing all boosting iterations under consideration for the training data. The main purpose of this paper is therefore to study an early stopping rule for L2Boosting that operates during the training stage and yields substantial computational savings. The proposed method is based on a change point detection method applied to the values of model selection criteria during the training stage. The method is also extended to two-class classification problems, which are very common in medical and bioinformatics applications. A simulation study and a real data example are provided to illustrate these approaches, and comparisons are made with LogitBoost.
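
The general idea of monitoring a model selection criterion while boosting runs, instead of after all iterations are computed, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `l2boost_early_stop`, the component-wise linear base learner, the step size `nu`, the corrected AIC of Hurvich, Simonoff and Tsai (1998) as the criterion, and above all the stopping rule. The abstract does not describe the authors' change point detector, so the sketch substitutes a simple patience rule (stop once the criterion has failed to improve for `patience` consecutive iterations). It is not the method proposed in the paper.

```python
import math
import random


def l2boost_early_stop(X, y, nu=0.1, max_iter=200, patience=5):
    """Component-wise linear L2Boosting, stopped early by monitoring AICc.

    X is a list of n rows with p predictors each (no all-zero columns),
    y is a list of n responses. The patience rule is a stand-in for a
    change point detector, not the rule proposed in the paper.
    Returns (coef, intercept, stop_iter, aicc_path).
    """
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    norms = [sum(v * v for v in c) for c in cols]
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    coef = [0.0] * p
    # B maps y to the fitted values; the initial fit is the mean of y.
    B = [[1.0 / n] * n for _ in range(n)]
    best, best_iter, best_coef, path = math.inf, 0, coef[:], []
    for m in range(1, max_iter + 1):
        # Select the predictor whose least-squares fit to the current
        # residuals gives the largest reduction in residual sum of squares.
        dots = [sum(c[i] * resid[i] for i in range(n)) for c in cols]
        j = max(range(p), key=lambda k: dots[k] ** 2 / norms[k])
        beta = dots[j] / norms[j]
        coef[j] += nu * beta
        xj = cols[j]
        resid = [resid[i] - nu * beta * xj[i] for i in range(n)]
        # Operator update: B <- B + nu * H_j (I - B), H_j = x_j x_j^T / |x_j|^2.
        row = [(xj[k] - sum(xj[i] * B[i][k] for i in range(n))) / norms[j]
               for k in range(n)]
        for i in range(n):
            for k in range(n):
                B[i][k] += nu * xj[i] * row[k]
        df = sum(B[i][i] for i in range(n))  # degrees of freedom = trace(B)
        rss = sum(r * r for r in resid)
        if rss <= 0.0 or df + 2 >= n:
            break  # AICc is undefined here
        aicc = math.log(rss / n) + (1 + df / n) / (1 - (df + 2) / n)
        path.append(aicc)
        if aicc < best:
            best, best_iter, best_coef = aicc, m, coef[:]
        elif m - best_iter >= patience:
            break  # criterion has stalled; skip the remaining iterations
    return best_coef, intercept, best_iter, path


if __name__ == "__main__":
    # Synthetic data (hypothetical): two informative and three noise predictors.
    random.seed(0)
    X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(60)]
    y = [2.0 * r[0] - 1.0 * r[1] + random.gauss(0, 0.5) for r in X]
    coef, b0, stop_iter, path = l2boost_early_stop(X, y)
    print("stopped at iteration", stop_iter, "of", len(path), "computed")
```

The point of the sketch is only the control flow: the criterion is evaluated inside the training loop, so iterations beyond the stopping point are never computed. A change point detector on the criterion path would replace the `patience` check.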

Suggested Citation

  • Ivan Chang, Yuan-Chin & Huang, Yufen & Huang, Yu-Pai, 2010. "Early stopping in L2Boosting," Computational Statistics & Data Analysis, Elsevier, vol. 54(10), pages 2203-2213, October.
  • Handle: RePEc:eee:csdana:v:54:y:2010:i:10:p:2203-2213

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(10)00126-X
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Hansen M. H & Yu B., 2001. "Model Selection and the Principle of Minimum Description Length," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 746-774, June.
    2. T. Speed & Bin Yu, 1993. "Model selection and prediction: Normal regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 45(1), pages 35-54, March.
    3. Bühlmann P. & Yu B., 2003. "Boosting With the L2 Loss: Regression and Classification," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 324-339, January.
    4. Tsao, C. Andy & Chang, Yuan-chin Ivan, 2007. "A stochastic approximation view of boosting," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 325-334, September.
    5. Clifford M. Hurvich & Jeffrey S. Simonoff & Chih‐Ling Tsai, 1998. "Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 271-293.

    Citations

    Cited by:

    1. Jing Zeng, 2014. "Forecasting Aggregates with Disaggregate Variables: Does Boosting Help to Select the Most Relevant Predictors?," Working Paper Series of the Department of Economics, University of Konstanz 2014-20, Department of Economics, University of Konstanz.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christian Pierdzioch & Rangan Gupta & Hossein Hassani & Emmanuel Silva, 2018. "Forecasting Changes of Economic Inequality: A Boosting Approach," Working Papers 201868, University of Pretoria, Department of Economics.
    2. Leitenstorfer, Florian & Tutz, Gerhard, 2007. "Knot selection by boosting techniques," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4605-4621, May.
    3. Klaus Wohlrabe & Teresa Buchen, 2014. "Assessing the Macroeconomic Forecasting Performance of Boosting: Evidence for the United States, the Euro Area and Germany," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 33(4), pages 231-242, July.
    4. Tutz, Gerhard & Leitenstorfer, Florian, 2006. "Response shrinkage estimators in binary regression," Computational Statistics & Data Analysis, Elsevier, vol. 50(10), pages 2878-2901, June.
    5. Ng, Serena, 2013. "Variable Selection in Predictive Regressions," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 752-789, Elsevier.
    6. Schmid, Matthias & Hothorn, Torsten, 2008. "Boosting additive models using component-wise P-Splines," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 298-311, December.
    7. Ching-Kang Ing, 2005. "Accumulated Prediction Errors, Information Criteria And Optimal Forecasting For Autoregressive Time Series," Econometrics 0503020, University Library of Munich, Germany.
    8. Jing Zeng, 2014. "Forecasting Aggregates with Disaggregate Variables: Does Boosting Help to Select the Most Relevant Predictors?," Working Paper Series of the Department of Economics, University of Konstanz 2014-20, Department of Economics, University of Konstanz.
    9. Daye, Z. John & Jeng, X. Jessie, 2009. "Shrinkage and model selection with correlated variables via weighted fusion," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1284-1298, February.
    10. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    11. Gerhard Tutz & Moritz Berger, 2018. "Tree-structured modelling of categorical predictors in generalized additive regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 737-758, September.
    12. Hans R. A. Koster & Jos N. van Ommeren & Piet Rietveld, 2016. "Historic amenities, income and sorting of households," Journal of Economic Geography, Oxford University Press, vol. 16(1), pages 203-236.
    13. Bethany Everett & David Rehkopf & Richard Rogers, 2013. "The Nonlinear Relationship Between Education and Mortality: An Examination of Cohort, Race/Ethnic, and Gender Differences," Population Research and Policy Review, Springer;Southern Demographic Association (SDA), vol. 32(6), pages 893-917, December.
    14. Shuichi Kawano, 2014. "Selection of tuning parameters in bridge regression models via Bayesian information criterion," Statistical Papers, Springer, vol. 55(4), pages 1207-1223, November.
    15. Tsimpanos, Apostolos & Tsimbos, Cleon & Kalogirou, Stamatis, 2018. "Assessing spatial variation and heterogeneity of fertility in Greece at local authority level," MPRA Paper 100406, University Library of Munich, Germany.
    16. Mittnik, Stefan & Robinzonov, Nikolay & Spindler, Martin, 2015. "Stock market volatility: Identifying major drivers and the nature of their impact," Journal of Banking & Finance, Elsevier, vol. 58(C), pages 1-14.
    17. Don Harding, 2010. "Applying shape and phase restrictions in generalized dynamic categorical models of the business cycle," NCER Working Paper Series 58, National Centre for Econometric Research.
    18. Michael S. Delgado & Daniel J. Henderson & Christopher F. Parmeter, 2014. "Does Education Matter for Economic Growth?," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 76(3), pages 334-359, June.
    19. Seongkyoon Jeong & Jae Young Choi, 2012. "The taxonomy of research collaboration in science and technology: evidence from mechanical research through probabilistic clustering analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(3), pages 719-735, June.
    20. Suneel Babu Chatla, 2023. "Nonparametric inference for additive models estimated via simplified smooth backfitting," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(1), pages 71-97, February.
