
Component-wise AdaBoost Algorithms for High-dimensional Binary Classification and Class Probability Prediction

Authors

  • Tae-Hwy Lee (Department of Economics, University of California Riverside)
  • Jianghao Chu (UCR)
  • Aman Ullah (UCR)

Abstract

Freund and Schapire (1997) introduced "Discrete AdaBoost" (DAB), which has been mysteriously effective for high-dimensional binary classification or binary prediction. In an effort to understand the myth, Friedman, Hastie and Tibshirani (FHT, 2000) show that DAB can be understood as statistical learning that builds an additive logistic regression model via Newton-like updating minimization of the exponential loss. From this statistical point of view, FHT proposed three modifications of DAB, namely Real AdaBoost (RAB), LogitBoost (LB), and Gentle AdaBoost (GAB). DAB, RAB, LB, and GAB all solve for the logistic regression via different algorithmic designs and different objective functions. The RAB algorithm uses class probability estimates to construct real-valued contributions of the weak learner, LB is an adaptive Newton algorithm by stagewise optimization of the Bernoulli likelihood, and GAB is an adaptive Newton algorithm via stagewise optimization of the exponential loss. The authors of FHT published an influential textbook, The Elements of Statistical Learning (ESL, 2001 and 2008). A companion book, An Introduction to Statistical Learning (ISL) by James et al. (2013), was published with applications in R. However, neither ESL nor ISL (e.g., sections 4.5 and 4.6) covers these four AdaBoost algorithms, while FHT provided some simulation and empirical studies to compare these methods. Given numerous potential applications, we believe it would be useful to collect the R libraries of these AdaBoost algorithms, as well as more recently developed extensions to AdaBoost for probability prediction, with examples and illustrations.
Therefore, the goal of this chapter is to do just that, i.e., (i) to provide a user guide to these alternative AdaBoost algorithms with a step-by-step tutorial in R (in a way similar to ISL, e.g., Section 4.6), (ii) to compare AdaBoost with alternative machine learning classification tools such as the deep neural network (DNN), logistic regression with LASSO and SIM-RODEO, and (iii) to demonstrate the empirical applications in economics, such as prediction of business cycle turning points and directional prediction of stock price indexes. We revisit Ng (2014), who used DAB for prediction of the business cycle turning points, by comparing the results from RAB, LB, GAB, DNN, logistic regression and SIM-RODEO.
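
As a rough illustration of the Discrete AdaBoost idea summarized in the abstract, the following is a minimal sketch, not the chapter's implementation. It is written in Python rather than the R used in the chapter, and it rests on several assumptions that do not come from the abstract: labels are coded as -1/+1, the weak learner is a threshold stump on a single component of x (one possible reading of "component-wise"), and all function names are hypothetical. The class probability uses the link implied by the exponential loss in FHT, where the score F(x) estimates half the log-odds.

```python
import math


def stump_predict(row, j, t, s):
    """Weak learner: vote s if component j exceeds threshold t, else -s."""
    return s if row[j] > t else -s


def fit_stump(X, y, w):
    """Search every component and threshold for the lowest weighted error."""
    best = None  # (weighted error, component index, threshold, sign)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for s in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    if best is None:
        raise ValueError("every component is constant; no stump can be fit")
    return best


def discrete_adaboost(X, y, rounds=10):
    """Fit an additive model of stumps; y must hold labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform observation weights
    model = []
    for _ in range(rounds):
        err, j, t, s = fit_stump(X, y, w)
        # Clip the error so the log below stays finite for a perfect stump.
        err = min(max(err, 1e-12), 1.0 - 1e-12)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, j, t, s))
        # Up-weight misclassified observations, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, s))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, row):
    """Additive score F(x): weighted sum of the stump votes."""
    return sum(alpha * stump_predict(row, j, t, s) for alpha, j, t, s in model)


def predict(model, row):
    """Class prediction: the sign of F(x), with ties assigned to +1."""
    return 1 if score(model, row) >= 0 else -1


def class_probability(model, row):
    """P(y = +1 | x) = 1 / (1 + exp(-2 F(x))), computed in a stable form."""
    z = 2.0 * score(model, row)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Toy data (made up for illustration): only the second component matters.
    X_demo = [[0.3, 0.1, 0.9], [0.7, 0.2, 0.1], [0.2, 0.8, 0.4],
              [0.9, 0.9, 0.6], [0.5, 0.3, 0.8], [0.4, 0.7, 0.2]]
    y_demo = [-1, -1, 1, 1, -1, 1]
    fitted = discrete_adaboost(X_demo, y_demo, rounds=5)
    print("components selected:", [j for _, j, _, _ in fitted])
    print("P(y=+1 | x) for the first row:", class_probability(fitted, X_demo[0]))
```

Each round selects a single component, so the list of selected indices gives a crude view of which predictors the boosted model relies on. RAB, LB, and GAB would replace the stump vote and the weight update with the probability-based or Newton-step versions described in the abstract; those variants are not sketched here.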

Suggested Citation

  • Tae-Hwy Lee & Jianghao Chu & Aman Ullah, 2018. "Component-wise AdaBoost Algorithms for High-dimensional Binary Classification and Class Probability Prediction," Working Papers 201907, University of California at Riverside, Department of Economics.
  • Handle: RePEc:ucr:wpaper:201907

    Download full text from publisher

    File URL: https://economics.ucr.edu/repec/ucr/wpaper/201907.pdf
    File Function: First version, 2018
    Download Restriction: no

    More about this item

    Keywords

    AdaBoost; R; Binary classification; Logistic regression; DAB; RAB; LB; GAB; DNN

