
Boosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost

Author

Listed:
  • Riccardo De Bin (University of Munich)

Abstract

Despite the limitations imposed by the proportional hazards assumption, the Cox model is probably the most popular statistical tool used to analyze survival data, thanks to its flexibility and ease of interpretation. For this reason, novel statistical/machine learning techniques are usually adapted to fit its requirements, including boosting. Boosting is an iterative technique originally developed in the machine learning community to handle classification problems, and later extended to the statistical field, where it is used in many situations, including regression and survival analysis. The popularity of boosting has been further driven by the availability of user-friendly software such as the R packages mboost and CoxBoost, both of which allow the implementation of boosting in conjunction with the Cox model. Despite the common underlying boosting principles, these two packages use different techniques: the former is an adaptation of model-based boosting, while the latter adapts likelihood-based boosting. Here we contrast these two boosting techniques as implemented in the R packages from an analytic point of view; we further examine solutions adopted within these packages to treat mandatory variables, i.e. variables that—for several reasons—must be included in the model. We explore the possibility of extending solutions currently only implemented in one package to the other. A simulation study and a real data example are added for illustration.
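To make the iterative idea concrete, the following is a minimal sketch of a componentwise gradient boosting loop for the Cox partial likelihood, in the spirit of the model-based approach described in the abstract. It is not the code of mboost or CoxBoost, and it does not cover likelihood-based boosting or mandatory variables. The function names, the step size nu, the number of steps and the simulated data are all assumptions made for illustration; the partial likelihood is used in its Breslow form with no special handling of tied event times.

```python
import numpy as np


def cox_negative_gradient(f, time, event):
    """Negative gradient of the negative log partial likelihood with respect to f.

    Breslow form; tied event times get no special treatment (an assumption).
    """
    risk = np.exp(f)
    # at_risk[j, k] is 1 when subject k is still under observation at time[j].
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    # denom[j] sums exp(f_k) over the risk set at time[j].
    denom = at_risk @ risk
    # For subject i, sum event[j] / denom[j] over all j with time[j] <= time[i].
    cumulative = at_risk.T @ (event / denom)
    return event - risk * cumulative


def componentwise_cox_boost(X, time, event, n_steps=100, nu=0.1):
    """Hypothetical helper: boosting with one linear base learner per covariate."""
    X = X - X.mean(axis=0)            # centre covariates, so no intercept is needed
    n, p = X.shape
    beta = np.zeros(p)
    f = np.zeros(n)                   # current linear predictor
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_steps):
        u = cox_negative_gradient(f, time, event)
        b = X.T @ u / col_ss          # least-squares fit of u on each covariate
        j = int(np.argmax(b ** 2 * col_ss))  # largest drop in residual sum of squares
        beta[j] += nu * b[j]          # update only the selected coefficient
        f += nu * b[j] * X[:, j]
    return beta


# Simulated data (illustrative only): only the first covariate affects the hazard.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
true_time = rng.exponential(scale=np.exp(-X[:, 0]))   # hazard exp(x_0)
censor_time = rng.exponential(scale=2.0, size=n)
time = np.minimum(true_time, censor_time)
event = (true_time <= censor_time).astype(float)

beta_hat = componentwise_cox_boost(X, time, event)
print(np.round(beta_hat, 3))
```

Because each step updates a single coefficient by a small amount, stopping early leaves the coefficients of rarely selected covariates at or near zero. In this sketch the number of steps is fixed by hand; in practice it is a tuning parameter.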

Suggested Citation

  • Riccardo De Bin, 2016. "Boosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost," Computational Statistics, Springer, vol. 31(2), pages 513-531, June.
  • Handle: RePEc:spr:compst:v:31:y:2016:i:2:d:10.1007_s00180-015-0642-2
    DOI: 10.1007/s00180-015-0642-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-015-0642-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Benjamin Hofner & Andreas Mayr & Nikolay Robinzonov & Matthias Schmid, 2014. "Model-based boosting in R: a hands-on tutorial using the R package mboost," Computational Statistics, Springer, vol. 29(1), pages 3-35, February.
    2. Benjamin Hofner & Torsten Hothorn & Thomas Kneib, 2013. "Variable selection and model choice in structured survival models," Computational Statistics, Springer, vol. 28(3), pages 1079-1101, June.
    3. Gerhard Tutz & Harald Binder, 2006. "Generalized Additive Modeling with Implicit Variable Selection by Likelihood-Based Boosting," Biometrics, The International Biometric Society, vol. 62(4), pages 961-971, December.
    4. Tutz, Gerhard & Binder, Harald, 2007. "Boosting ridge regression," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 6044-6059, August.

    Citations

    Cited by:

    1. Battauz, Michela & Vidoni, Paolo, 2022. "A likelihood-based boosting algorithm for factor analysis models with binary data," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    2. Yanis Tazi & Juan E. Arango-Ossa & Yangyu Zhou & Elsa Bernard & Ian Thomas & Amanda Gilkes & Sylvie Freeman & Yoann Pradat & Sean J. Johnson & Robert Hills & Richard Dillon & Max F. Levine & Daniel Le, 2022. "Unified classification and risk-stratification in Acute Myeloid Leukemia," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    3. Heidi Seibold & Christoph Bernau & Anne-Laure Boulesteix & Riccardo De Bin, 2018. "On the choice and influence of the number of boosting steps for high-dimensional linear Cox-models," Computational Statistics, Springer, vol. 33(3), pages 1195-1215, September.
    4. Riccardo De Bin & Vegard Grødem Stikbakke, 2023. "A boosting first-hitting-time model for survival analysis in high-dimensional settings," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 420-440, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marra, Giampiero & Wood, Simon N., 2011. "Practical variable selection for generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2372-2387, July.
    2. Sariyar Murat & Schumacher Martin & Binder Harald, 2014. "A boosting approach for adapting the sparsity of risk prediction signatures based on different molecular levels," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(3), pages 343-357, June.
    3. Stefanie Hieke & Axel Benner & Richard F Schlenk & Martin Schumacher & Lars Bullinger & Harald Binder, 2016. "Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-18, May.
    4. Faisal Zahid & Gerhard Tutz, 2013. "Multinomial logit models with implicit variable selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(4), pages 393-416, December.
    5. Hainaut, Donatien & Trufin, Julien & Denuit, Michel, 2021. "Response versus gradient boosting trees, GLMs and neural networks under Tweedie loss and log-link," LIDAM Discussion Papers ISBA 2021012, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    6. Heidi Seibold & Christoph Bernau & Anne-Laure Boulesteix & Riccardo De Bin, 2018. "On the choice and influence of the number of boosting steps for high-dimensional linear Cox-models," Computational Statistics, Springer, vol. 33(3), pages 1195-1215, September.
    7. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    8. Bauer, Ida & Haupt, Harry & Linner, Stefan, 2024. "Pinball boosting of regression quantiles," Computational Statistics & Data Analysis, Elsevier, vol. 200(C).
    9. Robert Suchting & Michael S. Businelle & Stephen W. Hwang & Nikhil S. Padhye & Yijiong Yang & Diane M. Santa Maria, 2020. "Predicting Daily Sheltering Arrangements among Youth Experiencing Homelessness Using Diary Measurements Collected by Ecological Momentary Assessment," IJERPH, MDPI, vol. 17(18), pages 1-17, September.
    10. Li, Li & Li, Han & Panagiotelis, Anastasios, 2025. "Boosting domain-specific models with shrinkage: An application in mortality forecasting," International Journal of Forecasting, Elsevier, vol. 41(1), pages 191-207.
    11. Guilherme Lindenmeyer & Pedro Pablo Skorin & Hudson da Silva Torrent, 2021. "Using boosting for forecasting electric energy consumption during a recession: a case study for the Brazilian State Rio Grande do Sul," Letters in Spatial and Resource Sciences, Springer, vol. 14(2), pages 111-128, August.
    12. Simon N. Wood, 2020. "Inference and computation with generalized additive models and their extensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 307-339, June.
    13. Mohamed Ouhourane & Yi Yang & Andréa L. Benedet & Karim Oualkacha, 2022. "Group penalized quantile regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(3), pages 495-529, September.
    14. Colin Griesbach & Andreas Groll & Elisabeth Bergherr, 2021. "Addressing cluster-constant covariates in mixed effects models via likelihood-based boosting techniques," PLOS ONE, Public Library of Science, vol. 16(7), pages 1-17, July.
    15. Riccardo De Bin & Vegard Grødem Stikbakke, 2023. "A boosting first-hitting-time model for survival analysis in high-dimensional settings," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 420-440, April.
    16. Shafik, Nivien & Tutz, Gerhard, 2009. "Boosting nonlinear additive autoregressive time series," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2453-2464, May.
    17. Wang Zhu & Wang C.Y., 2010. "Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-33, June.
    18. Kevin He & Ji Zhu & Jian Kang & Yi Li, 2022. "Stratified Cox models with time‐varying effects for national kidney transplant patients: A new blockwise steepest ascent method," Biometrics, The International Biometric Society, vol. 78(3), pages 1221-1232, September.
    19. Leitenstorfer, Florian & Tutz, Gerhard, 2007. "Knot selection by boosting techniques," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4605-4621, May.
    20. Tino Werner, 2025. "Loss-guided stability selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 19(1), pages 5-30, March.
