Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i2p411-d1034015.html

Variable Selection and Allocation in Joint Models via Gradient Boosting Techniques

Authors

  • Colin Griesbach

    (Chair of Spatial Data Science and Statistical Learning, Georg-August-Universität Göttingen, 37073 Göttingen, Germany)

  • Andreas Mayr

    (Department of Medical Biometrics, Informatics and Epidemiology, University Hospital Bonn, 53127 Bonn, Germany)

  • Elisabeth Bergherr

    (Chair of Spatial Data Science and Statistical Learning, Georg-August-Universität Göttingen, 37073 Göttingen, Germany)

Abstract

Modeling longitudinal data (e.g., biomarkers) and the risk of events separately leads to a loss of information and to bias when the underlying processes are related to each other. Hence, the popularity of joint models for longitudinal and time-to-event data has grown rapidly in the last few decades. However, specifying which part of a joint model each covariate should be assigned to is a practical challenge, as this decision usually has to be made based on background knowledge. In this work, we combined recent developments from the field of gradient boosting for distributional regression to construct an allocation routine that allows researchers to automatically assign covariates to the individual sub-predictors of a joint model. The procedure provides several well-known advantages of model-based statistical learning tools, as well as a fast-performing allocation mechanism for joint models, which is illustrated via empirical results from a simulation study and a biomedical application.

Suggested Citation

  • Colin Griesbach & Andreas Mayr & Elisabeth Bergherr, 2023. "Variable Selection and Allocation in Joint Models via Gradient Boosting Techniques," Mathematics, MDPI, vol. 11(2), pages 1-16, January.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:2:p:411-:d:1034015

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/2/411/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/2/411/
    Download Restriction: no

    References listed on IDEAS

    1. Rizopoulos, Dimitris, 2010. "JM: An R Package for the Joint Modelling of Longitudinal and Time-to-Event Data," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 35(i09).
    2. Zangdong He & Wanzhu Tu & Sijian Wang & Haoda Fu & Zhangsheng Yu, 2015. "Simultaneous variable selection for joint models of longitudinal and survival outcomes," Biometrics, The International Biometric Society, vol. 71(1), pages 178-187, March.
    3. Thomas Kneib & Torsten Hothorn & Gerhard Tutz, 2009. "Variable Selection and Model Choice in Geoadditive Regression Models," Biometrics, The International Biometric Society, vol. 65(2), pages 626-634, June.
    4. Fengting Yi & Niansheng Tang & Jianguo Sun, 2022. "Simultaneous variable selection and estimation for joint models of longitudinal and failure time data with interval censoring," Biometrics, The International Biometric Society, vol. 78(1), pages 151-164, March.
    5. Boyao Zhang & Tobias Hepp & Sonja Greven & Elisabeth Bergherr, 2022. "Adaptive step-length selection in gradient boosting for Gaussian location and scale models," Computational Statistics, Springer, vol. 37(5), pages 2295-2332, November.
    6. Andreas Mayr & Nora Fenske & Benjamin Hofner & Thomas Kneib & Matthias Schmid, 2012. "Generalized additive models for location, scale and shape for high dimensional data—a flexible approach based on boosting," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 61(3), pages 403-427, May.
    7. Gerhard Tutz & Harald Binder, 2006. "Generalized Additive Modeling with Implicit Variable Selection by Likelihood-Based Boosting," Biometrics, The International Biometric Society, vol. 62(4), pages 961-971, December.
    8. Bissantz, Nicolai & Hohage, T. & Munk, Axel & Ruymgaart, F., 2007. "Convergence rates of general regularization methods for statistical inverse problems and applications," Technical Reports 2007,04, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Philip Kostov, 2010. "Do Buyers’ Characteristics and Personal Relationships Affect Agricultural Land Prices?," Land Economics, University of Wisconsin Press, vol. 86(1), pages 48-65.
    2. Hendrik van der Wurp & Andreas Groll, 2023. "Introducing LASSO-type penalisation to generalised joint regression modelling for count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 107(1), pages 127-151, March.
    3. Fabian Scheipl & Thomas Kneib & Ludwig Fahrmeir, 2013. "Penalized likelihood and Bayesian function selection in regression models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(4), pages 349-385, October.
    4. Simon N. Wood, 2020. "Inference and computation with generalized additive models and their extensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 307-339, June.
    5. Colin Griesbach & Andreas Groll & Elisabeth Bergherr, 2021. "Addressing cluster-constant covariates in mixed effects models via likelihood-based boosting techniques," PLOS ONE, Public Library of Science, vol. 16(7), pages 1-17, July.
    6. Benjamin Hofner & Torsten Hothorn & Thomas Kneib, 2013. "Variable selection and model choice in structured survival models," Computational Statistics, Springer, vol. 28(3), pages 1079-1101, June.
    7. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    8. Hongyuan Cao & Mathew M. Churpek & Donglin Zeng & Jason P. Fine, 2015. "Analysis of the Proportional Hazards Model With Sparse Longitudinal Covariates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(511), pages 1187-1196, September.
    9. Benjamin Hofner & Andreas Mayr & Nikolay Robinzonov & Matthias Schmid, 2014. "Model-based boosting in R: a hands-on tutorial using the R package mboost," Computational Statistics, Springer, vol. 29(1), pages 3-35, February.
    10. Xiaohong Chen & Demian Pouzo, 2012. "Estimation of Nonparametric Conditional Moment Models With Possibly Nonsmooth Generalized Residuals," Econometrica, Econometric Society, vol. 80(1), pages 277-321, January.
    11. Riccardo De Bin & Vegard Grødem Stikbakke, 2023. "A boosting first-hitting-time model for survival analysis in high-dimensional settings," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 420-440, April.
    12. Johannes, Jan & Van Bellegem, Sébastien & Vanhems, Anne, 2011. "Convergence Rates For Ill-Posed Inverse Problems With An Unknown Operator," Econometric Theory, Cambridge University Press, vol. 27(3), pages 522-545, June.
    13. Zhang, Zili & Charalambous, Christiana & Foster, Peter, 2023. "A Gaussian copula joint model for longitudinal and time-to-event data with random effects," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
    14. Andrews, Donald W.K., 2017. "Examples of L2-complete and boundedly-complete distributions," Journal of Econometrics, Elsevier, vol. 199(2), pages 213-220.
    15. Oi, Katsuya, 2020. "Disuse as time away from a cognitively demanding job; how does it temporally or developmentally impact late-life cognition?," Intelligence, Elsevier, vol. 82(C).
    16. Xin Fang & Bo Fang & Chunfang Wang & Tian Xia & Matteo Bottai & Fang Fang & Yang Cao, 2019. "Comparison of Frequentist and Bayesian Generalized Additive Models for Assessing the Association between Daily Exposure to Fine Particles and Respiratory Mortality: A Simulation Study," IJERPH, MDPI, vol. 16(5), pages 1-20, March.
    17. Elisabeth Waldmann & Thomas Kneib & Yu Ryan Yu & Stefan Lang, 2012. "Bayesian semiparametric additive quantile regression," Working Papers 2012-06, Faculty of Economics and Statistics, Universität Innsbruck.
    18. Dimitris Rizopoulos, 2011. "Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-to-Event Data," Biometrics, The International Biometric Society, vol. 67(3), pages 819-829, September.
    19. Karl, Andrew T. & Yang, Yan & Lohr, Sharon L., 2014. "Computation of maximum likelihood estimates for multiresponse generalized linear mixed models with non-nested, correlated random effects," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 146-162.
    20. Hofner, Benjamin & Mayr, Andreas & Schmid, Matthias, 2016. "gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 74(i01).
