Projection predictive variable selection for discrete response families with finite support

Authors

  • Frank Weber (Rostock University Medical Center)
  • Änne Glass (Rostock University Medical Center)
  • Aki Vehtari (Aalto University)

Abstract

The projection predictive variable selection is a decision-theoretically justified Bayesian variable selection approach achieving an outstanding trade-off between predictive performance and sparsity. Its projection problem is not easy to solve in general because it is based on the Kullback–Leibler divergence from a restricted posterior predictive distribution of the so-called reference model to the parameter-conditional predictive distribution of a candidate model. Previous work showed how this projection problem can be solved for response families employed in generalized linear models and how an approximate latent-space approach can be used for many other response families. Here, we present an exact projection method for all response families with discrete and finite support, called the augmented-data projection. A simulation study for an ordinal response family shows that the proposed method performs better than or similarly to the previously proposed approximate latent-space projection. The cost of the slightly better performance of the augmented-data projection is a substantial increase in runtime. Thus, if the augmented-data projection’s runtime is too high, we recommend the latent projection in the early phase of the model-building workflow and the augmented-data projection for final results. The ordinal response family from our simulation study is supported by both projection methods, but we also include a real-world cancer subtyping example with a nominal response family, a case that is not supported by the latent projection.
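As a rough illustration of the quantity the abstract refers to, the sketch below computes the Kullback–Leibler divergence from one predictive distribution to another when the response has discrete, finite support, so the divergence is a finite sum over categories. It is not the authors' augmented-data projection. The function name `kl_divergence` and the probability vectors are hypothetical values chosen only for this example.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two probability vectors over the same finite support.

    Illustrative helper (not taken from the paper): `p` plays the role of a
    reference model's predictive distribution, `q` that of a candidate model.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of categories")
    total = 0.0
    for p_k, q_k in zip(p, q):
        if p_k == 0.0:
            # Categories with zero reference probability contribute nothing.
            continue
        if q_k == 0.0:
            # The candidate rules out a category the reference allows.
            return math.inf
        total += p_k * math.log(p_k / q_k)
    return total


# Hypothetical predictive probabilities for a 4-category ordinal response.
reference = [0.1, 0.2, 0.4, 0.3]
candidate_a = [0.1, 0.25, 0.35, 0.3]    # close to the reference
candidate_b = [0.25, 0.25, 0.25, 0.25]  # uniform, far from the reference

kl_a = kl_divergence(reference, candidate_a)
kl_b = kl_divergence(reference, candidate_b)
```

Under these made-up numbers, `candidate_a` has the smaller divergence from the reference. A projection in this spirit would look for the candidate-model parameters that minimize such a divergence, averaged over observations.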

Suggested Citation

  • Frank Weber & Änne Glass & Aki Vehtari, 2025. "Projection predictive variable selection for discrete response families with finite support," Computational Statistics, Springer, vol. 40(2), pages 701-721, February.
  • Handle: RePEc:spr:compst:v:40:y:2025:i:2:d:10.1007_s00180-024-01506-0
    DOI: 10.1007/s00180-024-01506-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-024-01506-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Riccardo (Jack) Lucchetti & Luca Pedini, 2020. "ParMA: Parallelised Bayesian Model Averaging for Generalised Linear Models," Working Papers 2020:28, Department of Economics, University of Venice "Ca' Foscari".
    2. Virginia X. He & Matt P. Wand, 2024. "Bayesian generalized additive model selection including a fast variational option," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 108(3), pages 639-668, September.
    3. Xin Fang & Bo Fang & Chunfang Wang & Tian Xia & Matteo Bottai & Fang Fang & Yang Cao, 2019. "Comparison of Frequentist and Bayesian Generalized Additive Models for Assessing the Association between Daily Exposure to Fine Particles and Respiratory Mortality: A Simulation Study," IJERPH, MDPI, vol. 16(5), pages 1-20, March.
    4. Jorge Castillo-Mateo & Jesús Asín & Ana C. Cebrián & Jesús Mateo-Lázaro & Jesús Abaurrea, 2023. "Bayesian Variable Selection in Generalized Extreme Value Regression: Modeling Annual Maximum Temperature," Mathematics, MDPI, vol. 11(3), pages 1-19, February.
    5. Benjamin Heuclin & Frédéric Mortier & Catherine Trottier & Marie Denis, 2021. "Bayesian varying coefficient model with selection: An application to functional mapping," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(1), pages 24-50, January.
    6. Umlauf, Nikolaus & Adler, Daniel & Kneib, Thomas & Lang, Stefan & Zeileis, Achim, 2015. "Structured Additive Regression Models: An R Interface to BayesX," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 63(i21).
    7. Goldsmith, Jeff & Scheipl, Fabian, 2014. "Estimator selection and combination in scalar-on-function regression," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 362-372.
    8. Li He & Yu-Bo Wang & William C. Bridges & Zhulin He & S. Megan Che, 2023. "Bayesian Framework for Causal Inference with Principal Stratification and Clusters," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(1), pages 114-140, April.
    9. Yi Liu & Veronika Ročková & Yuexi Wang, 2021. "Variable selection with ABC Bayesian forests," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 453-481, July.
    10. Chase N. Joyner & Christopher S. McMahan & Joshua M. Tebbs & Christopher R. Bilder, 2020. "From mixed effects modeling to spike and slab variable selection: A Bayesian regression model for group testing data," Biometrics, The International Biometric Society, vol. 76(3), pages 913-923, September.
    11. T. Rajala & D. J. Murrell & S. C. Olhede, 2018. "Detecting multivariate interactions in spatial point patterns with Gibbs models and variable selection," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 67(5), pages 1237-1273, November.
    12. Rachel Carroll & Andrew B. Lawson & Delia Voronca & Chawarat Rotejanaprasert & John E. Vena & Claire Marjorie Aelion & Diane L. Kamen, 2014. "Spatial Environmental Modeling of Autoantibody Outcomes among an African American Population," IJERPH, MDPI, vol. 11(3), pages 1-16, March.
