
Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions

Author

Listed:
  • Howard D. Bondell
  • Brian J. Reich

Abstract

For high-dimensional data, particularly when the number of predictors greatly exceeds the sample size, selection of relevant predictors for regression is a challenging problem. Methods such as sure screening, forward selection, or penalized regressions are commonly used. Bayesian variable selection methods place prior distributions on the parameters along with a prior over model space, or equivalently, a mixture prior on the parameters having mass at zero. Since exhaustive enumeration is not feasible, posterior model probabilities are often obtained via long Markov chain Monte Carlo (MCMC) runs. The chosen model can depend heavily on various choices for priors and also posterior thresholds. Alternatively, we propose a conjugate prior only on the full model parameters and use sparse solutions within posterior credible regions to perform selection. These posterior credible regions often have closed-form representations, and it is shown that these sparse solutions can be computed via existing algorithms. The approach is shown to outperform common methods in the high-dimensional setting, particularly under correlation. By searching for a sparse solution within a joint credible region, consistent model selection is established. Furthermore, it is shown that, under certain conditions, the use of marginal credible intervals can give consistent selection up to the case where the dimension grows exponentially in the sample size. The proposed approach successfully accomplishes variable selection in the high-dimensional setting, while avoiding pitfalls that plague typical Bayesian variable selection methods.
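To make the computational idea concrete, the following is a minimal sketch (not the authors' reference implementation) of selection via a penalized joint credible region. It assumes a conjugate normal prior beta ~ N(0, tau^2 I) in the linear model, so the posterior for the full coefficient vector is multivariate normal in closed form, and it assumes adaptive-lasso-style weights 1/beta_hat_j^2 for the weighted-L1 relaxation; the call to scikit-learn's lasso_path stands in for the "existing algorithms" referred to in the abstract.

```python
import numpy as np
from sklearn.linear_model import lasso_path

def credible_region_selection(X, y, tau2=10.0, sigma2=1.0, n_lambdas=50):
    """Sketch of variable selection via a penalized joint credible region.

    Assumes the conjugate prior beta ~ N(0, tau2 * I) in the linear model
    y = X beta + eps, eps ~ N(0, sigma2 * I), so the posterior for beta is
    N(beta_hat, Sigma) with ridge-type mean and covariance. The sparsest
    point in the elliptical region
        {beta : (beta - beta_hat)' Sigma^{-1} (beta - beta_hat) <= c}
    is approximated by a weighted-L1 relaxation, which a change of
    variables turns into a standard lasso problem.
    """
    n, p = X.shape
    # Posterior precision, covariance, and mean under the conjugate prior.
    prec = X.T @ X / sigma2 + np.eye(p) / tau2
    Sigma = np.linalg.inv(prec)
    beta_hat = Sigma @ X.T @ y / sigma2

    # Factor the precision so the quadratic form becomes a least-squares fit:
    # (beta - beta_hat)' prec (beta - beta_hat) = ||D beta_hat - D beta||^2.
    D = np.linalg.cholesky(prec).T

    # Adaptive-lasso-style weights (an assumed choice): w_j = 1 / beta_hat_j^2.
    w = 1.0 / (beta_hat ** 2 + 1e-12)

    # Substituting b_j = w_j * beta_j turns the weighted-L1 problem into a
    # plain lasso with response D beta_hat and design with column j of D
    # scaled by 1 / w_j.
    y_star = D @ beta_hat
    X_star = D / w

    alphas, coefs, _ = lasso_path(X_star, y_star, n_alphas=n_lambdas)
    betas = coefs / w[:, None]   # map path solutions back to the beta scale

    # Each column of `betas` is a candidate model; larger alpha gives a
    # sparser solution.
    return alphas, betas
```

In practice one would then keep the sparsest solution along the path whose quadratic form (beta - beta_hat)' Sigma^{-1} (beta - beta_hat) still falls below the cutoff defining the chosen credible level (for example, a chi-square quantile); that final screening step, and the choice of prior hyperparameters, are omitted from this sketch.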

Suggested Citation

  • Howard D. Bondell & Brian J. Reich, 2012. "Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1610-1624, December.
  • Handle: RePEc:taf:jnlasa:v:107:y:2012:i:500:p:1610-1624
    DOI: 10.1080/01621459.2012.716344

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2012.716344
    Download Restriction: Access to full text is restricted to subscribers.

    File URL: https://libkey.io/10.1080/01621459.2012.716344?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a version of this item that you can access through your library subscription.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Gareth M. James & Peter Radchenko & Jinchi Lv, 2009. "DASSO: connections between the Dantzig selector and lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(1), pages 127-142, January.
    3. Fan, Jianqing & Li, Runze, 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    4. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    5. P. J. Brown & M. Vannucci & T. Fearn, 2002. "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(3), pages 519-536, August.
    6. Satkartar K. Kinney & David B. Dunson, 2007. "Fixed and Random Effects Selection in Linear and Logistic Models," Biometrics, The International Biometric Society, vol. 63(3), pages 690-698, September.
    7. Dunson, David B. & Herring, Amy H. & Engel, Stephanie M., 2008. "Bayesian Selection and Clustering of Polymorphisms in Functionally Related Genes," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 534-546, June.
    8. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    9. Liang, Feng & Paulo, Rui & Molina, German & Clyde, Merlise A. & Berger, Jim O., 2008. "Mixtures of g Priors for Bayesian Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 410-423, March.
    10. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    11. Tadesse, Mahlet G. & Sha, Naijun & Vannucci, Marina, 2005. "Bayesian Variable Selection in Clustering High-Dimensional Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 602-617, June.
    12. Wang, Hansheng, 2009. "Forward Regression for Ultra-High Dimensional Variable Screening," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1512-1524.
    13. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    14. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.
    Cited by:

    1. Minerva Mukhopadhyay & Tapas Samanta, 2017. "A mixture of g-priors for variable selection when the number of regressors grows with the sample size," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(2), pages 377-404, June.
    2. Li, Hanning & Pati, Debdeep, 2017. "Variable selection using shrinkage priors," Computational Statistics & Data Analysis, Elsevier, vol. 107(C), pages 107-119.
    3. Tang, Niansheng & Yan, Xiaodong & Zhao, Puying, 2018. "Exponentially tilted likelihood inference on growing dimensional unconditional moment models," Journal of Econometrics, Elsevier, vol. 202(1), pages 57-74.
    4. Ander Wilson & Brian J. Reich, 2014. "Confounder selection via penalized credible regions," Biometrics, The International Biometric Society, vol. 70(4), pages 852-861, December.
    5. Bakerman, Jordan & Pazdernik, Karl & Korkmaz, Gizem & Wilson, Alyson G., 2022. "Dynamic logistic regression and variable selection: Forecasting and contextualizing civil unrest," International Journal of Forecasting, Elsevier, vol. 38(2), pages 648-661.
    6. Kyoungjae Lee & Xuan Cao, 2021. "Bayesian group selection in logistic regression with application to MRI data analysis," Biometrics, The International Biometric Society, vol. 77(2), pages 391-400, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    2. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    3. Christis Katsouris, 2023. "High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods," Papers 2308.16192, arXiv.org.
    4. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    5. Gerda Claeskens, 2012. "Focused estimation and model averaging with penalization methods: an overview," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 66(3), pages 272-287, August.
    6. Yen-Shiu Chin & Ting-Li Chen, 2016. "Minimizing variable selection criteria by Markov chain Monte Carlo," Computational Statistics, Springer, vol. 31(4), pages 1263-1286, December.
    7. Wei Sun & Lexin Li, 2012. "Multiple Loci Mapping via Model-free Variable Selection," Biometrics, The International Biometric Society, vol. 68(1), pages 12-22, March.
    8. Jiang, He & Luo, Shihua & Dong, Yao, 2021. "Simultaneous feature selection and clustering based on square root optimization," European Journal of Operational Research, Elsevier, vol. 289(1), pages 214-231.
    9. Xiangyu Wang & Chenlei Leng, 2016. "High dimensional ordinary least squares projection for screening variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(3), pages 589-611, June.
    10. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    11. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    12. Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. Zhang, Tonglin, 2024. "Variables selection using L0 penalty," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    14. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    15. Korobilis, Dimitris, 2013. "Hierarchical shrinkage priors for dynamic regressions with many predictors," International Journal of Forecasting, Elsevier, vol. 29(1), pages 43-59.
    16. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    17. Massimiliano Caporin & Francesco Poli, 2017. "Building News Measures from Textual Data and an Application to Volatility Forecasting," Econometrics, MDPI, vol. 5(3), pages 1-46, August.
    18. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    19. Justin B. Post & Howard D. Bondell, 2013. "Factor Selection and Structural Identification in the Interaction ANOVA Model," Biometrics, The International Biometric Society, vol. 69(1), pages 70-79, March.
    20. Jiang, Liewen & Bondell, Howard D. & Wang, Huixia Judy, 2014. "Interquantile shrinkage and variable selection in quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 69(C), pages 208-219.

