Efficient sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm

Author

McLain, Alexander C.
Zgodic, Anja
Bondell, Howard

Abstract

Bayesian variable selection methods are powerful techniques for fitting sparse high-dimensional linear regression models. However, many are computationally intensive or require restrictive prior distributions on model parameters. A computationally efficient and powerful Bayesian approach is presented for sparse high-dimensional linear regression, requiring only minimal prior assumptions on parameters through plug-in empirical Bayes estimates of hyperparameters. The method employs a Parameter-Expanded Expectation-Conditional-Maximization (PX-ECM) algorithm to estimate maximum a posteriori (MAP) values of parameters via computationally efficient coordinate-wise optimization. The popular two-group approach to multiple testing motivates the E-step, resulting in a PaRtitiOned empirical Bayes Ecm (PROBE) algorithm for sparse high-dimensional linear regression. Both one-at-a-time and all-at-once optimization can be used to complete PROBE. Extensive simulation studies and analyses of cancer cell drug responses are conducted to compare PROBE's empirical properties with those of related methods. Implementation is available through the R package probe.
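The two-group approach to multiple testing mentioned in the abstract can be illustrated with a small, generic sketch. It is not the PROBE E-step and does not use the probe package: it only shows how a two-component mixture turns a test statistic into a posterior probability of being non-null. The function names, the null proportion `pi0`, and the alternative standard deviation `slab_sd` are hypothetical values chosen for illustration.

```python
import math


def normal_pdf(z, sd=1.0):
    """Density at z of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def two_group_posterior(z, pi0=0.9, slab_sd=3.0):
    """Posterior probability that the statistic z comes from the non-null group.

    Assumed model (illustrative only):
        z ~ pi0 * N(0, 1) + (1 - pi0) * N(0, slab_sd^2)
    pi0 and slab_sd are hypothetical; an empirical Bayes method would
    estimate such quantities from the data instead of fixing them.
    """
    f0 = pi0 * normal_pdf(z, 1.0)               # weighted null density
    f1 = (1.0 - pi0) * normal_pdf(z, slab_sd)   # weighted non-null density
    return f1 / (f0 + f1)


if __name__ == "__main__":
    for z in (0.0, 2.0, 4.0):
        print(f"z = {z:.1f}: P(non-null | z) = {two_group_posterior(z):.3f}")
```

Under these assumed settings, larger absolute statistics receive higher posterior probabilities of being non-null, which is the kind of weight an E-step motivated by the two-group model could use.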

Suggested Citation

McLain, Alexander C. & Zgodic, Anja & Bondell, Howard, 2025. "Efficient sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 207(C).

Handle: RePEc:eee:csdana:v:207:y:2025:i:c:s0167947325000222
DOI: 10.1016/j.csda.2025.108146

References listed on IDEAS

Carlos M. Carvalho & Nicholas G. Polson & James G. Scott, 2010. "The horseshoe estimator for sparse signals," Biometrika, Biometrika Trust, vol. 97(2), pages 465-480.
Ravi Varadhan & Christophe Roland, 2008. "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 35(2), pages 335-353, June.
Veronika Ročková & Edward I. George, 2014. "EMVS: The EM Approach to Bayesian Variable Selection," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 828-846, June.
D. Oakes, 1999. "Direct calculation of the information matrix via the EM," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(2), pages 479-482, April.
Sun, Wenguang & Cai, T. Tony, 2007. "Oracle and Adaptive Compound Decision Rules for False Discovery Rate Control," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 901-912, September.
Howard D. Bondell & Brian J. Reich, 2012. "Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1610-1624, December.
D. Leventhal & A. S. Lewis, 2010. "Randomized Methods for Linear Constraints: Convergence Rates and Conditioning," Mathematics of Operations Research, INFORMS, vol. 35(3), pages 641-654, August.
Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
Veronika Ročková & Edward I. George, 2018. "The Spike-and-Slab LASSO," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 431-444, January.
Liang, Feng & Paulo, Rui & Molina, German & Clyde, Merlise A. & Berger, Jim O., 2008. "Mixtures of g Priors for Bayesian Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 410-423, March.
Jin, Jiashun & Cai, T. Tony, 2007. "Estimating the Null and the Proportion of Nonnull Effects in Large-Scale Multiple Comparisons," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 495-506, June.
Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
David M. Blei & Alp Kucukelbir & Jon D. McAuliffe, 2017. "Variational Inference: A Review for Statisticians," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 859-877, April.
Eddelbuettel, Dirk & Sanderson, Conrad, 2014. "RcppArmadillo: Accelerating R with high-performance C++ linear algebra," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 1054-1063.
Jordi Barretina & Giordano Caponigro & Nicolas Stransky & Kavitha Venkatesan & Adam A. Margolin & Sungjoon Kim & Christopher J. Wilson & Joseph Lehár & Gregory V. Kryukov & Dmitriy Sonkin & Anupama Red, 2012. "Addendum: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity," Nature, Nature, vol. 492(7428), pages 290-290, December.
M. Jamshidian & R. I. Jennrich, 2000. "Standard errors for EM estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(2), pages 257-270.
Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
Veronika Ročková, 2018. "Particle EM for Variable Selection," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(524), pages 1684-1697, October.
Jordi Barretina & Giordano Caponigro & Nicolas Stransky & Kavitha Venkatesan & Adam A. Margolin & Sungjoon Kim & Christopher J. Wilson & Joseph Lehár & Gregory V. Kryukov & Dmitriy Sonkin & Anupama Re, 2012. "The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity," Nature, Nature, vol. 483(7391), pages 603-607, March.
John D. Storey & Jonathan E. Taylor & David Siegmund, 2004. "Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(1), pages 187-205, February.
Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
Kolyan Ray & Botond Szabó, 2022. "Variational Bayes for High-Dimensional Linear Regression With Sparse Priors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(539), pages 1270-1281, September.
Kloek, T. & Lempers, F. B., "undated". "Posterior Probabilities Of Alternative Linear Models," Econometric Institute Archives 272033, Erasmus University Rotterdam.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Posch, Konstantin & Arbeiter, Maximilian & Pilz, Juergen, 2020. "A novel Bayesian approach for variable selection in linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
- Korobilis, Dimitris & Shimizu, Kenichi, 2021. "Bayesian Approaches to Shrinkage and Sparse Estimation," MPRA Paper 111631, University Library of Munich, Germany.
- Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Working Paper series 22-02, Rimini Centre for Economic Analysis.
- Dimitris Korobilis & Kenichi Shimizu, 2021. "Bayesian Approaches to Shrinkage and Sparse Estimation," Papers 2112.11751, arXiv.org.
- Dimitris Korobilis & Kenichi Shimizu, 2021. "Bayesian Approaches to Shrinkage and Sparse Estimation," Working Papers 2021_19, Business School - Economics, University of Glasgow.
Ander Wilson & Brian J. Reich, 2014. "Confounder selection via penalized credible regions," Biometrics, The International Biometric Society, vol. 70(4), pages 852-861, December.
Bernardi, Mauro & Costola, Michele, 2019. "High-dimensional sparse financial networks through a regularised regression model," SAFE Working Paper Series 244, Leibniz Institute for Financial Research SAFE.
Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
- Steel, Mark F. J., 2017. "Model Averaging and its Use in Economics," MPRA Paper 81568, University Library of Munich, Germany.
- Steel, Mark F. J., 2017. "Model Averaging and its Use in Economics," MPRA Paper 90110, University Library of Munich, Germany, revised 16 Nov 2018.
Qin, Shanshan & Zhang, Guanlin & Wu, Yuehua & Zhu, Zhongyi, 2025. "Bayesian grouping-Gibbs sampling estimation of high-dimensional linear model with non-sparsity," Computational Statistics & Data Analysis, Elsevier, vol. 203(C).
Tanin Sirimongkolkasem & Reza Drikvandi, 2019. "On Regularisation Methods for Analysis of High Dimensional Data," Annals of Data Science, Springer, vol. 6(4), pages 737-763, December.
van Erp, Sara & Oberski, Daniel L. & Mulder, Joris, 2018. "Shrinkage priors for Bayesian penalized regression," OSF Preprints cg8fq, Center for Open Science.
Matthew Pietrosanu & Jueyu Gao & Linglong Kong & Bei Jiang & Di Niu, 2021. "Advanced algorithms for penalized quantile and composite quantile regression," Computational Statistics, Springer, vol. 36(1), pages 333-346, March.
Jieun Lee & Gyuhyeong Goh, 2024. "A hybrid deterministic–deterministic approach for high-dimensional Bayesian variable selection with a default prior," Computational Statistics, Springer, vol. 39(3), pages 1659-1681, May.
Juanjuan Zhang & Weixian Wang & Mingming Yang & Maozai Tian, 2025. "Variational Bayesian Variable Selection in Logistic Regression Based on Spike-and-Slab Lasso," Mathematics, MDPI, vol. 13(13), pages 1-18, July.
Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
Michael Bergrab & Christian Aßmann, 2024. "Automated Bayesian variable selection methods for binary regression models with missing covariate data," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 18(2), pages 203-244, June.
Li, Hanning & Pati, Debdeep, 2017. "Variable selection using shrinkage priors," Computational Statistics & Data Analysis, Elsevier, vol. 107(C), pages 107-119.
Sierra A. Bainter & Thomas G. McCauley & Mahmoud M. Fahmy & Zachary T. Goodman & Lauren B. Kupis & J. Sunil Rao, 2023. "Comparing Bayesian Variable Selection to Lasso Approaches for Applications in Psychology," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 1032-1055, September.
Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
N. Neykov & P. Filzmoser & P. Neytchev, 2014. "Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator," Statistical Papers, Springer, vol. 55(1), pages 187-207, February.
- N. Neykov & P. Filzmoser & P. Neytchev, 2014. "Erratum to: Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator," Statistical Papers, Springer, vol. 55(3), pages 917-918, August.
Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).

