Printed from https://ideas.repec.org/a/spr/metrik/v88y2025i5d10.1007_s00184-024-00975-z.html

Sparsified simultaneous confidence intervals for high-dimensional linear models

Author

Listed:
  • Xiaorui Zhu

    (Towson University)

  • Yichen Qin

    (University of Cincinnati)

  • Peng Wang

    (University of Cincinnati)

Abstract

Statistical inference of the high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to account for. Currently, the inference of the model and the inference of the coefficients are separately sought. A critical question remains unsettled: is it possible to embed the inference of the model into the simultaneous inference of the coefficients? If so, how should a simultaneous inference tool with the desired properties be designed? To this end, we propose a notion of simultaneous confidence intervals called the sparsified simultaneous confidence intervals (SSCI). Our intervals are sparse in the sense that some of the intervals’ upper and lower bounds are shrunken to zero (i.e., [0, 0]), indicating the unimportance of the corresponding covariates. These covariates should be excluded from the final model. The rest of the intervals, either containing zero (e.g., [-1, 1] or [0, 1]) or not containing zero (e.g., [2, 3]), indicate the plausible and significant covariates, respectively. The SSCI intuitively suggests a lower-bound model with significant covariates only and an upper-bound model with plausible and significant covariates. The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty. For the proposed method, we establish desirable asymptotic properties, develop intuitive graphical tools for visualization, and justify its superior performance through simulation and real data analysis.
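
The reading rule described in the abstract can be illustrated with a minimal Python sketch. It only interprets intervals that are already given; it does not construct SSCIs, and it is not the authors' implementation. The function names (classify_interval, bound_models), the covariate names x1 to x4, and the exact-zero comparison are assumptions made for illustration. The example intervals are the ones quoted in the abstract.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def classify_interval(lower: float, upper: float) -> str:
    """Label one interval following the abstract's description.

    Hypothetical helper: the labels mirror the abstract's wording, but the
    exact comparison with 0.0 is an assumption of this sketch.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower == 0.0 and upper == 0.0:
        return "unimportant"   # interval shrunken to [0, 0]
    if lower <= 0.0 <= upper:
        return "plausible"     # contains zero, but is not [0, 0]
    return "significant"       # excludes zero


def bound_models(intervals: Dict[str, Interval]) -> Tuple[List[str], List[str]]:
    """Return (lower-bound model, upper-bound model) as lists of covariate names."""
    labels = {name: classify_interval(lo, hi) for name, (lo, hi) in intervals.items()}
    lower_model = [n for n, lab in labels.items() if lab == "significant"]
    upper_model = [n for n, lab in labels.items() if lab in ("plausible", "significant")]
    return lower_model, upper_model


# Intervals taken from the abstract's examples; covariate names are made up.
example = {
    "x1": (0.0, 0.0),
    "x2": (-1.0, 1.0),
    "x3": (0.0, 1.0),
    "x4": (2.0, 3.0),
}

lower_model, upper_model = bound_models(example)
print(lower_model)  # ['x4']
print(upper_model)  # ['x2', 'x3', 'x4']
```

In this toy case the lower-bound model keeps only x4, while the upper-bound model adds the plausible covariates x2 and x3; x1 is excluded from both.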

Suggested Citation

  • Xiaorui Zhu & Yichen Qin & Peng Wang, 2025. "Sparsified simultaneous confidence intervals for high-dimensional linear models," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 88(5), pages 709-733, July.
  • Handle: RePEc:spr:metrik:v:88:y:2025:i:5:d:10.1007_s00184-024-00975-z
    DOI: 10.1007/s00184-024-00975-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-024-00975-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00184-024-00975-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Panxu Yuan & Xiao Guo, 2022. "High-dimensional inference for linear model with correlated errors," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(1), pages 21-52, January.
    2. Yang Li & Yuetian Luo & Davide Ferrari & Xiaonan Hu & Yichen Qin, 2019. "Rejoinder to Discussions on: Model confidence bounds for variable selection," Biometrics, The International Biometric Society, vol. 75(2), pages 411-413, June.
    3. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    4. Calcagno, Vincent & de Mazancourt, Claire, 2010. "glmulti: An R Package for Easy Automated Model Selection with (Generalized) Linear Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 34(i12).
    5. Yue, Mu & Li, Jialiang & Cheng, Ming-Yen, 2019. "Two-step sparse boosting for high-dimensional longitudinal data with varying coefficients," Computational Statistics & Data Analysis, Elsevier, vol. 131(C), pages 222-234.
    6. Ruben Dezeure & Peter Bühlmann & Cun-Hui Zhang, 2017. "High-dimensional simultaneous inference with the bootstrap," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(4), pages 685-719, December.
    7. Zhang, Yiyun & Li, Runze & Tsai, Chih-Ling, 2010. "Regularization Parameter Selections via Generalized Information Criterion," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 312-323.
    8. Chatterjee, A. & Lahiri, S. N., 2011. "Bootstrapping Lasso Estimators," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 608-625.
    9. Xianyang Zhang & Guang Cheng, 2017. "Simultaneous Inference for High-Dimensional Linear Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 757-768, April.
    10. Peter R. Hansen & Asger Lunde & James M. Nason, 2011. "The Model Confidence Set," Econometrica, Econometric Society, vol. 79(2), pages 453-497, March.
    11. Rong Ma & T. Tony Cai & Hongzhe Li, 2021. "Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(534), pages 984-998, April.
    12. Mee Young Park & Trevor Hastie, 2007. "L1‐regularization path algorithm for generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 69(4), pages 659-677, September.
    13. Yang Li & Yuetian Luo & Davide Ferrari & Xiaonan Hu & Yichen Qin, 2019. "Model confidence bounds for variable selection," Biometrics, The International Biometric Society, vol. 75(2), pages 392-403, June.
    14. Ryan J. Tibshirani & Jonathan Taylor & Richard Lockhart & Robert Tibshirani, 2016. "Exact Post-Selection Inference for Sequential Regression Procedures," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 600-620, April.
    15. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    16. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaorui Zhu & Yichen Qin & Peng Wang, 2023. "Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models," Papers 2307.07574, arXiv.org, revised Jan 2025.
    2. Horowitz, Joel L. & Rafi, Ahnaf, 2025. "Bootstrap based asymptotic refinements for high-dimensional nonlinear models," Journal of Econometrics, Elsevier, vol. 249(PB).
    3. T. Tony Cai & Zijian Guo & Yin Xia, 2023. "Statistical inference and large-scale multiple testing for high-dimensional regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1135-1171, December.
    4. Qin, Yichen & Wang, Linna & Li, Yang & Li, Rong, 2023. "Visualization and assessment of model selection uncertainty," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    5. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    6. Luo, Shikai & Yang, Ying & Shi, Chengchun & Yao, Fang & Ye, Jieping & Zhu, Hongtu, 2024. "Policy evaluation for temporal and/or spatial dependent experiments," LSE Research Online Documents on Economics 122741, London School of Economics and Political Science, LSE Library.
    7. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    8. Claude Renaux & Laura Buzdugan & Markus Kalisch & Peter Bühlmann, 2020. "Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 1-40, March.
    9. Tanin Sirimongkolkasem & Reza Drikvandi, 2019. "On Regularisation Methods for Analysis of High Dimensional Data," Annals of Data Science, Springer, vol. 6(4), pages 737-763, December.
    10. Hongwei Shi & Weichao Yang & Bowen Sun & Xu Guo, 2025. "Tests for high-dimensional partially linear regression models," Statistical Papers, Springer, vol. 66(3), pages 1-23, April.
    11. Lu Xia & Bin Nan & Yi Li, 2023. "Debiased lasso for generalized linear models with a diverging number of covariates," Biometrics, The International Biometric Society, vol. 79(1), pages 344-357, March.
    12. Faguang Wen & Jiming Jiang & Yihui Luan, 2024. "Model Selection Path and Construction of Model Confidence Set under High-Dimensional Variables," Mathematics, MDPI, vol. 12(5), pages 1-21, February.
    13. Li, Xiang & Li, Yu-Ning & Zhang, Li-Xin & Zhao, Jun, 2024. "Inference for high-dimensional linear expectile regression with de-biasing method," Computational Statistics & Data Analysis, Elsevier, vol. 198(C).
    14. Panxu Yuan & Yinfei Kong & Gaorong Li, 2024. "FDR control and power analysis for high-dimensional logistic regression via StabKoff," Statistical Papers, Springer, vol. 65(5), pages 2719-2749, July.
    15. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
    17. Camila Epprecht & Dominique Guegan & Álvaro Veiga & Joel Correa da Rosa, 2017. "Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics," Post-Print halshs-00917797, HAL.
    18. Toshio Honda, 2021. "The de-biased group Lasso estimation for varying coefficient models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(1), pages 3-29, February.
    19. Bartosz Uniejewski, 2024. "Regularization for electricity price forecasting," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 34(3), pages 267-286.
    20. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.