
Simplex constrained sparse optimization via tail screening

Author

Listed:
  • Chen, Peng
  • Zhu, Jin
  • Zhu, Junxian
  • Wang, Xueqin

Abstract

We consider the probabilistic simplex-constrained sparse recovery problem. The Lasso-type penalty commonly used to promote sparsity is ineffective in this context, since it is constant on the simplex. Fortunately, the simplex constraint itself brings a self-regularization property: the empirical risk minimizer, without any sparsity-promoting procedure, attains the usual Lasso-type estimation error. Moreover, we analyze the iterates of a projected gradient descent method and show that they converge to the sparse ground truth at a geometric rate until a satisfactory statistical precision is attained. Although the estimation error is statistically optimal, the resulting solution is typically denser than the sparse ground truth. To further sparsify the iterates, we propose a method called PERMITS, which embeds a tail screening procedure, i.e., identifying negligible components and discarding them during the iterations, into the projected gradient descent method. Furthermore, we combine tail screening with a special information criterion to balance the trade-off between fit and complexity. Theoretically, the proposed PERMITS method exactly recovers the ground truth support set under mild conditions and thus enjoys the oracle property. We demonstrate the statistical and computational efficiency of PERMITS on both synthetic and real data. An implementation of the proposed method is available at https://github.com/abess-team/PERMITS.
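
As a concrete illustration of the mechanism described above, the following is a minimal Python sketch, not the authors' implementation (see the repository above for that), of projected gradient descent on the probability simplex with a tail-screening step. It assumes a quadratic loss 0.5*||Ax - b||^2, and the function name permits_sketch, the threshold tau, the step size, and the iteration count are illustrative placeholders rather than the information-criterion-driven choices used by PERMITS.

    import numpy as np

    def project_simplex(v):
        # Euclidean projection of v onto the probability simplex
        # {x : x >= 0, sum(x) = 1}, via the standard sort-based algorithm.
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1.0)[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def permits_sketch(A, b, step=None, tau=1e-3, n_iter=200):
        # Illustrative projected gradient descent with tail screening for
        # min_x 0.5*||Ax - b||^2 subject to x in the probability simplex.
        # tau, step, and n_iter are hypothetical tuning choices, not the
        # paper's information-criterion rule.
        n, p = A.shape
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L for the quadratic loss
        x = np.full(p, 1.0 / p)                       # start at the simplex centre
        active = np.arange(p)                         # indices not yet screened out
        for _ in range(n_iter):
            grad = A[:, active].T @ (A[:, active] @ x[active] - b)
            x[active] = project_simplex(x[active] - step * grad)
            keep = x[active] > tau                    # tail screening: flag tiny coordinates
            if not keep.all():
                x[active[~keep]] = 0.0                # discard negligible components
                active = active[keep]
                if active.size == 0:
                    break
                x[active] = project_simplex(x[active])  # renormalise the survivors
        return x

For example, x_hat = permits_sketch(A, b) returns a point on the simplex whose small coordinates have been screened to exactly zero; in the paper, the screening and the fit-complexity trade-off are instead governed by the special information criterion.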

Suggested Citation

  • Chen, Peng & Zhu, Jin & Zhu, Junxian & Wang, Xueqin, 2025. "Simplex constrained sparse optimization via tail screening," LSE Research Online Documents on Economics 129540, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:129540

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/129540/
    File Function: Open access version.
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    2. Luu, Tung Duy & Fadili, Jalal & Chesneau, Christophe, 2019. "PAC-Bayesian risk bounds for group-analysis sparse regression by exponential weighting," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 209-233.
    3. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    4. Christian Kanzow & Theresa Lechner, 2021. "Globalized inexact proximal Newton-type methods for nonconvex composite functions," Computational Optimization and Applications, Springer, vol. 78(2), pages 377-410, March.
    5. Gerhard Tutz & Jan Gertheiss, 2014. "Rating Scales as Predictors—The Old Question of Scale Level and Some Answers," Psychometrika, Springer;The Psychometric Society, vol. 79(3), pages 357-376, July.
    6. Xiaoping Liu & Xiao-Bai Li & Sumit Sarkar, 2023. "Cost-Restricted Feature Selection for Data Acquisition," Management Science, INFORMS, vol. 69(7), pages 3976-3992, July.
    7. Minh Pham & Xiaodong Lin & Andrzej Ruszczyński & Yu Du, 2021. "An outer–inner linearization method for non-convex and nondifferentiable composite regularization problems," Journal of Global Optimization, Springer, vol. 81(1), pages 179-202, September.
    8. Bang, Sungwan & Jhun, Myoungshic, 2012. "Simultaneous estimation and factor selection in quantile regression via adaptive sup-norm regularization," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 813-826.
    9. Yang, Hu & Yi, Danhui, 2015. "Studies of the adaptive network-constrained linear regression and its application," Computational Statistics & Data Analysis, Elsevier, vol. 92(C), pages 40-52.
    10. Baiguo An & Beibei Zhang, 2020. "Logistic regression with image covariates via the combination of L1 and Sobolev regularizations," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-18, June.
    11. Chen, Shunjie & Yang, Sijia & Wang, Pei & Xue, Liugen, 2023. "Two-stage penalized algorithms via integrating prior information improve gene selection from omics data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 628(C).
12. Jian Huang & Yuling Jiao & Lican Kang & Jin Liu & Yanyan Liu & Xiliang Lu, 2022. "GSDAR: a fast Newton algorithm for ℓ0 regularized generalized linear models with statistical guarantee," Computational Statistics, Springer, vol. 37(1), pages 507-533, March.
    13. Hess, Wolfgang & Persson, Maria & Rubenbauer, Stephanie & Gertheiss, Jan, 2013. "Using Lasso-Type Penalties to Model Time-Varying Covariate Effects in Panel Data Regressions - A Novel Approach Illustrated by the 'Death of Distance' in International Trade," Working Papers 2013:5, Lund University, Department of Economics.
    14. Zhao, Xun & Tang, Lu & Zhang, Weijia & Zhou, Ling, 2025. "Subgroup learning for multiple mixed-type outcomes with block-structured covariates," Computational Statistics & Data Analysis, Elsevier, vol. 204(C).
    15. Moindjié, Issam-Ali & Preda, Cristian & Dabo-Niang, Sophie, 2025. "Fusion regression methods with repeated functional data," Computational Statistics & Data Analysis, Elsevier, vol. 203(C).
    16. Samuel Vaiter & Charles Deledalle & Jalal Fadili & Gabriel Peyré & Charles Dossal, 2017. "The degrees of freedom of partly smooth regularizers," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(4), pages 791-832, August.
    17. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    18. Xiaoya Zhang & Wei Peng & Hui Zhang, 2022. "Inertial proximal incremental aggregated gradient method with linear convergence guarantees," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 96(2), pages 187-213, October.
    19. Yang, Yuan & McMahan, Christopher S. & Wang, Yu-Bo & Ouyang, Yuyuan, 2024. "Estimation of l0 norm penalized models: A statistical treatment," Computational Statistics & Data Analysis, Elsevier, vol. 192(C).
    20. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.

    More about this item

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
