Generalized Linear Models with Structured Sparsity Estimators

Author

Listed:
  • Mehmet Caner

Abstract

In this paper, we introduce structured sparsity estimators in Generalized Linear Models. Stucky and van de Geer (2018) recently introduced structured sparsity estimators for the least squares loss under fixed design and normal errors. We extend their results to debiased structured sparsity estimators with a Generalized Linear Model based loss. Structured sparsity estimation means penalized loss functions in which the chosen norm may encode a sparsity structure. Examples include the weighted group lasso, the lasso, and norms generated from convex cones. The main difficulty is that it is not clear how to prove the two oracle inequalities that are required. The first is for the initial penalized Generalized Linear Model estimator. Since it is not clear how a particular feasible weighted nodewise regression may fit in an oracle inequality for the penalized Generalized Linear Model, we need a second oracle inequality to obtain oracle bounds for the approximate inverse of the sample estimate of the second-order partial derivative of the Generalized Linear Model loss. Our contributions are fivefold:

1. We generalize the existing oracle inequality results in penalized Generalized Linear Models by proving the underlying conditions rather than assuming them. One of the key issues is the proof of a sample one-point margin condition and its use in an oracle inequality.
2. Our results cover even non-sub-Gaussian errors and regressors.
3. We provide a feasible weighted nodewise regression proof which generalizes the results in the literature from the simple ℓ1 norm to norms generated from convex cones.
4. We find that the norms used in feasible nodewise regression proofs should be weaker than or equal to the norms in the penalized Generalized Linear Model loss.
5. We can debias the first-step estimator by obtaining an approximate inverse of the singular sample second-order partial derivative of the Generalized Linear Model loss.
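As a rough illustration of two objects named in the abstract, the sketch below computes a weighted group lasso norm and applies a generic one-step debiasing correction for a logistic loss. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names, the square-root-of-group-size default weights, the choice of the logistic model, and the argument `theta_hat` (standing in for an approximate inverse of the sample second-order derivative, which the paper obtains by nodewise regression) are all hypothetical choices made here for illustration.

```python
import numpy as np


def weighted_group_lasso_norm(beta, groups, weights=None):
    """Sum over groups g of w_g * ||beta_g||_2.

    With singleton groups and unit weights this reduces to the l1 norm.
    """
    beta = np.asarray(beta, dtype=float)
    if weights is None:
        # Assumption: square root of group size, a common default weighting.
        weights = [np.sqrt(len(g)) for g in groups]
    return float(
        sum(w * np.linalg.norm(beta[list(g)]) for g, w in zip(groups, weights))
    )


def logistic_gradient(X, y, beta):
    """Gradient of the average logistic negative log-likelihood at beta."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (p - y) / X.shape[0]


def debias(beta_hat, theta_hat, X, y):
    """Generic one-step correction: beta_hat - theta_hat @ gradient.

    theta_hat is assumed to be supplied by the caller; this sketch does not
    implement the nodewise regression that would produce it.
    """
    return beta_hat - theta_hat @ logistic_gradient(X, y, beta_hat)
```

For example, `weighted_group_lasso_norm([3.0, 4.0, 0.0], [[0, 1], [2]], weights=[1.0, 1.0])` evaluates the Euclidean norm of the first group plus that of the second, which gives 5.0, while the same vector with three singleton groups and unit weights gives the ℓ1 norm, 7.0.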

Suggested Citation

  • Mehmet Caner, 2021. "Generalized Linear Models with Structured Sparsity Estimators," Papers 2104.14371, arXiv.org.
  • Handle: RePEc:arx:papers:2104.14371
    Download full text from publisher

    File URL: http://arxiv.org/pdf/2104.14371
    File Function: Latest version
    Download Restriction: no
    References listed on IDEAS

    1. Lukas Meier & Sara Van De Geer & Peter Bühlmann, 2008. "The group lasso for logistic regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(1), pages 53-71, February.
    2. Caner, Mehmet & Kock, Anders Bredahl, 2018. "Asymptotically honest confidence regions for high dimensional parameters by the desparsified conservative Lasso," Journal of Econometrics, Elsevier, vol. 203(1), pages 143-168.
    3. Alexandre Belloni & Christian Hansen & Whitney Newey, 2017. "Simultaneous Confidence Intervals for High-dimensional Linear Models with Many Endogenous Variables," Papers 1712.08102, arXiv.org, revised Aug 2019.
    4. Shi, Chengchun & Song, Rui & Chen, Zhao & Li, Runze, 2019. "Linear hypothesis testing for high dimensional generalized linear models," LSE Research Online Documents on Economics 102108, London School of Economics and Political Science, LSE Library.
    5. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    6. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    7. Kock, Anders Bredahl, 2016. "Oracle inequalities, variable selection and uniform inference in high-dimensional correlated random effects panel data models," Journal of Econometrics, Elsevier, vol. 195(1), pages 71-85.
    8. Sara van de Geer, 2014. "Weakly decomposable regularization penalties and structured sparsity," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(1), pages 72-86, March.
    9. Chiang, Harold D. & Sasaki, Yuya, 2019. "Causal inference by quantile regression kink designs," Journal of Econometrics, Elsevier, vol. 210(2), pages 405-433.
    10. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.
    11. Cai, Tony & Liu, Weidong & Luo, Xi, 2011. "A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 594-607.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mehmet Caner & Kfir Eliaz, 2021. "Should Humans Lie to Machines: The Incentive Compatibility of Lasso and General Weighted Lasso," Papers 2101.01144, arXiv.org, revised Sep 2021.
    2. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Guo, Xu & Li, Runze & Liu, Jingyuan & Zeng, Mudong, 2023. "Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic," Journal of Econometrics, Elsevier, vol. 235(1), pages 166-179.
    4. Liu, Jianyu & Yu, Guan & Liu, Yufeng, 2019. "Graph-based sparse linear discriminant analysis for high-dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 250-269.
    5. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    6. Khai X. Chiong & Hyungsik Roger Moon, 2017. "Estimation of Graphical Models using the $L_{1,2}$ Norm," Papers 1709.10038, arXiv.org, revised Oct 2017.
    7. Victor Chernozhukov & Chen Huang & Weining Wang, 2021. "Uniform Inference on High-dimensional Spatial Panel Networks," Papers 2105.07424, arXiv.org, revised Sep 2023.
    8. Kaspar Wuthrich & Ying Zhu, 2019. "Omitted variable bias of Lasso-based inference methods: A finite sample analysis," Papers 1903.08704, arXiv.org, revised Sep 2021.
    9. Alexandre Belloni & Mingli Chen & Victor Chernozhukov, 2016. "Quantile Graphical Models: Prediction and Conditional Independence with Applications to Systemic Risk," Papers 1607.00286, arXiv.org, revised Oct 2019.
    10. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    11. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    13. Ye, Ya-Fen & Shao, Yuan-Hai & Deng, Nai-Yang & Li, Chun-Na & Hua, Xiang-Yu, 2017. "Robust Lp-norm least squares support vector regression with feature selection," Applied Mathematics and Computation, Elsevier, vol. 305(C), pages 32-52.
    14. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    15. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    16. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Zhengyuan Zhou & Susan Athey & Stefan Wager, 2023. "Offline Multi-Action Policy Learning: Generalization and Optimization," Operations Research, INFORMS, vol. 71(1), pages 148-183, January.
    18. Toshio Honda, 2021. "The de-biased group Lasso estimation for varying coefficient models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(1), pages 3-29, February.
    19. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    20. Bilin Zeng & Xuerong Meggie Wen & Lixing Zhu, 2017. "A link-free sparse group variable selection method for single-index model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(13), pages 2388-2400, October.

    More about this item

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis