Printed from https://ideas.repec.org/a/eee/ecosta/v34y2025icp14-31.html

A cluster plugin method for selecting the GLM lasso tuning parameters in models for unbalanced panel data

Author

Listed:
  • Drukker, David M.
  • Liu, Di

Abstract

New methods are discussed for estimating the population-averaged (PA) coefficients of some variables of interest in a sparse, high-dimensional generalized linear model (GLM) in an unbalanced panel-data setting with random effects. A cluster plugin for GLM lassos with unbalanced panel data is proposed. It is proven that a lasso that uses the new cluster-plugin method can outperform a lasso using the cross-sectional plugin, when the covariates have non-zero within-panel covariances. The proposed cluster plugin for GLMs extends the literature on Neyman-orthogonal moment conditions and provides estimators for the PA coefficients in sparse, high-dimensional logit, Poisson, and linear models when the data come from unbalanced panels. The results of the Monte Carlo simulations show that the implemented estimators perform well in finite samples. The simulations also show that a GLM lasso using the proposed cluster-plugin method produces more accurate covariate selection than a GLM lasso using the cross-sectional plugin method, when the covariates have non-zero within-panel covariances. Easy-to-use Stata commands are available for the proposed methods.
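
As a rough illustration of why the two plugins can differ, the sketch below compares penalty loadings built from per-observation scores with loadings built from within-panel score sums. It is not the authors' estimator or their Stata implementation: the function name `plugin_loadings`, the toy data, and the scaling by the total number of observations are assumptions made for illustration, loosely following the clustered-loadings idea from the high-dimensional panel literature.

```python
import numpy as np


def plugin_loadings(X, resid, panel_id):
    """Return (cross_sectional, cluster) penalty loadings per covariate.

    Hypothetical helper for illustration only; not the paper's formula.
    X        : (n, p) covariate matrix
    resid    : (n,) residuals from a preliminary fit
    panel_id : (n,) panel identifiers (panels may have unequal sizes)
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    n, p = X.shape

    # Per-observation score contributions x_ij * e_i.
    scores = X * resid[:, None]

    # Cross-sectional loading: treats every observation as independent.
    cross = np.sqrt((scores ** 2).sum(axis=0) / n)

    # Cluster loading: sum scores within each panel before squaring,
    # so within-panel covariance of the scores is retained.
    _, inv = np.unique(np.asarray(panel_id), return_inverse=True)
    inv = inv.ravel()
    panel_sums = np.zeros((inv.max() + 1, p))
    np.add.at(panel_sums, inv, scores)
    cluster = np.sqrt((panel_sums ** 2).sum(axis=0) / n)

    return cross, cluster


# Toy unbalanced panel: three panels with 3, 2, and 1 observations.
ids = np.array([0, 0, 0, 1, 1, 2])
X = np.ones((6, 1))      # covariate perfectly correlated within panel
resid = np.ones(6)       # residuals perfectly correlated within panel

cross, cluster = plugin_loadings(X, resid, ids)
print("cross-sectional loading:", cross)
print("cluster loading:        ", cluster)
```

In this toy case the scores are constant within each panel, so the cluster loading is larger than the cross-sectional one. If every panel held a single observation, the two loadings would coincide.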

Suggested Citation

  • Drukker, David M. & Liu, Di, 2025. "A cluster plugin method for selecting the GLM lasso tuning parameters in models for unbalanced panel data," Econometrics and Statistics, Elsevier, vol. 34(C), pages 14-31.
  • Handle: RePEc:eee:ecosta:v:34:y:2025:i:c:p:14-31
    DOI: 10.1016/j.ecosta.2022.02.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S2452306222000132
    Download Restriction: Full text for ScienceDirect subscribers only. Contains open access articles

    File URL: https://libkey.io/10.1016/j.ecosta.2022.02.006?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Alexandre Belloni & Victor Chernozhukov & Ying Wei, 2016. "Post-Selection Inference for Generalized Linear Models With Many Controls," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 606-619, October.
    2. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics, Annual Reviews, vol. 7(1), pages 649-688, August.
    3. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Sparse estimators and the oracle property, or the return of Hodges' estimator," Journal of Econometrics, Elsevier, vol. 142(1), pages 201-211, January.
    4. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    6. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    7. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    8. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(2), pages 338-376, April.
    9. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    10. Pötscher, Benedikt M. & Leeb, Hannes, 2009. "On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 2065-2082, October.
    11. Gary Chamberlain, 1980. "Analysis of Covariance with Qualitative Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 47(1), pages 225-238.
    12. Moulton, Brent R., 1986. "Random group effects and the precision of regression estimates," Journal of Econometrics, Elsevier, vol. 32(3), pages 385-397, August.
    13. Moulton, Brent R, 1990. "An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Unit," The Review of Economics and Statistics, MIT Press, vol. 72(2), pages 334-338, May.
    14. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    2. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-Dimensional Econometrics and Regularized GMM," Papers 1806.01888, arXiv.org, revised Jun 2018.
    3. Su, Liangjun & Ura, Takuya & Zhang, Yichong, 2019. "Non-separable models with high-dimensional data," Journal of Econometrics, Elsevier, vol. 212(2), pages 646-677.
    4. Kaspar Wuthrich & Ying Zhu, 2019. "Omitted variable bias of Lasso-based inference methods: A finite sample analysis," Papers 1903.08704, arXiv.org, revised Sep 2021.
    5. Adamek, Robert & Smeekes, Stephan & Wilms, Ines, 2023. "Lasso inference for high-dimensional time series," Journal of Econometrics, Elsevier, vol. 235(2), pages 1114-1143.
    6. Harold D. Chiang, 2018. "Many Average Partial Effects: with An Application to Text Regression," Papers 1812.09397, arXiv.org, revised Jan 2022.
    7. Masayuki Hirukawa & Di Liu & Irina Murtazashvili & Artem Prokhorov, 2024. "DS-HECK: double-lasso estimation of Heckman selection model," Advanced Studies in Theoretical and Applied Econometrics, in: Subal C. Kumbhakar & Robin C. Sickles & Hung-Jen Wang (ed.), Advances in Applied Econometrics, pages 711-739, Springer.
    8. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    9. Neng-Chieh Chang, 2020. "The Mode Treatment Effect," Papers 2007.11606, arXiv.org.
    10. Caballero, Julián & Upper, Christian, 2026. "What happens to emerging market economies when US yields go up?," Journal of International Money and Finance, Elsevier, vol. 160(C).
    11. Helmut Wasserbacher & Martin Spindler, 2022. "Machine learning for financial forecasting, planning and analysis: recent developments and pitfalls," Digital Finance, Springer, vol. 4(1), pages 63-88, March.
    12. Helmut Wasserbacher & Martin Spindler, 2021. "Machine Learning for Financial Forecasting, Planning and Analysis: Recent Developments and Pitfalls," Papers 2107.04851, arXiv.org.
    13. Kaicheng Chen, 2025. "Inference in High-Dimensional Panel Models: Two-Way Dependence and Unobserved Heterogeneity," Papers 2504.18772, arXiv.org, revised Dec 2025.
    14. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    15. Aristide Houndetoungan & Abdoul Haki Maoude, 2024. "Inference for Two-Stage Extremum Estimators," Papers 2402.05030, arXiv.org, revised Nov 2024.
    16. Achim Ahrens & Alessandra Stampi‐Bombelli & Selina Kurer & Dominik Hangartner, 2024. "Optimal multi‐action treatment allocation: A two‐phase field experiment to boost immigrant naturalization," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(7), pages 1379-1395, November.
    17. Anders Bredahl Kock & Haihan Tang, 2014. "Inference in High-dimensional Dynamic Panel Data Models," CREATES Research Papers 2014-58, Department of Economics and Business Economics, Aarhus University.
    18. Neng-Chieh Chang, 2018. "Semiparametric Difference-in-Differences with Potentially Many Control Variables," Papers 1812.10846, arXiv.org, revised Jan 2019.
    19. Gozgor, Giray & Li, Jing & Saleem, Irfan & Shinwari, Riazullah, 2025. "The impact of women's political empowerment on renewable energy demand: Evidence from OECD countries," Energy Economics, Elsevier, vol. 141(C).
    20. Alexandre Belloni & Victor Chernozhukov & Ying Wei, 2016. "Post-Selection Inference for Generalized Linear Models With Many Controls," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 606-619, October.

