Communication-efficient estimation and inference for high-dimensional longitudinal data

Author

Listed:
  • Li, Xing
  • Peng, Yanjing
  • Wang, Lei

Abstract

With the rapid growth in modern science and technology, distributed longitudinal data have drawn attention in a wide range of aspects. Realizing that not all effects of covariates are our parameters of interest, we focus on the distributed estimation and statistical inference of a pre-conceived low-dimensional parameter in the high-dimensional longitudinal GLMs with canonical links. To mitigate the impact of high-dimensional nuisance parameters and incorporate the within-subject correlation simultaneously, a decorrelated quadratic inference function is proposed for enhancing the estimation efficiency. Two communication-efficient surrogate decorrelated score estimators based on multi-round iterative algorithms are proposed. The error bounds and limiting distribution of the proposed estimators are established and extensive numerical experiments demonstrate the effectiveness of our method. An application to the National Longitudinal Survey of Youth Dataset is also presented.
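
As a rough illustration of the communication-efficient surrogate idea mentioned in the abstract, the sketch below runs a multi-round surrogate-loss update in the style of Jordan, Lee and Yang (2019) on a toy scalar least-squares problem. It is not the authors' decorrelated quadratic inference function estimator: there are no high-dimensional nuisance parameters and no within-subject correlation. The function names (`local_gradient`, `surrogate_estimate`, `make_shards`) and the simulated data are assumptions made purely for illustration. In each round, every machine sends only its local gradient, and the first machine combines the averaged gradient with its own local curvature.

```python
import random


def local_gradient(xs, ys, beta):
    """Gradient at beta of the local loss (1 / 2n) * sum((y - x * beta) ** 2)."""
    return sum(x * (x * beta - y) for x, y in zip(xs, ys)) / len(xs)


def surrogate_estimate(shards, rounds=20):
    """Multi-round surrogate-loss updates for a scalar slope (toy setting).

    Assumes equally sized shards, so the plain average of the local
    gradients equals the gradient of the pooled loss.
    """
    xs1, ys1 = shards[0]
    # Curvature of the local loss on the first machine only.
    h1 = sum(x * x for x in xs1) / len(xs1)
    # Initial value: least-squares fit using the first machine's data alone.
    beta = sum(x * y for x, y in zip(xs1, ys1)) / sum(x * x for x in xs1)
    for _ in range(rounds):
        # One communication round: each machine reports a single number.
        g = sum(local_gradient(xs, ys, beta) for xs, ys in shards) / len(shards)
        # For a quadratic loss the surrogate minimizer has this closed form.
        beta -= g / h1
    return beta


def make_shards(num_shards=4, n=200, beta_true=1.5, seed=0):
    """Simulate hypothetical data split evenly across machines."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_shards):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [beta_true * x + rng.gauss(0.0, 0.5) for x in xs]
        shards.append((xs, ys))
    return shards


shards = make_shards()
beta_csl = surrogate_estimate(shards)

# Pooled least-squares fit, which needs all raw data in one place.
all_x = [x for xs, _ in shards for x in xs]
all_y = [y for _, ys in shards for y in ys]
beta_pooled = sum(x * y for x, y in zip(all_x, all_y)) / sum(x * x for x in all_x)
```

In this toy setting the iterates approach the pooled least-squares fit while only gradients, never raw observations, leave each machine. The estimators described in the abstract additionally handle nuisance parameters and within-subject correlation, which this sketch leaves out.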

Suggested Citation

  • Li, Xing & Peng, Yanjing & Wang, Lei, 2025. "Communication-efficient estimation and inference for high-dimensional longitudinal data," Computational Statistics & Data Analysis, Elsevier, vol. 208(C).
  • Handle: RePEc:eee:csdana:v:208:y:2025:i:c:s0167947325000301
    DOI: 10.1016/j.csda.2025.108154

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947325000301
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Suojin Wang & Lianfen Qian & Raymond J. Carroll, 2010. "Generalized empirical likelihood methods for analyzing longitudinal data," Biometrika, Biometrika Trust, vol. 97(1), pages 79-93.
    2. Wang, Kangning & Li, Shaomin & Zhang, Benle, 2021. "Robust communication-efficient distributed composite quantile regression and variable selection for massive data," Computational Statistics & Data Analysis, Elsevier, vol. 161(C).
    3. Xiao Ni & Daowen Zhang & Hao Helen Zhang, 2010. "Variable Selection for Semiparametric Mixed Models in Longitudinal Studies," Biometrics, The International Biometric Society, vol. 66(1), pages 79-88, March.
    4. Jianhui Zhou & Annie Qu, 2012. "Informative Estimation and Selection of Correlation Structure for Longitudinal Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 701-710, June.
    5. Lan Wang & Annie Qu, 2009. "Consistent model selection and data‐driven smooth tests for longitudinal data in the estimating equations approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(1), pages 177-190, January.
    6. Xue, Lan & Qu, Annie & Zhou, Jianhui, 2010. "Consistent Model Selection for Marginal Generalized Additive Model for Correlated Data," Journal of the American Statistical Association, American Statistical Association, vol. 105(492), pages 1518-1530.
    7. Li, Mengyan & Li, Runze & Ma, Yanyuan, 2021. "Inference in high dimensional linear measurement error models," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    8. Michael I. Jordan & Jason D. Lee & Yun Yang, 2019. "Communication-Efficient Distributed Statistical Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 668-681, April.
    9. Liugen Xue & Lixing Zhu, 2007. "Empirical Likelihood Semiparametric Regression Analysis for Longitudinal Data," Biometrika, Biometrika Trust, vol. 94(4), pages 921-937.
    10. Rui Duan & Yang Ning & Yong Chen, 2022. "Heterogeneity-aware and communication-efficient distributed statistical inference [Privacy, confidentiality, and electronic medical records]," Biometrika, Biometrika Trust, vol. 109(1), pages 67-83.
    11. Ethan X. Fang & Yang Ning & Han Liu, 2017. "Testing and confidence intervals for high dimensional proportional hazards models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1415-1437, November.
    12. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhaohan Hou & Wei Ma & Lei Wang, 2023. "Sparse and debiased lasso estimation and inference for high-dimensional composite quantile regression with distributed data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1230-1250, December.
    2. Xingcai Zhou & Dehan Kong & Matthew Stephen Pietrosanu & Linglong Kong & Rohana J. Karunamuni, 2024. "Empirical likelihood M‐estimation for the varying‐coefficient model with functional response," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 51(3), pages 1357-1387, September.
    3. Qian, Lianfen & Wang, Suojin, 2017. "Subject-wise empirical likelihood inference in partial linear models for longitudinal data," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 77-87.
    4. Wang, Kangning & Li, Shaomin & Sun, Xiaofei & Lin, Lu, 2019. "Modal regression statistical inference for longitudinal data semivarying coefficient models: Generalized estimating equations, empirical likelihood and variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 257-276.
    5. T. Tony Cai & Zijian Guo & Yin Xia, 2023. "Statistical inference and large-scale multiple testing for high-dimensional regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1135-1171, December.
    6. Luo, Jiyu & Sun, Qiang & Zhou, Wen-Xin, 2022. "Distributed adaptive Huber regression," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    7. T. Tony Cai & Zijian Guo & Yin Xia, 2023. "Rejoinder on: statistical inference and large-scale multiple testing for high-dimensional regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1187-1194, December.
    8. Yang, Yiping & Li, Gaorong & Peng, Heng, 2014. "Empirical likelihood of varying coefficient errors-in-variables models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 127(C), pages 1-18.
    9. Yaohong Yang & Lei Wang, 2023. "Communication-efficient sparse composite quantile regression for distributed data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 86(3), pages 261-283, April.
    10. Lei Wang & Wei Ma, 2021. "Improved empirical likelihood inference and variable selection for generalized linear models with longitudinal nonignorable dropouts," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(3), pages 623-647, June.
    11. Haixiang Zhang & Jian Huang & Liuquan Sun, 2022. "Projection‐based and cross‐validated estimation in high‐dimensional Cox model," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(1), pages 353-372, March.
    12. Lu Xia & Bin Nan & Yi Li, 2023. "Debiased lasso for generalized linear models with a diverging number of covariates," Biometrics, The International Biometric Society, vol. 79(1), pages 344-357, March.
    13. Peng, Yanjin & Wang, Lei, 2025. "Heterogeneity-aware transfer learning for high-dimensional linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 206(C).
    14. Xiaobo Wang & Jiayu Huang & Guosheng Yin & Jian Huang & Yuanshan Wu, 2023. "Double bias correction for high-dimensional sparse additive hazards regression with covariate measurement errors," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(1), pages 115-141, January.
    15. Li, Daoji & Pan, Jianxin, 2013. "Empirical likelihood for generalized linear models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 114(C), pages 63-73.
    16. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    17. Chenchuan (Mark) Li & Ulrich K. Müller, 2021. "Linear regression with many controls of limited explanatory power," Quantitative Economics, Econometric Society, vol. 12(2), pages 405-442, May.
    18. Xiang, Pengcheng & Zhou, Ling & Tang, Lu, 2024. "Transfer learning via random forests: A one-shot federated approach," Computational Statistics & Data Analysis, Elsevier, vol. 197(C).
    19. Victor Chernozhukov & Whitney K. Newey & Victor Quintas-Martinez & Vasilis Syrgkanis, 2021. "Automatic Debiased Machine Learning via Riesz Regression," Papers 2104.14737, arXiv.org, revised Mar 2024.
    20. Guo, Xu & Li, Runze & Liu, Jingyuan & Zeng, Mudong, 2023. "Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic," Journal of Econometrics, Elsevier, vol. 235(1), pages 166-179.
