Printed from https://ideas.repec.org/a/spr/compst/v40y2025i2d10.1007_s00180-024-01521-1.html

Variable selection and structure identification for additive models with longitudinal data

Author

Listed:
  • Ting Wang

    (Xi’an Jiaotong University)

  • Liya Fu

    (Xi’an Jiaotong University)

  • Yanan Song

    (Xi’an Jiaotong University)

Abstract

This paper proposes a polynomial structure identification (PSI) method for variable selection and model structure identification in additive models with longitudinal data. First, the backfitting algorithm is combined with zero-order local polynomial smoothing to select the important variables in the additive model; the importance of each variable is determined through the inverse of the bandwidth parameter in the nonparametric partial kernel function. Second, the backfitting algorithm is combined with Q-order local polynomial smoothing to identify the specific structure of each selected predictor. To incorporate the correlations within longitudinal data, a two-stage method is proposed for estimating the regression parameters of the identified important variables: (i) parameter estimators of the important variables are first obtained under an independence working model assumption; (ii) generalized estimating equations with a working correlation matrix based on B-splines are then constructed to obtain the final estimators of the parameters, which improves the efficiency of parameter estimation. Finally, simulation studies are carried out to evaluate the performance of the proposed method, and two real-world examples are presented for illustration.
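
The first step described in the abstract can be illustrated with a minimal sketch of generic backfitting using a zero-order local polynomial (Nadaraya-Watson) smoother. This is not the authors' PSI implementation: the function names `nw_smooth` and `backfit`, the Gaussian kernel, the fixed bandwidths, and the simulated independent data are all assumptions made for illustration, and the sketch ignores within-subject correlation. It only shows why the inverse bandwidth can act as an importance measure: when a component's bandwidth is very large (inverse bandwidth near zero), its fitted function collapses to a constant and the variable effectively drops out of the model.

```python
import numpy as np

def nw_smooth(x, r, h):
    """Nadaraya-Watson (zero-order local polynomial) fit of r on x.

    Uses a Gaussian kernel with bandwidth h (an illustrative choice).
    """
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d**2)              # kernel weights, shape (n, n)
    return (w @ r) / w.sum(axis=1)       # locally weighted averages

def backfit(X, y, h, n_iter=20):
    """Generic backfitting for y = alpha + sum_j f_j(x_j) + error."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))                 # fitted component values
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every component except the j-th.
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            fj = nw_smooth(X[:, j], partial, h[j])
            f[:, j] = fj - fj.mean()     # center for identifiability
    return alpha, f

# Simulated data (hypothetical): the third covariate is irrelevant.
rng = np.random.default_rng(0)
n = 300
X = rng.uniform(-1, 1, size=(n, 3))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)

# A huge bandwidth (inverse bandwidth near zero) for the third covariate
# flattens its fitted component to essentially zero.
h = np.array([0.15, 0.15, 1e3])
alpha, f = backfit(X, y, h)
print("max |f_3|:", np.abs(f[:, 2]).max())
```

In this sketch the bandwidths are fixed by hand; in a selection procedure of the kind the abstract describes they would be chosen from the data, which is not reproduced here.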

Suggested Citation

  • Ting Wang & Liya Fu & Yanan Song, 2025. "Variable selection and structure identification for additive models with longitudinal data," Computational Statistics, Springer, vol. 40(2), pages 951-975, February.
  • Handle: RePEc:spr:compst:v:40:y:2025:i:2:d:10.1007_s00180-024-01521-1
    DOI: 10.1007/s00180-024-01521-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-024-01521-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li, Gaorong & Lian, Heng & Feng, Sanying & Zhu, Lixing, 2013. "Automatic variable selection for longitudinal generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 174-186.
    2. Geronimi, J. & Saporta, G., 2017. "Variable selection for multiply-imputed data with penalized generalized estimating equations," Computational Statistics & Data Analysis, Elsevier, vol. 110(C), pages 103-114.
    3. Fan, Yali & Qin, Guoyou & Zhu, Zhongyi, 2012. "Variable selection in robust regression models for longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 156-167.
    4. Lan Wang & Jianhui Zhou & Annie Qu, 2012. "Penalized Generalized Estimating Equations for High-Dimensional Longitudinal Data Analysis," Biometrics, The International Biometric Society, vol. 68(2), pages 353-360, June.
    5. Yang, Jing & Yang, Hu, 2016. "A robust penalized estimation for identification in semiparametric additive models," Statistics & Probability Letters, Elsevier, vol. 110(C), pages 268-277.
    6. Blommaert, A. & Hens, N. & Beutels, Ph., 2014. "Data mining for longitudinal data under multicollinearity and time dependence using penalized generalized estimating equations," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 667-680.
    7. Lan Wang & Annie Qu, 2009. "Consistent model selection and data‐driven smooth tests for longitudinal data in the estimating equations approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(1), pages 177-190, January.
    8. Jakub Stoklosa & Heloise Gibb & David I. Warton, 2014. "Fast forward selection for generalized estimating equations with a large number of predictor variables," Biometrics, The International Biometric Society, vol. 70(1), pages 110-120, March.
    9. Chung-Wei Shen & Yi-Hau Chen, 2012. "Model Selection for Generalized Estimating Equations Accommodating Dropout Missingness," Biometrics, The International Biometric Society, vol. 68(4), pages 1046-1054, December.
    10. Shinpei Imori, 2015. "Model Selection Criterion Based on the Multivariate Quasi-Likelihood for Generalized Estimating Equations," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(4), pages 1214-1224, December.
    11. Chung‐Wei Shen & Yi‐Hau Chen, 2018. "Model selection for semiparametric marginal mean regression accounting for within‐cluster subsampling variability and informative cluster size," Biometrics, The International Biometric Society, vol. 74(3), pages 934-943, September.
    12. Hu Yang & Chaohui Guo & Jing Lv, 2016. "Variable selection for generalized varying coefficient models with longitudinal data," Statistical Papers, Springer, vol. 57(1), pages 115-132, March.
    13. Facheng Li & Huilan Liu, 2025. "Kernel density regression in the additive model: a B-spline approach," Statistical Papers, Springer, vol. 66(1), pages 1-23, January.
    14. Wei Pan, 2001. "Model Selection in Estimating Equations," Biometrics, The International Biometric Society, vol. 57(2), pages 529-534, June.
    15. Hang Yu & Yuanjia Wang & Donglin Zeng, 2023. "A general framework of nonparametric feature selection in high‐dimensional data," Biometrics, The International Biometric Society, vol. 79(2), pages 951-963, June.
    16. Vens, Maren & Ziegler, Andreas, 2012. "Generalized estimating equations and regression diagnostics for longitudinal controlled clinical trials: A case study," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1232-1242.
    17. Gregory Vaughan & Robert Aseltine & Kun Chen & Jun Yan, 2017. "Stagewise generalized estimating equations with grouped variables," Biometrics, The International Biometric Society, vol. 73(4), pages 1332-1342, December.
    18. Michael S. Rendall & Bonnie Ghosh-Dastidar & Margaret M. Weden & Zafar Nazarov, 2011. "Multiple Imputation for Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys," Working Papers WR-887-1, RAND Corporation.
    19. Katrina N. Burns & Kan Sun & Julius N. Fobil & Richard L. Neitzel, 2016. "Heart Rate, Stress, and Occupational Noise Exposure among Electronic Waste Recycling Workers," IJERPH, MDPI, vol. 13(1), pages 1-16, January.
    20. Song Guo & Feng Ling & Juan Hou & Jinna Wang & Guiming Fu & Zhenyu Gong, 2014. "Mosquito Surveillance Revealed Lagged Effects of Mosquito Abundance on Mosquito-Borne Disease Transmission: A Retrospective Study in Zhejiang, China," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-8, November.
