Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i7p951-d1362481.html

Imputation-Based Variable Selection Method for Block-Wise Missing Data When Integrating Multiple Longitudinal Studies

Author

Listed:
  • Zhongzhe Ouyang

    (Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA)

  • Lu Wang

    (Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA)

  • Alzheimer’s Disease Neuroimaging Initiative

    (Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
    Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database ( adni.loni.usc.edu ). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf (accessed on 29 February 2024).)

Abstract

When integrating data from multiple sources, a common challenge is block-wise missingness. Most existing methods address this issue only in cross-sectional studies. In this paper, we propose a method for variable selection when combining datasets from multiple sources in longitudinal studies. To account for block-wise missingness in covariates, we impute the missing values multiple times based on combinations of samples from different missing patterns and predictors from different data sources. We then use these imputed data to construct estimating equations and aggregate the information across subjects and sources with the generalized method of moments. We employ the smoothly clipped absolute deviation (SCAD) penalty for variable selection and use the extended Bayesian information criterion (EBIC) for tuning parameter selection. We establish the asymptotic properties of the proposed estimator and demonstrate the superior performance of the proposed method through numerical experiments. Furthermore, we apply the proposed method to the Alzheimer’s Disease Neuroimaging Initiative study to identify sensitive early-stage biomarkers of Alzheimer’s Disease, which is crucial for early disease detection and personalized treatment.
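
For readers unfamiliar with the two selection tools named in the abstract, the following is a minimal sketch of the smoothly clipped absolute deviation (SCAD) penalty in its standard form (Fan and Li, 2001) and of a generic extended Bayesian information criterion. It is an illustration under stated assumptions, not the authors' implementation: the function names `scad_penalty` and `ebic`, the default `a = 3.7`, the Gaussian working likelihood based on a residual sum of squares, and the `gamma` weight are all assumptions made here, and the paper's own criterion is built on its generalized method of moments objective rather than on this likelihood.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient (illustrative).

    lam is the tuning parameter and a > 2 controls where the penalty
    flattens; a = 3.7 is the conventional default, assumed here.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients are not penalized further.
    return lam * lam * (a + 1) / 2


def ebic(rss, n, p, df, gamma=0.5):
    """Generic EBIC under a Gaussian working model (an assumption).

    rss: residual sum of squares, n: sample size, p: number of
    candidate predictors, df: number of selected predictors.
    gamma = 0 reduces to the ordinary BIC.
    """
    model_space_term = math.log(math.comb(p, df))
    return n * math.log(rss / n) + df * math.log(n) + 2 * gamma * model_space_term
```

In a typical workflow one would fit the penalized model over a grid of `lam` values and keep the fit with the smallest criterion value; the extra `gamma` term penalizes model size more heavily as the number of candidate predictors grows.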

Suggested Citation

  • Zhongzhe Ouyang & Lu Wang & Alzheimer’s Disease Neuroimaging Initiative, 2024. "Imputation-Based Variable Selection Method for Block-Wise Missing Data When Integrating Multiple Longitudinal Studies," Mathematics, MDPI, vol. 12(7), pages 1-14, March.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:7:p:951-:d:1362481

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/7/951/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/7/951/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dasom Lee & Shu Yang & Lin Dong & Xiaofei Wang & Donglin Zeng & Jianwen Cai, 2023. "Improving trial generalizability using observational studies," Biometrics, The International Biometric Society, vol. 79(2), pages 1213-1225, June.
    2. Chang, Jinyuan & Chen, Song Xi & Chen, Xiaohong, 2015. "High dimensional generalized empirical likelihood for moment restrictions with dependent data," Journal of Econometrics, Elsevier, vol. 185(1), pages 283-304.
    3. Xiuli Du & Xiaohu Jiang & Jinguan Lin, 2023. "Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 975-1001, September.
    4. Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP57/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Joseph G. Ibrahim & Hongtu Zhu & Ramon I. Garcia & Ruixin Guo, 2011. "Fixed and Random Effects Selection in Mixed Effects Models," Biometrics, The International Biometric Society, vol. 67(2), pages 495-503, June.
    6. Yunxiao Chen & Xiaoou Li & Jingchen Liu & Zhiliang Ying, 2017. "Regularized Latent Class Analysis with Application in Cognitive Diagnosis," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 660-692, September.
    7. Zhang, Ting & Wang, Lei, 2020. "Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    8. Xiaoran Liang & Eleanor Sanderson & Frank Windmeijer, 2022. "Selecting Valid Instrumental Variables in Linear Models with Multiple Exposure Variables: Adaptive Lasso and the Median-of-Medians Estimator," Papers 2208.05278, arXiv.org.
    9. Yongjin Li & Qingzhao Zhang & Qihua Wang, 2017. "Penalized estimation equation for an extended single-index model," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(1), pages 169-187, February.
    10. Zhang, Tonglin, 2024. "Variables selection using L0 penalty," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    11. Zhangong Zhou & Rong Jiang & Weimin Qian, 2013. "LAD variable selection for linear models with randomly censored data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 76(2), pages 287-300, February.
    12. Brittany Green & Heng Lian & Yan Yu & Tianhai Zu, 2021. "Ultra high‐dimensional semiparametric longitudinal data analysis," Biometrics, The International Biometric Society, vol. 77(3), pages 903-913, September.
    13. Tamar Sofer & Elizabeth D. Schifano & David C. Christiani & Xihong Lin, 2017. "Weighted pseudolikelihood for SNP set analysis with multiple secondary outcomes in case‐control genetic association studies," Biometrics, The International Biometric Society, vol. 73(4), pages 1210-1220, December.
    14. Xiaochao Xia & Binyan Jiang & Jialiang Li & Wenyang Zhang, 2016. "Low-dimensional confounder adjustment and high-dimensional penalized estimation for survival analysis," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 22(4), pages 547-569, October.
    15. Xingwei Tong & Xin He & Liuquan Sun & Jianguo Sun, 2009. "Variable Selection for Panel Count Data via Non‐Concave Penalized Estimating Function," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 36(4), pages 620-635, December.
    16. Fan, Jianqing & Liao, Yuan, 2012. "Endogeneity in ultrahigh dimension," MPRA Paper 38698, University Library of Munich, Germany.
    17. Jin, Fei & Lee, Lung-fei, 2018. "Irregular N2SLS and LASSO estimation of the matrix exponential spatial specification model," Journal of Econometrics, Elsevier, vol. 206(2), pages 336-358.
    18. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    19. Fei Wang & Lu Wang & Peter X.‐K. Song, 2016. "Fused lasso with the adaptation of parameter ordering in combining multiple studies with repeated measurements," Biometrics, The International Biometric Society, vol. 72(4), pages 1184-1193, December.
    20. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
