Printed from https://ideas.repec.org/a/bla/biomet/v79y2023i4p2947-2960.html

An efficient data integration scheme for synthesizing information from multiple secondary datasets for the parameter inference of the main analysis

Authors
  • Chixiang Chen
  • Ming Wang
  • Shuo Chen

Abstract

Many observational studies and clinical trials collect various secondary outcomes that may be highly correlated with the primary endpoint. These secondary outcomes are often analyzed in secondary analyses separately from the main data analysis. However, these secondary outcomes can be used to improve the estimation precision in the main analysis. We propose a method called multiple information borrowing (MinBo) that borrows information from secondary data (containing secondary outcomes and covariates) to improve the efficiency of the main analysis. The proposed method is robust against model misspecification of the secondary data. Both theoretical and case studies demonstrate that MinBo outperforms existing methods in terms of efficiency gain. We apply MinBo to data from the Atherosclerosis Risk in Communities study to assess risk factors for hypertension.

Suggested Citation

  • Chixiang Chen & Ming Wang & Shuo Chen, 2023. "An efficient data integration scheme for synthesizing information from multiple secondary datasets for the parameter inference of the main analysis," Biometrics, The International Biometric Society, vol. 79(4), pages 2947-2960, December.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:4:p:2947-2960
    DOI: 10.1111/biom.13858

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13858
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruoyu Wang & Qihua Wang & Wang Miao, 2023. "A robust fusion-extraction procedure with summary statistics in the presence of biased sources," Biometrika, Biometrika Trust, vol. 110(4), pages 1023-1040.
    2. Tian Gu & Jeremy Michael George Taylor & Bhramar Mukherjee, 2023. "A synthetic data integration framework to leverage external summary‐level information from heterogeneous populations," Biometrics, The International Biometric Society, vol. 79(4), pages 3831-3845, December.
    3. Han Zhang & Lu Deng & William Wheeler & Jing Qin & Kai Yu, 2022. "Integrative analysis of multiple case‐control studies," Biometrics, The International Biometric Society, vol. 78(3), pages 1080-1091, September.
    4. Ying Sheng & Yifei Sun & Chiung‐Yu Huang & Mi‐Ok Kim, 2022. "Synthesizing external aggregated information in the presence of population heterogeneity: A penalized empirical likelihood approach," Biometrics, The International Biometric Society, vol. 78(2), pages 679-690, June.
    5. Cao, Yongxiu & Yu, Jichang, 2023. "Adjusting for unmeasured confounding in survival causal effect using validation data," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    6. Jie He & Hui Li & Shumei Zhang & Xiaogang Duan, 2019. "Additive hazards model with auxiliary subgroup survival information," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(1), pages 128-149, January.
    7. Ziqi Chen & Jing Ning & Yu Shen & Jing Qin, 2021. "Combining primary cohort data with external aggregate information without assuming comparability," Biometrics, The International Biometric Society, vol. 77(3), pages 1024-1036, September.
    8. Prosenjit Kundu & Nilanjan Chatterjee, 2023. "Logistic regression analysis of two‐phase studies using generalized method of moments," Biometrics, The International Biometric Society, vol. 79(1), pages 241-252, March.
    9. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    10. Fei Gao & K. C. G. Chan, 2023. "Noniterative adjustment to regression estimators with population‐based auxiliary information for semiparametric models," Biometrics, The International Biometric Society, vol. 79(1), pages 140-150, March.
    11. Li, Wei & Luo, Shanshan & Xu, Wangli, 2024. "Calibrated regression estimation using empirical likelihood under data fusion," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    12. Wei, Kecheng & Qin, Guoyou & Zhang, Jiajia & Sui, Xuemei, 2022. "Doubly robust estimation in causal inference with missing outcomes: With an application to the Aerobics Center Longitudinal Study," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    13. Debashis Ghosh & Michael S. Sabel, 2022. "A Weighted Sample Framework to Incorporate External Calculators for Risk Modeling," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 14(3), pages 363-379, December.
    14. Xiong, Wei & Wang, Dehui & Deng, Dianliang & Wang, Xinyang & Zhang, Wanying, 2022. "Penalized multiply robust estimation in high-order autoregressive processes with missing explanatory variables," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    15. Peisong Han & Linglong Kong & Jiwei Zhao & Xingcai Zhou, 2019. "A general framework for quantile estimation with incomplete data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 305-333, April.
    16. Gustavo Amorim & Ran Tao & Sarah Lotspeich & Pamela A. Shaw & Thomas Lumley & Bryan E. Shepherd, 2021. "Two‐phase sampling designs for data validation in settings with covariate measurement error and continuous outcome," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(4), pages 1368-1389, October.
    17. Brick J. Michael, 2013. "Unit Nonresponse and Weighting Adjustments: A Critical Review," Journal of Official Statistics, Sciendo, vol. 29(3), pages 329-353, June.
    18. Mengke Li & Yan Fan & Yang Liu & Yukun Liu, 2021. "Diagnostic test meta-analysis by empirical likelihood under a Copas-like selection model," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 84(6), pages 927-947, August.
    19. Wei, Yuting & Wang, Qihua, 2021. "Cross-validation-based model averaging in linear models with response missing at random," Statistics & Probability Letters, Elsevier, vol. 171(C).
    20. Hamori, Shigeyuki & Motegi, Kaiji & Zhang, Zheng, 2019. "Calibration estimation of semiparametric copula models with data missing at random," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 85-109.
