
Integrative analysis of multiple case‐control studies

Authors

  • Han Zhang
  • Lu Deng
  • William Wheeler
  • Jing Qin
  • Kai Yu

Abstract

Sharing detailed individual‐level data among studies is often difficult because of informatics and privacy constraints. Pooling aggregated summary‐level data, such as those required for standard meta‐analyses, is comparatively easy. Focusing on data generated from case‐control studies, we present a flexible inference procedure that integrates individual‐level data collected from an “internal” study with summary data borrowed from “external” studies. The procedure is built on a retrospective empirical likelihood framework to account for the sampling bias in case‐control studies. It can incorporate summary statistics extracted from various working models adopted by multiple independent or overlapping external studies, and it allows the external studies to be conducted in a population different from the internal study population. We show, both theoretically and numerically, that the procedure is more efficient than several competing alternatives.

Suggested Citation

  • Han Zhang & Lu Deng & William Wheeler & Jing Qin & Kai Yu, 2022. "Integrative analysis of multiple case‐control studies," Biometrics, The International Biometric Society, vol. 78(3), pages 1080-1091, September.
  • Handle: RePEc:bla:biomet:v:78:y:2022:i:3:p:1080-1091
    DOI: 10.1111/biom.13461

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13461
    Download Restriction: no


    References listed on IDEAS

    1. Guido W. Imbens & Tony Lancaster, 1994. "Combining Micro and Macro Data in Microeconometric Models," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 61(4), pages 655-680.
    2. Nilanjan Chatterjee & Yi-Hau Chen & Paige Maas & Raymond J. Carroll, 2016. "Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-Level Information From External Big Data Sources," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 107-117, March.
    3. Alvaro N. Barbeira & Scott P. Dickinson & Rodrigo Bonazzola & Jiamao Zheng & Heather E. Wheeler & Jason M. Torres & Eric S. Torstenson & Kaanan P. Shah & Tzintzuni Garcia & Todd L. Edwards & Eli A. St, 2018. "Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics," Nature Communications, Nature, vol. 9(1), pages 1-20, December.
    4. Sanjay Chaudhuri & Mark S. Handcock & Michael S. Rendall, 2008. "Generalized linear models incorporating population level information: an empirical‐likelihood‐based approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 311-328, April.
    5. Jing Qin & Han Zhang & Pengfei Li & Demetrius Albanes & Kai Yu, 2015. "Using covariate-specific disease prevalence information to increase the power of case-control studies," Biometrika, Biometrika Trust, vol. 102(1), pages 169-180.
    6. Han Zhang & Lu Deng & Mark Schiffman & Jing Qin & Kai Yu, 2020. "Generalized integration model for improved statistical inference by leveraging external summary data," Biometrika, Biometrika Trust, vol. 107(3), pages 689-703.
    7. White, Halbert, 1982. "Maximum Likelihood Estimation of Misspecified Models," Econometrica, Econometric Society, vol. 50(1), pages 1-25, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fei Gao & K. C. G. Chan, 2023. "Noniterative adjustment to regression estimators with population‐based auxiliary information for semiparametric models," Biometrics, The International Biometric Society, vol. 79(1), pages 140-150, March.
    2. Ruoyu Wang & Qihua Wang & Wang Miao, 2023. "A robust fusion-extraction procedure with summary statistics in the presence of biased sources," Biometrika, Biometrika Trust, vol. 110(4), pages 1023-1040.
    3. Ziqi Chen & Jing Ning & Yu Shen & Jing Qin, 2021. "Combining primary cohort data with external aggregate information without assuming comparability," Biometrics, The International Biometric Society, vol. 77(3), pages 1024-1036, September.
    4. Yu‐Jen Cheng & Yen‐Chun Liu & Chang‐Yu Tsai & Chiung‐Yu Huang, 2023. "Semiparametric estimation of the transformation model by leveraging external aggregate data in the presence of population heterogeneity," Biometrics, The International Biometric Society, vol. 79(3), pages 1996-2009, September.
    5. Igari, Ryosuke & Hoshino, Takahiro, 2018. "A Bayesian data combination approach for repeated durations under unobserved missing indicators: Application to interpurchase-timing in marketing," Computational Statistics & Data Analysis, Elsevier, vol. 126(C), pages 150-166.
    6. Cao, Yongxiu & Yu, Jichang, 2023. "Adjusting for unmeasured confounding in survival causal effect using validation data," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    7. Jie He & Hui Li & Shumei Zhang & Xiaogang Duan, 2019. "Additive hazards model with auxiliary subgroup survival information," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(1), pages 128-149, January.
    8. Jason Allen & Robert Clark & Jean-François Houde, 2019. "Search Frictions and Market Power in Negotiated-Price Markets," Journal of Political Economy, University of Chicago Press, vol. 127(4), pages 1550-1598.
    9. Lee, Seojeong, 2014. "Asymptotic refinements of a misspecification-robust bootstrap for generalized method of moments estimators," Journal of Econometrics, Elsevier, vol. 178(P3), pages 398-413.
    10. Chixiang Chen & Ming Wang & Shuo Chen, 2023. "An efficient data integration scheme for synthesizing information from multiple secondary datasets for the parameter inference of the main analysis," Biometrics, The International Biometric Society, vol. 79(4), pages 2947-2960, December.
    11. Takahiro Hoshino & Ryosuke Igari, 2017. "Quasi-Bayesian Inference for Latent Variable Models with External Information: Application to generalized linear mixed models for biased data," Keio-IES Discussion Paper Series 2017-014, Institute for Economics Studies, Keio University.
    12. Liu, Tianqing & Yuan, Xiaohui, 2012. "Combining quasi and empirical likelihoods in generalized linear models with missing responses," Journal of Multivariate Analysis, Elsevier, vol. 111(C), pages 39-58.
    13. Tian Gu & Jeremy Michael George Taylor & Bhramar Mukherjee, 2023. "A synthetic data integration framework to leverage external summary‐level information from heterogeneous populations," Biometrics, The International Biometric Society, vol. 79(4), pages 3831-3845, December.
    14. Guevara, C. Angelo & Ben-Akiva, Moshe E., 2013. "Sampling of alternatives in Multivariate Extreme Value (MEV) models," Transportation Research Part B: Methodological, Elsevier, vol. 48(C), pages 31-52.
    15. Ying Sheng & Yifei Sun & Chiung‐Yu Huang & Mi‐Ok Kim, 2022. "Synthesizing external aggregated information in the presence of population heterogeneity: A penalized empirical likelihood approach," Biometrics, The International Biometric Society, vol. 78(2), pages 679-690, June.
    16. van Dijk, Bram & Paap, Richard, 2008. "Explaining individual response using aggregated data," Journal of Econometrics, Elsevier, vol. 146(1), pages 1-9, September.
    17. Yuan, Xiaohui & Liu, Tianqing & Lin, Nan & Zhang, Baoxue, 2010. "Combining conditional and unconditional moment restrictions with missing responses," Journal of Multivariate Analysis, Elsevier, vol. 101(10), pages 2420-2433, November.
    18. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian Instrumental Variables Estimation for Nonignorable Missing Instruments," Discussion Paper Series DP2020-06, Research Institute for Economics & Business Administration, Kobe University.
    19. Prosenjit Kundu & Nilanjan Chatterjee, 2023. "Logistic regression analysis of two‐phase studies using generalized method of moments," Biometrics, The International Biometric Society, vol. 79(1), pages 241-252, March.
    20. Ryosuke Igari & Takahiro Hoshino, 2018. "A Bayesian Gamma Frailty Model Using the Sum of Independent Random Variables: Application of the Estimation of an Interpurchase Timing Model," Keio-IES Discussion Paper Series 2018-021, Institute for Economics Studies, Keio University.

