
A data driven approach to handling missing data in the UK Millennium Cohort Study

Authors

Listed:
  • Narayanan, Martina Kaja
  • Villadsen, Aase
  • Katsoulis, Michail
  • Dodgeon, Brian
  • Ploubidis, George
  • Fitzsimons, Emla
  • Silverwood, Richard J.

Abstract

Missing data arising from sweep non-response is a major challenge in longitudinal cohort studies, threatening statistical power and the validity of inferences. In the UK Millennium Cohort Study (MCS), non-response has increased substantially from sweep 1 (9 months old) to sweep 7 (17 years old), underscoring the need for robust strategies to handle non-response. We applied a systematic, data-driven approach to identify predictors of non-response at each sweep of the MCS, drawing on all available survey data at the time of analysis. The strongest and most consistent predictor of non-response was prior sweep non-response. Additional robust predictors included lower parental occupational social class, parental non-participation in the most recent general election, a parent not being in paid work, older cohort member age, and lower cognitive test scores. We then evaluated whether incorporating the identified predictors of non-response as auxiliary variables in multiple imputation (MI) or as covariates in inverse probability weighting (IPW) improved sample representativeness. Validation analyses, using both external benchmarks (2021 Census) and internal comparisons to known early-life distributions, showed that MI and IPW models including the identified predictors substantially reduced or eliminated bias in key variables such as housing tenure and parental social class. Our findings demonstrate that the use of systematically identified auxiliary variables can improve the validity of inferences drawn from the MCS. The resulting predictor set offers a practical resource for applied researchers using MCS data and provides a replicable framework for addressing sweep non-response in other longitudinal studies.
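
To illustrate the IPW idea described in the abstract, the following is a minimal sketch on invented toy data, not the authors' models or MCS data. It assumes a single hypothetical predictor of non-response (`social_class`) and a hypothetical early-life variable (`tenure`) that is known for every cohort member. For simplicity it estimates response probabilities as observed response rates within strata of the predictor, rather than from a fitted regression model. Weighting responders by the inverse of their estimated response probability then restores the known early-life distribution, which mirrors the internal validation comparison mentioned above.

```python
from collections import defaultdict


def response_probabilities(records, predictor):
    """Estimate P(respond | predictor) as the response rate in each stratum."""
    totals = defaultdict(int)
    responders = defaultdict(int)
    for r in records:
        totals[r[predictor]] += 1
        if r["responded"]:
            responders[r[predictor]] += 1
    return {level: responders[level] / totals[level] for level in totals}


def proportion(records, variable, value, weights=None):
    """(Weighted) proportion of responders for whom `variable` equals `value`."""
    num = den = 0.0
    for i, r in enumerate(records):
        if not r["responded"]:
            continue
        w = 1.0 if weights is None else weights[i]
        den += w
        if r[variable] == value:
            num += w
    return num / den


def make(social_class, tenure, responded, n):
    """Build n identical toy records (all field names are hypothetical)."""
    return [
        {"social_class": social_class, "tenure": tenure, "responded": responded}
        for _ in range(n)
    ]


# Toy cohort of 20: half of the full sample rents at baseline.
# Lower social class: 8 rent, 2 own, and only half respond at the later sweep.
# Higher social class: 2 rent, 8 own, and everyone responds.
records = (
    make("lower", "rent", True, 4) + make("lower", "rent", False, 4)
    + make("lower", "own", True, 1) + make("lower", "own", False, 1)
    + make("higher", "rent", True, 2) + make("higher", "own", True, 8)
)

probs = response_probabilities(records, "social_class")

# Inverse probability weights; non-responders get weight 0 and are skipped anyway.
weights = [
    1.0 / probs[r["social_class"]] if r["responded"] else 0.0 for r in records
]

full_sample = sum(r["tenure"] == "rent" for r in records) / len(records)
unweighted = proportion(records, "tenure", "rent")
weighted = proportion(records, "tenure", "rent", weights)

print(f"full sample renting:        {full_sample:.2f}")
print(f"responders, unweighted:     {unweighted:.2f}")
print(f"responders, IPW-weighted:   {weighted:.2f}")
```

In this toy setting the unweighted responders under-represent renters because lower social class predicts both renting and non-response; the weighted estimate matches the full-sample value only because non-response here depends on nothing beyond the included predictor, which is the assumption that a well-chosen predictor set is meant to make more plausible.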

Suggested Citation

  • Narayanan, Martina Kaja & Villadsen, Aase & Katsoulis, Michail & Dodgeon, Brian & Ploubidis, George & Fitzsimons, Emla & Silverwood, Richard J., 2026. "A data driven approach to handling missing data in the UK Millennium Cohort Study," SocArXiv agqmu_v1, Center for Open Science.
  • Handle: RePEc:osf:socarx:agqmu_v1
    DOI: 10.31219/osf.io/agqmu_v1

    Download full text from publisher

    File URL: https://osf.io/download/698f542f3fab057afbdfc4f1/
    Download Restriction: no

