DECOVID: A UK Two-Center Harmonized Database of Acute Care Electronic Health Records for COVID-19 Research

Authors
  • DECOVID Consortium
  • Louis J. M. Aslett

    (Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK)

  • Andreea Avramescu

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Nicholas Bakewell

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Isabel Birds

    (School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK
    LeedsOmics, University of Leeds, Leeds LS2 9JT, UK)

  • Louise Bowler

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Michael P. J. Camilleri

    (School of Informatics, University of Edinburgh, Edinburgh EH4 2XU, UK
    School of Science and Engineering, University of Dundee, Dundee DD1 4HN, UK)

  • Sheng-Chia Chung

    (Institute of Cardiovascular Science, University College London, London NW1 2DA, UK)

  • David A. Clifton

    (Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK)

  • Samuel N. Cohen

    (Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK)

  • Nathan Constantine-Cooke

    (Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK)

  • Eric G. Daub

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Shaun Davidson

    (Institute of Biomedical Engineering, University of Oxford, Oxford OX3 7DQ, UK)

  • Spiros Denaxas

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Karla Diaz-Ordaz

    (Department of Statistical Science, University College London, London WC1E 6BT, UK)

  • Richard Feltbower

    (Child Health Outcomes Research at Leeds (CHORAL), School of Medicine, University of Leeds, Leeds LS2 9LU, UK
    Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK)

  • Suzy Gallier

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK
    Department of Inflammation and Ageing, School of Inflammation, Infection and Immunity, College of Medicine and Health, University of Birmingham, Birmingham B15 2WB, UK)

  • Stephen Gardiner

    (Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7LF, UK)

  • Francesca Gasperoni

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Robert J. B. Goudie

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Rebecca E. Green

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Marlous Hall

    (Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
    Leeds Institute of Cardiovascular and Metabolic Medicine, University of Leeds, Leeds LS2 9JT, UK)

  • Chris Holmes

    (Department of Statistics, University of Oxford, Oxford OX1 3LB, UK)

  • John R. Hurst

    (UCL Respiratory, University College London, London WC1E 6JF, UK)

  • Mark M. Iles

    (Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
    NIHR Leeds Biomedical Research Centre, Leeds Teaching Hospitals NHS Trust, Leeds LS7 4SA, UK)

  • Joao Jorge

    (Institute of Biomedical Engineering, University of Oxford, Oxford OX3 7DQ, UK
    NIHR Biomedical Research Centre, Oxford OX3 9DU, UK)

  • Emma Karoune

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Ruth Keogh

    (Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK)

  • Ruairidh King

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Ruth King

    (School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, Edinburgh EH9 3FD, UK)

  • Paul D. W. Kirk

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK
    Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge CB2 0AW, UK)

  • Roman Klapaukh

    (Research Software Development Group, University College London, London WC1E 6BT, UK
    University College London Hospital, London NW1 2BU, UK)

  • Samaneh Kouchaki

    (Institute of Biomedical Engineering, University of Oxford, Oxford OX3 7DQ, UK
    School of Computer Science and Electronic Engineering, University of Surrey, Guildford GU2 7XH, UK)

  • Alvina G. Lai

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Nathan Lea

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Clemence Leyrat

    (Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK)

  • Kezhi Li

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Watjana Lilaonitkul

    (Global Business School for Health, University College London, London E20 2AE, UK)

  • Huiqi Y. Lu

    (Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK)

  • Terry Lyons

    (Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK)

  • Ann Marie Mallon

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Andrew Manderson

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Nicolò Margaritella

    (School of Mathematics and Statistics, University of St Andrews, St Andrews KY16 9SS, UK)

  • Joshua Matteson

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Sam Morley

    (Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK)

  • Hannah Nicholls

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Martin O’Reilly

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Christina Pagel

    (Clinical Operational Research Unit, University College London, London WC1H 0BT, UK)

  • Edward Palmer

    (University College London Hospital, London NW1 2BU, UK
    Bloomsbury Institute of Intensive Care Medicine, University College London, London WC1E 6BT, UK
    Whittington Hospital, London N19 5NF, UK)

  • Jack Roberts

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Timothy J. Roberts

    (Institute of Health Informatics, University College London, London NW1 2DA, UK
    University College London Hospital, London NW1 2BU, UK)

  • David S. Robertson

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • James Robinson

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Patrick Rockenschaub

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Roy Ruddle

    (Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
    School of Computer Science, University of Leeds, Leeds LS2 9JT, UK)

  • Elizabeth Sapey

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK
    Department of Inflammation and Ageing, School of Inflammation, Infection and Immunity, College of Medicine and Health, University of Birmingham, Birmingham B15 2WB, UK)

  • Luis Santos

    (The Alan Turing Institute, London NW1 2DB, UK
    Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Center), Harwell OX11 0RD, UK)

  • Andrew A. S. Soltan

    (Department of Oncology, University of Oxford, Oxford OX3 7LE, UK
    Oxford University Hospitals NHS Foundation Trust, Oxford OX3 9DU, UK)

  • Fang Gao Smith

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK
    Department of Inflammation and Ageing, School of Inflammation, Infection and Immunity, College of Medicine and Health, University of Birmingham, Birmingham B15 2WB, UK)

  • Colin Starr

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Oliver Strickson

    (The Alan Turing Institute, London NW1 2DB, UK)

  • Li Su

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Mia S. Tackney

    (MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK)

  • Johan H. Thygesen

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Ana Torralbo

    (Institute of Health Informatics, University College London, London NW1 2DA, UK)

  • Alice Turner

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK
    Department of Inflammation and Ageing, School of Inflammation, Infection and Immunity, College of Medicine and Health, University of Birmingham, Birmingham B15 2WB, UK)

  • Catalina A. Vallejos

    (Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK)

  • Chenyang Wang

    (Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK)

  • Kirstie Whitaker

    (The Alan Turing Institute, London NW1 2DB, UK
    Berkeley Institute for Data Science, University of California at Berkeley, Berkeley, CA 94720-1234, USA)

  • Tony Whitehouse

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK
    Department of Inflammation and Ageing, School of Inflammation, Infection and Immunity, College of Medicine and Health, University of Birmingham, Birmingham B15 2WB, UK)

  • David R. Westhead

    (School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK)

  • Wai Keong Wong

    (Cambridge University Hospitals, Cambridge CB2 0QQ, UK)

  • Yue Wu

    (Department of Mathematics and Statistics, University of Strathclyde, Glasgow G1 1XH, UK)

  • Lingyi Yang

    (Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK)

  • Xiaoxu Zou

    (University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2GW, UK)

Abstract

The DECOVID database contains harmonized pseudonymized electronic health record (EHR) data on all adult (≥18 years old) patients presenting to two large, digitally mature centers in the United Kingdom between 1 January 2020 and 28 February 2021, with follow-up until at least 28 March 2021. The database was originally developed to support the COVID-19 response but is now available via the PIONEER data hub for researchers to explore a wide range of research questions, including exploratory analyses, risk factor assessment, prediction modeling, and comparative effectiveness studies. Raw data were extracted from local EHRs and transformed into a standardized form (Observational Health Data Sciences and Informatics-Common Data Model version 5.3.1). The database includes 165,420 patients across 256,804 hospital presentations. For these patients, highly granular data are available, including patient demographics, longitudinal vital signs, physiology, treatments, laboratory findings, clinical diagnoses, and outcomes. There are 10,030 patients with COVID-19, of whom 1472 died in hospital.
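The counts in the abstract imply a few crude summary ratios. The following is a minimal sketch in Python that derives them from the quoted figures only. The constant and function names are illustrative assumptions, not part of the DECOVID database or its tooling, and the results are unadjusted proportions rather than estimates reported by the authors.

```python
# Counts quoted in the abstract; names are hypothetical and for illustration only.
PATIENTS = 165_420                # adult patients in the database
PRESENTATIONS = 256_804           # hospital presentations
COVID_PATIENTS = 10_030           # patients with COVID-19
COVID_IN_HOSPITAL_DEATHS = 1_472  # COVID-19 patients who died in hospital


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting a non-positive denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Mean number of hospital presentations per patient.
presentations_per_patient = ratio(PRESENTATIONS, PATIENTS)

# Share of all patients in the database who had COVID-19.
covid_share = ratio(COVID_PATIENTS, PATIENTS)

# Crude (unadjusted) in-hospital mortality among COVID-19 patients.
covid_in_hospital_mortality = ratio(COVID_IN_HOSPITAL_DEATHS, COVID_PATIENTS)

if __name__ == "__main__":
    print(f"Presentations per patient: {presentations_per_patient:.2f}")
    print(f"COVID-19 share of patients: {covid_share:.1%}")
    print(f"Crude in-hospital mortality (COVID-19): {covid_in_hospital_mortality:.1%}")
```

These ratios only restate the abstract's totals. They ignore case mix, censoring, and differences between the two centers, so any formal analysis would need the underlying records.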

Suggested Citation

  • DECOVID Consortium & Louis J. M. Aslett & Andreea Avramescu & Nicholas Bakewell & Isabel Birds & Louise Bowler & Michael P. J. Camilleri & Sheng-Chia Chung & David A. Clifton & Samuel N. Cohen & Natha, 2025. "DECOVID: A UK Two-Center Harmonized Database of Acute Care Electronic Health Records for COVID-19 Research," Data, MDPI, vol. 10(12), pages 1-27, November.
  • Handle: RePEc:gam:jdataj:v:10:y:2025:i:12:p:195-:d:1801747

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/10/12/195/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/10/12/195/
    Download Restriction: no
