
Participation bias in the UK Biobank distorts genetic associations and downstream analyses

Author

Listed:
  • Tabea Schoeler

    (University of Lausanne
    University College London)

  • Doug Speed

    (Aarhus University)

  • Eleonora Porcu

    (Lausanne University Hospital and University of Lausanne)

  • Nicola Pirastu

    (Human Technopole)

  • Jean-Baptiste Pingault

    (University College London
    King’s College London)

  • Zoltán Kutalik

    (University of Lausanne
    Swiss Institute of Bioinformatics
    University Center for Primary Care and Public Health)

Abstract

While volunteer-based studies such as the UK Biobank have become the cornerstone of genetic epidemiology, the participating individuals are rarely representative of their target population. To evaluate the impact of selective participation, here we derived UK Biobank participation probabilities on the basis of 14 variables harmonized across the UK Biobank and a representative sample. We then conducted weighted genome-wide association analyses on 19 traits. Comparing the output from weighted genome-wide association analyses (n_effective = 94,643 to 102,215) with that from standard genome-wide association analyses (n = 263,464 to 283,749), we found that increasing representativeness led to changes in SNP effect sizes and identified novel SNP associations for 12 traits. While heritability estimates were less impacted by weighting (maximum change in h², 5%), we found substantial discrepancies for genetic correlations (maximum change in r_g, 0.31) and Mendelian randomization estimates (maximum change in β_STD, 0.15) for socio-behavioural traits. We urge the field to increase representativeness in biobank samples, especially when studying genetic correlates of behaviour, lifestyles and social outcomes.
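
The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes plain inverse probability weights (1 / participation probability), the Kish approximation for effective sample size, and a single-SNP weighted least-squares slope. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np


def inverse_probability_weights(p_participate):
    """Weights proportional to 1 / P(participation), rescaled to mean 1.

    Assumption: simple inverse probability weighting; the paper may
    construct its weights differently.
    """
    p = np.asarray(p_participate, dtype=float)
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("probabilities must lie in (0, 1]")
    w = 1.0 / p
    return w / w.mean()


def kish_effective_n(weights):
    """Kish approximation: (sum of w)^2 / (sum of w^2).

    Assumption: used here only to show why unequal weights shrink the
    effective sample size below the nominal n.
    """
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)


def weighted_slope(genotype, phenotype, weights):
    """Weighted least-squares slope of phenotype on genotype (with intercept)."""
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    w = np.asarray(weights, dtype=float)
    g_mean = np.average(g, weights=w)
    y_mean = np.average(y, weights=w)
    cov = np.sum(w * (g - g_mean) * (y - y_mean))
    var = np.sum(w * (g - g_mean) ** 2)
    return cov / var
```

Under these assumptions, under-represented individuals (low participation probability) receive larger weights, and the more the weights vary, the further the effective sample size falls below the nominal one, which is consistent in direction with the gap between n_effective and n reported in the abstract.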

Suggested Citation

  • Tabea Schoeler & Doug Speed & Eleonora Porcu & Nicola Pirastu & Jean-Baptiste Pingault & Zoltán Kutalik, 2023. "Participation bias in the UK Biobank distorts genetic associations and downstream analyses," Nature Human Behaviour, Nature, vol. 7(7), pages 1216-1227, July.
  • Handle: RePEc:nat:nathum:v:7:y:2023:i:7:d:10.1038_s41562-023-01579-9
    DOI: 10.1038/s41562-023-01579-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41562-023-01579-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1038/s41562-023-01579-9?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations


    Cited by:

    1. Max Korbmacher & Dennis Meer & Dani Beck & Ann-Marie G. de Lange & Eli Eikefjord & Arvid Lundervold & Ole A. Andreassen & Lars T. Westlye & Ivan I. Maximov, 2024. "Brain asymmetries from mid- to late life and hemispheric brain age," Nature Communications, Nature, vol. 15(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gianmarco Mignogna & Caitlin E. Carey & Robbee Wedow & Nikolas Baya & Mattia Cordioli & Nicola Pirastu & Rino Bellocco & Kathryn Fiuza Malerbi & Michel G. Nivard & Benjamin M. Neale & Raymond K. Walte, 2023. "Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci," Nature Human Behaviour, Nature, vol. 7(8), pages 1371-1387, August.
    2. Clara Albiñana & Zhihong Zhu & Andrew J. Schork & Andrés Ingason & Hugues Aschard & Isabell Brikell & Cynthia M. Bulik & Liselotte V. Petersen & Esben Agerbo & Jakob Grove & Merete Nordentoft & David , 2023. "Multi-PGS enhances polygenic prediction by combining 937 polygenic scores," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    3. Hans Kippersluis & Pietro Biroli & Rita Dias Pereira & Titus J. Galama & Stephanie Hinke & S. Fleur W. Meddens & Dilnoza Muslimova & Eric A. W. Slob & Ronald Vlaming & Cornelius A. Rietveld, 2023. "Overcoming attenuation bias in regressions using polygenic indices," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    4. Jennifer Sjaarda & Zoltán Kutalik, 2023. "Partner choice, confounding and trait convergence all contribute to phenotypic partner similarity," Nature Human Behaviour, Nature, vol. 7(5), pages 776-789, May.
    5. Pietro Demela & Nicola Pirastu & Blagoje Soskic, 2023. "Cross-disorder genetic analysis of immune diseases reveals distinct gene associations that converge on common pathways," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    6. Wang, Chao & Zhan, Jinyan & Wang, Huihui & Yang, Zheng & Chu, Xi & Liu, Wei & Teng, Yanmin & Liu, Huizi & Wang, Yifan, 2022. "Multi-group analysis on the mechanism of residents' low-carbon behaviors in Beijing, China," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
    7. Michael G. Levin & Noah L. Tsao & Pankhuri Singhal & Chang Liu & Ha My T. Vy & Ishan Paranjpe & Joshua D. Backman & Tiffany R. Bellomo & William P. Bone & Kiran J. Biddinger & Qin Hui & Ozan Dikilitas, 2022. "Genome-wide association and multi-trait analyses characterize the common genetic architecture of heart failure," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    8. González-Díaz, Julio & Gossner, Olivier & Rogers, Brian W., 2012. "Performing best when it matters most: Evidence from professional tennis," Journal of Economic Behavior & Organization, Elsevier, vol. 84(3), pages 767-781.
    9. Ma, Liye & Sun, Baohong, 2020. "Machine learning and AI in marketing – Connecting computing power to human insights," International Journal of Research in Marketing, Elsevier, vol. 37(3), pages 481-504.
    10. Procopio, Francesca & Zhou, Quan & Wang, Ziye & Gidziela, Agnieska & Rimfeld, Kaili & Malanchini, Margherita & Plomin, Robert, 2022. "The genetics of specific cognitive abilities," Intelligence, Elsevier, vol. 95(C).
    11. Andrew D. Grotzinger & Javier de la Fuente & Gail Davies & Michel G. Nivard & Elliot M. Tucker-Drob, 2022. "Transcriptome-wide and stratified genomic structural equation modeling identify neurobiological pathways shared across diverse cognitive traits," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    12. Ylenio Longo & Alexander Gunz & Guy Curtis & Tom Farsides, 2016. "Measuring Need Satisfaction and Frustration in Educational and Work Contexts: The Need Satisfaction and Frustration Scale (NSFS)," Journal of Happiness Studies, Springer, vol. 17(1), pages 295-317, February.
    13. Ng-Knight, Terry & Schoon, Ingrid, 2017. "Can locus of control compensate for socioeconomic adversity in the transition from school to work?," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 46(10), pages 2114-2128.
    14. Baumdicker, F. & Hölker, U., 2020. "Method comparison with repeated measurements — Passing–Bablok regression for grouped data with errors in both variables," Statistics & Probability Letters, Elsevier, vol. 164(C).
    15. Max Lam & Chia-Yen Chen & W. David Hill & Charley Xia & Ruoyu Tian & Daniel F. Levey & Joel Gelernter & Murray B. Stein & Alexander S. Hatoum & Hailiang Huang & Anil K. Malhotra & Heiko Runz & Tian Ge, 2022. "Collective genomic segments with differential pleiotropic patterns between cognitive dimensions and psychopathology," Nature Communications, Nature, vol. 13(1), pages 1-22, December.
    16. Dino Collalti & Eric Strobl, 2022. "Economic damages due to extreme precipitation during tropical storms: evidence from Jamaica," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 110(3), pages 2059-2086, February.
    17. Bauchmüller, Robert, 2012. "Gains from child-centred Early Childhood Education: Evidence from a Dutch pilot programme," MERIT Working Papers 2012-016, United Nations University - Maastricht Economic and Social Research Institute on Innovation and Technology (MERIT).
    18. S. C. Noah Uhrig & Nicole Watson, 2020. "The Impact of Measurement Error on Wage Decompositions: Evidence From the British Household Panel Survey and the Household, Income and Labour Dynamics in Australia Survey," Sociological Methods & Research, , vol. 49(1), pages 43-78, February.
    19. Xiaocao Tian & Huaidong Du & Liming Li & Derrick Bennett & Ruqin Gao & Shanpeng Li & Shaojie Wang & Yu Guo & Zheng Bian & Ling Yang & Yiping Chen & Junshi Chen & Yan Gao & Min Weng & Zengchang Pang & , 2017. "Fruit consumption and physical activity in relation to all-cause and cardiovascular mortality among 70,000 Chinese adults with pre-existing vascular disease," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-16, April.
    20. Tuomo Hartonen & Bradley Jermy & Hanna Sõnajalg & Pekka Vartiainen & Kristi Krebs & Andrius Vabalas & Tuija Leino & Hanna Nohynek & Jonas Sivelä & Reedik Mägi & Mark Daly & Hanna M. Ollila & Lili Mila, 2023. "Nationwide health, socio-economic and genetic predictors of COVID-19 vaccination status in Finland," Nature Human Behaviour, Nature, vol. 7(7), pages 1069-1083, July.
