Printed from https://ideas.repec.org/a/nat/natcom/v6y2015i1d10.1038_ncomms9111.html

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Authors
  • Jie Huang

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Bryan Howie

    (Adaptive Biotechnologies Corporation)

  • Shane McCarthy

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Yasin Memari

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Klaudia Walter

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Josine L. Min

    (MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove)

  • Petr Danecek

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Giovanni Malerba

    (Biology and Genetics, University of Verona)

  • Elisabetta Trabetti

    (Biology and Genetics, University of Verona)

  • Hou-Feng Zheng

    (Lady Davis Institute, Jewish General Hospital
    McGill University)

  • Giovanni Gambaro

    (Institute of Internal Medicine, Renal Program, Columbus-Gemelli University Hospital, Catholic University)

  • J. Brent Richards

    (Lady Davis Institute, Jewish General Hospital
    McGill University
    King’s College London, St Thomas’ Campus)

  • Richard Durbin

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus)

  • Nicholas J. Timpson

    (MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove)

  • Jonathan Marchini

    (University of Oxford
    Wellcome Trust Centre for Human Genetics, Roosevelt Drive)

  • Nicole Soranzo

    (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus
    University of Cambridge)

Abstract

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.

Suggested Citation

  • Jie Huang & Bryan Howie & Shane McCarthy & Yasin Memari & Klaudia Walter & Josine L. Min & Petr Danecek & Giovanni Malerba & Elisabetta Trabetti & Hou-Feng Zheng & Giovanni Gambaro & J. Brent Richards, 2015. "Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel," Nature Communications, Nature, vol. 6(1), pages 1-9, November.
  • Handle: RePEc:nat:natcom:v:6:y:2015:i:1:d:10.1038_ncomms9111
    DOI: 10.1038/ncomms9111

    Download full text from publisher

    File URL: https://www.nature.com/articles/ncomms9111
    File Function: Abstract
    Download Restriction: no


    Citations

    Cited by:

    1. Gerard Van Den Berg & Stephanie von Hinke & R. Adele H. Wang, 2022. "Prenatal sugar consumption and late-life human capital and health: analyses based on postwar rationing and polygenic scores," IFS Working Papers W22/38, Institute for Fiscal Studies.
    2. van den Berg, Gerard J. & von Hinke, Stephanie & Wang, R. Adele H., 2023. "Prenatal sugar consumption and late-life human capital and health: analyses based on postwar rationing and polygenic indices," Working Paper Series 2023:5, IFAU - Institute for Evaluation of Labour Market and Education Policy.
    3. Shuyan Mei & Ali Karimnezhad & Marie Forest & David R Bickel & Celia M T Greenwood, 2017. "The performance of a new local false discovery rate method on tests of association between coronary artery disease (CAD) and genome-wide genetic variants," PLOS ONE, Public Library of Science, vol. 12(9), pages 1-14, September.
    4. van den Berg, G.J. & von Hinke, S. & Wang, R.A.H., 2023. "Prenatal Sugar Consumption and Late-Life Human Capital and Health: Analyses Based on Postwar Rationing and Polygenic Indices," Health, Econometrics and Data Group (HEDG) Working Papers 23/11, HEDG, c/o Department of Economics, University of York.
    5. Gemma L. Clayton & Maria Carolina Borges & Deborah A. Lawlor, 2024. "The impact of reproductive factors on the metabolic profile of females from menarche to menopause," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    6. Xinkai Tong & Dong Chen & Jianchao Hu & Shiyao Lin & Ziqi Ling & Huashui Ai & Zhiyan Zhang & Lusheng Huang, 2023. "Accurate haplotype construction and detection of selection signatures enabled by high quality pig genome sequences," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    7. Wenhan Chen & Yang Wu & Zhili Zheng & Ting Qi & Peter M. Visscher & Zhihong Zhu & Jian Yang, 2021. "Improved analyses of GWAS summary statistics by reducing data heterogeneity and errors," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    8. Lina Cai & Tomas Gonzales & Eleanor Wheeler & Nicola D. Kerrison & Felix R. Day & Claudia Langenberg & John R. B. Perry & Soren Brage & Nicholas J. Wareham, 2023. "Causal associations between cardiorespiratory fitness and type 2 diabetes," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    9. Maik Pietzner & Eleanor Wheeler & Julia Carrasco-Zanini & Nicola D. Kerrison & Erin Oerton & Mine Koprulu & Jian’an Luan & Aroon D. Hingorani & Steve A. Williams & Nicholas J. Wareham & Claudia Langen, 2021. "Synergistic insights into human health from aptamer- and antibody-based proteomic profiling," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    10. Pei-Kuan Cong & Wei-Yang Bai & Jin-Chen Li & Meng-Yuan Yang & Saber Khederzadeh & Si-Rui Gai & Nan Li & Yu-Heng Liu & Shi-Hui Yu & Wei-Wei Zhao & Jun-Quan Liu & Yi Sun & Xiao-Wei Zhu & Pian-Pian Zhao, 2022. "Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    11. Rozaimi Mohamad Razali & Juan Rodriguez-Flores & Mohammadmersad Ghorbani & Haroon Naeem & Waleed Aamer & Elbay Aliyev & Ali Jubran & Andrew G. Clark & Khalid A. Fakhro & Younes Mokrab, 2021. "Thousands of Qatari genomes inform human migration history and improve imputation of Arab haplotypes," Nature Communications, Nature, vol. 12(1), pages 1-16, December.
