Printed from https://ideas.repec.org/a/plo/pcbi00/1004139.html

PRIMAL: Fast and Accurate Pedigree-based Imputation from Sequence Data in a Founder Population

Author

Listed:
  • Oren E Livne
  • Lide Han
  • Gorka Alkorta-Aranburu
  • William Wentworth-Sheilds
  • Mark Abney
  • Carole Ober
  • Dan L Nicolae

Abstract

Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.

Author Summary: The recent availability of whole genome and whole exome sequencing allows genetic studies of human diseases and traits at an unprecedented resolution, although their cost limits the size of the studied sample. To overcome this limitation and design cost-efficient studies, we developed a two-step method: sequencing of relatively few members of a well-characterized founder population followed by pedigree-based whole genome imputation of many other individuals with genome-wide genotype data. We show that by sequencing only 98 Hutterites, we can impute 7 million variants in an additional 1,317 Hutterites with >99% accuracy and an average call rate of 87%. Furthermore, parental origin was assigned to 83% of the alleles. Such studies in the Hutterites and other founder populations should yield new insights into the genetic architecture of common diseases, gene expression traits, and clinically relevant biomarkers of disease, and ultimately provide outstanding opportunities for personalized medicine in these well-characterized populations.
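The abstract's idea of imputing through sets of haplotypes that are identical by descent (IBD) can be illustrated with a toy sketch. This is not PRIMAL's implementation: the function names, the haplotype labels, and the input format are hypothetical, and it handles a single locus only. It also swaps in connected components (via union-find) for the clique graphs the abstract mentions, on the assumption that error-free IBD at one locus is transitive, so every pairwise-IBD set would form a clique. Alleles observed on sequenced haplotypes are copied to the unsequenced haplotypes in the same set, and sets with conflicting observations are left unassigned.

```python
from collections import defaultdict


def ibd_groups(haplotypes, ibd_pairs):
    """Group haplotypes into IBD sets at one locus using union-find.

    Assumption: IBD is treated as transitive at a single locus, so connected
    components stand in for the cliques described in the abstract.
    """
    parent = {h: h for h in haplotypes}

    def find(h):
        # Follow parent links to the root, shortening the path as we go.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    for a, b in ibd_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for h in haplotypes:
        groups[find(h)].add(h)
    return list(groups.values())


def impute_locus(haplotypes, ibd_pairs, known_alleles):
    """Copy an observed allele to every haplotype in the same IBD set.

    Sets with no observation, or with conflicting observations, stay
    unassigned, which is one reason a call rate can fall below 100%.
    """
    imputed = dict(known_alleles)
    for group in ibd_groups(haplotypes, ibd_pairs):
        observed = {known_alleles[h] for h in group if h in known_alleles}
        if len(observed) == 1:
            allele = next(iter(observed))
            for h in group:
                imputed.setdefault(h, allele)
    return imputed


# Hypothetical data: S1 is sequenced; G1 and G2 have only IBD information.
haplotypes = ["S1.pat", "S1.mat", "G1.pat", "G1.mat", "G2.pat", "G2.mat"]
ibd_pairs = [("S1.pat", "G1.mat"), ("G1.mat", "G2.pat"), ("S1.mat", "G2.mat")]
known_alleles = {"S1.pat": "A", "S1.mat": "G"}

result = impute_locus(haplotypes, ibd_pairs, known_alleles)
for hap in haplotypes:
    print(hap, result.get(hap, "uncalled"))
```

In this example `G1.pat` shares no IBD segment with a sequenced haplotype, so it remains uncalled. A fuller treatment would work on segments rather than single loci and would combine this step with LD-based imputation, as the abstract describes.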

Suggested Citation

  • Oren E Livne & Lide Han & Gorka Alkorta-Aranburu & William Wentworth-Sheilds & Mark Abney & Carole Ober & Dan L Nicolae, 2015. "PRIMAL: Fast and Accurate Pedigree-based Imputation from Sequence Data in a Founder Population," PLOS Computational Biology, Public Library of Science, vol. 11(3), pages 1-14, March.
  • Handle: RePEc:plo:pcbi00:1004139
    DOI: 10.1371/journal.pcbi.1004139

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004139
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004139&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Bryan N Howie & Peter Donnelly & Jonathan Marchini, 2009. "A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies," PLOS Genetics, Public Library of Science, vol. 5(6), pages 1-15, June.

    Citations


    Cited by:

    1. Amy Ko & Rasmus Nielsen, 2017. "Composite likelihood method for inferring local pedigrees," PLOS Genetics, Public Library of Science, vol. 13(8), pages 1-21, August.
    2. Esther Ulitzsch & Qiwei He & Vincent Ulitzsch & Hendrik Molter & André Nichterlein & Rolf Niedermeier & Steffi Pohl, 2021. "Combining Clickstream Analyses and Graph-Modeled Data Clustering for Identifying Common Response Processes," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 190-214, March.
    3. Mark Reppell & John Novembre, 2018. "Using pseudoalignment and base quality to accurately quantify microbial community composition," PLOS Computational Biology, Public Library of Science, vol. 14(4), pages 1-23, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Svensson & Matilda Rentoft & Anna M Dahlin & Emma Lundholm & Pall I Olason & Andreas Sjödin & Carin Nylander & Beatrice S Melin & Johan Trygg & Erik Johansson, 2020. "A whole-genome sequenced control population in northern Sweden reveals subregional genetic differences," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-18, September.
    2. Chuan Gao & Nan Wang & Xiuqing Guo & Julie T Ziegler & Kent D Taylor & Anny H Xiang & Yang Hai & Steven J Kridel & Jerry L Nadler & Fouad Kandeel & Leslie J Raffel & Yii-Der I Chen & Jill M Norris & J, 2015. "A Comprehensive Analysis of Common and Rare Variants to Identify Adiposity Loci in Hispanic Americans: The IRAS Family Study (IRASFS)," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-17, November.
    3. Poojitha Balakrishnan & Miranda R. Jones & Dhananjay Vaidya & Maria Tellez-Plaza & Wendy S. Post & Joel D. Kaufman & Suzette J. Bielinski & Kent Taylor & Kevin Francesconi & Walter Goessler & Ana Nava, 2018. "Ethnic, Geographic, and Genetic Differences in Arsenic Metabolism at Low Arsenic Exposure: A Preliminary Analysis in the Multi-Ethnic Study of Atherosclerosis (MESA)," IJERPH, MDPI, vol. 15(6), pages 1-11, June.
    4. Wei-Yu Lin & Ian W Brock & Dan Connley & Helen Cramp & Rachel Tucker & Jon Slate & Malcolm W R Reed & Sabapathy P Balasubramanian & Lisa A Cannon-Albright & Nicola J Camp & Angela Cox, 2013. "Associations of ATR and CHEK1 Single Nucleotide Polymorphisms with Breast Cancer," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-1, July.
    5. Craig, Sarah J.C. & Kenney, Ana M. & Lin, Junli & Paul, Ian M. & Birch, Leann L. & Savage, Jennifer S. & Marini, Michele E. & Chiaromonte, Francesca & Reimherr, Matthew L. & Makova, Kateryna D., 2023. "Constructing a polygenic risk score for childhood obesity using functional data analysis," Econometrics and Statistics, Elsevier, vol. 25(C), pages 66-86.
    6. Anshuman Sewda & A J Agopian & Elizabeth Goldmuntz & Hakon Hakonarson & Bernice E Morrow & Deanne Taylor & Laura E Mitchell & on behalf of the Pediatric Cardiac Genomics Consortium, 2019. "Gene-based genome-wide association studies and meta-analyses of conotruncal heart defects," PLOS ONE, Public Library of Science, vol. 14(7), pages 1-19, July.
    7. Harriëtte Riese & Loretto M Muñoz & Catharina A Hartman & Xiuhua Ding & Shaoyong Su & Albertine J Oldehinkel & Arie M van Roon & Peter J van der Most & Joop Lefrandt & Ron T Gansevoort & Pim van der H, 2014. "Identifying Genetic Variants for Heart Rate Variability in the Acetylcholine Pathway," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-9, November.
    8. Franz Förster & David Emmert & Katrin Horn & Janne Pott & Johannes Frasnelli & Mohammed Aslam Imtiaz & Konstantinos Melas & Valentina Talevi & Honglei Chen & Christoph Engel & Michele Filosi & Myriam , 2025. "Genome-wide association meta-analysis of human olfactory identification discovers sex-specific and sex-differential genetic variants," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    9. Indra Adrianto & Chee Paul Lin & Jessica J Hale & Albert M Levin & Indrani Datta & Ryan Parker & Adam Adler & Jennifer A Kelly & Kenneth M Kaufman & Christopher J Lessard & Kathy L Moser & Robert P Ki, 2012. "Genome-Wide Association Study of African and European Americans Implicates Multiple Shared and Ethnic Specific Loci in Sarcoidosis Susceptibility," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-10, August.
    10. Bárbara Sousa da Mota & Simone Rubinacci & Diana Ivette Cruz Dávalos & Carlos Eduardo G. Amorim & Martin Sikora & Niels N. Johannsen & Marzena H. Szmyt & Piotr Włodarczak & Anita Szczepanek & Marcin M, 2023. "Imputation of ancient human genomes," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    11. Paul S de Vries & Maria Sabater-Lleal & Daniel I Chasman & Stella Trompet & Tarunveer S Ahluwalia & Alexander Teumer & Marcus E Kleber & Ming-Huei Chen & Jie Jin Wang & John R Attia & Riccardo E Mario, 2017. "Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study," PLOS ONE, Public Library of Science, vol. 12(1), pages 1-22, January.
    12. Akinori Miyashita & Asako Koike & Gyungah Jun & Li-San Wang & Satoshi Takahashi & Etsuro Matsubara & Takeshi Kawarabayashi & Mikio Shoji & Naoki Tomita & Hiroyuki Arai & Takashi Asada & Yasuo Harigaya, 2013. "SORL1 Is Genetically Associated with Late-Onset Alzheimer’s Disease in Japanese, Koreans and Caucasians," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-11, April.
    13. Bo Jiang & Jun S. Liu, 2015. "Bayesian Partition Models for Identifying Expression Quantitative Trait Loci," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1350-1361, December.
    14. Reuben, Aaron & Arseneault, Louise & Belsky, Daniel W. & Caspi, Avshalom & Fisher, Helen L. & Houts, Renate M. & Moffitt, Terrie E. & Odgers, Candice, 2019. "Residential neighborhood greenery and children's cognitive development," Social Science & Medicine, Elsevier, vol. 230(C), pages 271-279.
    15. Rakesh Chettier & Lesa Nelson & James W Ogilvie & Hans M Albertsen & Kenneth Ward, 2015. "Haplotypes at LBX1 Have Distinct Inheritance Patterns with Opposite Effects in Adolescent Idiopathic Scoliosis," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-11, February.
    16. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    17. Emily Mathieu, 2016. "AGGrEGATOr: A Gene-based GEne-Gene interActTiOn test for case-control association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(2), pages 151-171, April.
    18. Gavin Band & Quang Si Le & Luke Jostins & Matti Pirinen & Katja Kivinen & Muminatou Jallow & Fatoumatta Sisay-Joof & Kalifa Bojang & Margaret Pinder & Giorgio Sirugo & David J Conway & Vysaul Nyirongo, 2013. "Imputation-Based Meta-Analysis of Severe Malaria in Three African Populations," PLOS Genetics, Public Library of Science, vol. 9(5), pages 1-13, May.
    19. Qiliang Ding & Matthew M. Edwards & Ning Wang & Xiang Zhu & Alexa N. Bracci & Michelle L. Hulke & Ya Hu & Yao Tong & Joyce Hsiao & Christine J. Charvet & Sulagna Ghosh & Robert E. Handsaker & Kevin Eg, 2021. "The genetic architecture of DNA replication timing in human pluripotent stem cells," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
    20. Steinrücken, Matthias & Paul, Joshua S. & Song, Yun S., 2013. "A sequentially Markov conditional sampling distribution for structured populations with migration and recombination," Theoretical Population Biology, Elsevier, vol. 87(C), pages 51-61.
