
A spatially aware likelihood test to detect sweeps from haplotype distributions

Author

Listed:
  • Michael DeGiorgio
  • Zachary A Szpiech

Abstract

The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.

Author summary: Identifying regions of the genome that contain adaptive variation is of fundamental interest in evolutionary biology, providing insight into an organism’s history and biology. When positive selection is recent or ongoing, we expect to find genomic patterns such as high frequency haplotypes and low genetic diversity in the vicinity of the adaptive locus. Here we develop a statistic to identify these regions based on distortions of the haplotype frequency spectrum from a background distribution. We evaluate the performance of this statistic under numerous realistic settings of interest to empiricists and demonstrate its superior performance relative to other haplotype-based selection statistics. We also apply this statistic to real population-genetic data. As a positive control, we explore two well-studied loci, LCT and MHC, in a European and an African human population that show strong evidence for selection. We also apply this statistic to the genomes of an urban brown rat population, where we uncover evidence for adaptation in olfactory perception genes. We release user-friendly software implementing this statistic.
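
The idea of scoring a genomic window by how far its haplotype frequency spectrum departs from a genome-wide background can be illustrated with a small sketch. This is not the authors' composite likelihood model, which also fits the number of sweeping haplotypes and the sweep width. It is a simplified stand-in that compares each window's truncated spectrum with the average spectrum using a multinomial log-likelihood ratio. The function names, the truncation level K, and the toy haplotype strings are all assumptions made for illustration.

```python
import math
from collections import Counter


def truncated_hfs(haplotypes, k):
    """Frequencies of the k most common haplotypes in a window.

    Sorted in descending order, padded with zeros, and normalized to sum to 1.
    """
    counts = sorted(Counter(haplotypes).values(), reverse=True)[:k]
    counts += [0] * (k - len(counts))
    total = sum(counts)
    return [c / total for c in counts]


def mean_spectrum(spectra):
    """Average spectrum across windows, used here as the 'neutral' background."""
    k = len(spectra[0])
    return [sum(s[i] for s in spectra) / len(spectra) for i in range(k)]


def distortion_score(spectrum, background, n):
    """Multinomial log-likelihood ratio of the window spectrum vs. the background.

    Equals 2 * n * KL(spectrum || background); n is used only as a weight
    (here, the number of sampled haplotypes). Zero-frequency classes add 0.
    """
    return 2.0 * n * sum(
        p * math.log(p / q) for p, q in zip(spectrum, background) if p > 0
    )


# Toy data (hypothetical): three windows of six sampled haplotypes each.
windows = [
    ["AAT", "ACT", "GCT", "GAT", "ACG", "AAT"],  # diverse
    ["AAT", "AAT", "AAT", "AAT", "AAT", "ACT"],  # one dominant haplotype
    ["ACT", "GCT", "AAT", "GAT", "ACT", "GCG"],  # diverse
]
K = 3  # truncation level, chosen arbitrarily for this sketch

spectra = [truncated_hfs(w, K) for w in windows]
background = mean_spectrum(spectra)
scores = [
    distortion_score(s, background, len(w)) for s, w in zip(spectra, windows)
]

for i, score in enumerate(scores):
    print(f"window {i}: score = {score:.3f}")
```

In this toy input the window dominated by a single haplotype receives the largest score, which matches the intuition in the summary that high-frequency haplotypes mark candidate sweep regions. A real analysis would need phased data, a principled background estimate, and the spatial model described in the paper.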

Suggested Citation

  • Michael DeGiorgio & Zachary A Szpiech, 2022. "A spatially aware likelihood test to detect sweeps from haplotype distributions," PLOS Genetics, Public Library of Science, vol. 18(4), pages 1-37, April.
  • Handle: RePEc:plo:pgen00:1010134
    DOI: 10.1371/journal.pgen.1010134

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010134
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1010134&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Vasili Pankratov & Milyausha Yunusbaeva & Sergei Ryakhovsky & Maksym Zarodniuk & Bayazit Yunusbayev, 2022. "Prioritizing autoimmunity risk variants for functional analyses by fine-mapping mutations under natural selection," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    2. Yichen Zheng & Thomas Wiehe, 2019. "Adaptation in structured populations and fuzzy boundaries between hard and soft sweeps," PLOS Computational Biology, Public Library of Science, vol. 15(11), pages 1-32, November.
    3. Andrea Fulgione & Célia Neto & Ahmed F. Elfarargi & Emmanuel Tergemina & Shifa Ansari & Mehmet Göktay & Herculano Dinis & Nina Döring & Pádraic J. Flood & Sofia Rodriguez-Pacheco & Nora Walden & Marcu, 2022. "Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    4. Kiran Krishnamachari & Dylan Lu & Alexander Swift-Scott & Anuar Yeraliyev & Kayla Lee & Weitai Huang & Sim Ngak Leng & Anders Jacobsen Skanderup, 2022. "Accurate somatic variant detection using weakly supervised deep learning," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    5. Sergio F. Nigenda-Morales & Meixi Lin & Paulina G. Nuñez-Valencia & Christopher C. Kyriazis & Annabel C. Beichman & Jacqueline A. Robinson & Aaron P. Ragsdale & Jorge Urbán R. & Frederick I. Archer & , 2023. "The genomic footprint of whaling and isolation in fin whale populations," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    6. Ralph, Peter L., 2019. "An empirical approach to demographic inference with genomic data," Theoretical Population Biology, Elsevier, vol. 127(C), pages 91-101.
    7. Zihao Wang & Wenxi Wang & Xiaoming Xie & Yongfa Wang & Zhengzhao Yang & Huiru Peng & Mingming Xin & Yingyin Yao & Zhaorong Hu & Jie Liu & Zhenqi Su & Chaojie Xie & Baoyun Li & Zhongfu Ni & Qixin Sun &, 2022. "Dispersed emergence and protracted domestication of polyploid wheat uncovered by mosaic ancestral haploblock inference," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    8. Chen, Hua & Hey, Jody & Slatkin, Montgomery, 2015. "A hidden Markov model for investigating recent positive selection through haplotype structure," Theoretical Population Biology, Elsevier, vol. 99(C), pages 18-30.
    9. Mohammad Hossein Olyaee & Alireza Khanteymoori & Khosrow Khalifeh, 2020. "A chaotic viewpoint-based approach to solve haplotype assembly using hypergraph model," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-19, October.
    10. Ali Mahmoudi & Jere Koskela & Jerome Kelleher & Yao-ban Chan & David Balding, 2022. "Bayesian inference of ancestral recombination graphs," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-15, March.
    11. Kerdoncuff, Elise & Lambert, Amaury & Achaz, Guillaume, 2020. "Testing for population decline using maximal linkage disequilibrium blocks," Theoretical Population Biology, Elsevier, vol. 134(C), pages 171-181.
    12. Benger, Etam & Sella, Guy, 2013. "Modeling the effect of changing selective pressures on polymorphism and divergence," Theoretical Population Biology, Elsevier, vol. 85(C), pages 73-85.
    13. Lauren A. Choate & Gilad Barshad & Pierce W. McMahon & Iskander Said & Edward J. Rice & Paul R. Munn & James J. Lewis & Charles G. Danko, 2021. "Multiple stages of evolutionary change in anthrax toxin receptor expression in humans," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    14. Parul Johri & Wolfgang Stephan & Jeffrey D Jensen, 2022. "Soft selective sweeps: Addressing new definitions, evaluating competing models, and interpreting empirical outliers," PLOS Genetics, Public Library of Science, vol. 18(2), pages 1-12, February.
    15. Simone Rubinacci & Olivier Delaneau & Jonathan Marchini, 2020. "Genotype imputation using the Positional Burrows Wheeler Transform," PLOS Genetics, Public Library of Science, vol. 16(11), pages 1-19, November.
    16. Garud, Nandita R. & Rosenberg, Noah A., 2015. "Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps," Theoretical Population Biology, Elsevier, vol. 102(C), pages 94-101.
    17. Rouzine, Igor M. & Coffin, John M., 2010. "Multi-site adaptation in the presence of infrequent recombination," Theoretical Population Biology, Elsevier, vol. 77(3), pages 189-204.
    18. Sam Tallman & Maria das Dores Sungo & Sílvio Saranga & Sandra Beleza, 2023. "Whole genomes from Angola and Mozambique inform about the origins and dispersals of major African migrations," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    19. Victoria L. Sork & Shawn J. Cokus & Sorel T. Fitz-Gibbon & Aleksey V. Zimin & Daniela Puiu & Jesse A. Garcia & Paul F. Gugger & Claudia L. Henriquez & Ying Zhen & Kirk E. Lohmueller & Matteo Pellegrin, 2022. "High-quality genome and methylomes illustrate features underlying evolutionary success of oaks," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    20. Max Lundberg & Alexander Mackintosh & Anna Petri & Staffan Bensch, 2023. "Inversions maintain differences between migratory phenotypes of a songbird," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
