
Modeling the length distribution of gene conversion tracts in humans from the UK Biobank sequence data

Author

Listed:
  • Nobuaki Masaki
  • Sharon R Browning

Abstract

Non-crossover gene conversion is a type of meiotic recombination characterized by the non-reciprocal transfer of genetic material between homologous chromosomes. Gene conversions are thought to occur within relatively short tracts of DNA. In this study, we propose a statistical method to model the length distribution of gene conversion tracts in humans, using nearly one million gene conversion tracts detected from the UK Biobank whole autosome data. To handle the large number of tracts, we designed a computationally efficient inferential framework. Our method further accounts for regional variation in the density of variant sites and heterozygosity across the genome, which can influence the observed length of gene conversion tracts. We allow for multiple candidate tract length distributions and select the best fitting distribution using the Bayesian Information Criterion (BIC). Using a mixture of two geometric components for the tract length distribution, we estimate that the smaller component has a mean of 16.9 bp (95% CI: [16.4, 17.0]), and the larger component has a mean of 724.7 bp (95% CI: [720.1, 728.7]). We further estimate the proportion of tracts from the second component to be 0.00525 (95% CI: [0.005, 0.00525]). After stratifying by crossover-hotspot overlap, we infer that tracts whose midpoints lie within crossover hotspots are, on average, longer than the remaining tracts.

Author summary: Gene conversions are recombination events distinct from crossovers, in which alleles are transferred between homologous sequences within a short tract. Previous studies have investigated the lengths of gene conversion tracts using pedigree or sperm-typing data, but the number of gene conversion events that can be observed from these datasets is limited. In our study, we used almost one million detected gene conversion tracts from the ancestral history of UK Biobank participants to study the length distribution of these tracts.
Within a gene conversion tract in the transmitting parent, alleles are only converted at heterozygous sites, so we cannot observe the full length of the gene conversion tract. To account for this in our method, we model the allele conversion probability separately for each detected tract. Our method allows for various distributions of the gene conversion tract length and is computationally efficient to handle all the tracts detected from the UK Biobank whole autosome data. Fitting a two-component model to shorter detected tracts that do not exceed 1.5 kb, we estimate the means of the two components to be 16.9 bp and 724.7 bp respectively. We further estimate the proportion of tracts from the second component to be 0.00525.
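
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each geometric component is supported on {1, 2, ...} bp with mean 1/p, which the abstract does not specify, and it ignores the 1.5 kb restriction on detected tracts. The function names (`mixture_pmf`, `mixture_mean`, `observed_span`, `bic`) are hypothetical. `observed_span` only illustrates why the observed length can fall short of the true tract length: conversion is visible only at heterozygous sites inside the tract.

```python
import math
from bisect import bisect_left, bisect_right

def geometric_pmf(n, mean):
    """P(N = n) for a geometric distribution on {1, 2, ...} (assumed support)."""
    p = 1.0 / mean
    return p * (1.0 - p) ** (n - 1)

def mixture_pmf(n, mean1, mean2, w2):
    """Two-component geometric mixture; w2 is the weight of the second component."""
    return (1.0 - w2) * geometric_pmf(n, mean1) + w2 * geometric_pmf(n, mean2)

def mixture_mean(mean1, mean2, w2):
    """Mean tract length under the mixture (weighted average of component means)."""
    return (1.0 - w2) * mean1 + w2 * mean2

def observed_span(start, length, het_sites):
    """Span (bp) between the first and last heterozygous site inside the tract.

    The tract covers positions start .. start + length - 1; het_sites must be
    sorted. Returns 0 when no heterozygous site falls inside the tract.
    """
    lo = bisect_left(het_sites, start)
    hi = bisect_right(het_sites, start + length - 1)
    if lo == hi:
        return 0
    return het_sites[hi - 1] - het_sites[lo] + 1

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

if __name__ == "__main__":
    # Point estimates reported in the abstract.
    print(mixture_mean(16.9, 724.7, 0.00525))
    # A 50 bp tract starting at position 100, with hypothetical site positions.
    print(observed_span(100, 50, [90, 110, 120, 149, 150]))
```

Under these assumptions, the reported estimates imply an overall mean tract length of about 20.6 bp (0.99475 × 16.9 + 0.00525 × 724.7), and the hypothetical 50 bp tract would be observed as a 40 bp span because its outermost heterozygous sites sit at positions 110 and 149.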

Suggested Citation

  • Nobuaki Masaki & Sharon R Browning, 2025. "Modeling the length distribution of gene conversion tracts in humans from the UK Biobank sequence data," PLOS Genetics, Public Library of Science, vol. 21(11), pages 1-21, November.
  • Handle: RePEc:plo:pgen00:1011951
    DOI: 10.1371/journal.pgen.1011951

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011951
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1011951&type=printable
    Download Restriction: no

