
Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

Authors
  • Xiao-Lin Wu
  • Jiaqi Xu
  • Guofei Feng
  • George R Wiggans
  • Jeremy F Taylor
  • Jun He
  • Changsong Qian
  • Jiansheng Qiu
  • Barry Simpson
  • Jeremy Walker
  • Stewart Bauck

Abstract

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. 
The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNPs selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal.
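
Two ingredients named in the abstract can be illustrated with a minimal sketch: the locus-averaged Shannon entropy of a SNP set, and a Beta-guided placement of frame locations that enriches both chromosome ends. This is not the authors' implementation. The function names are hypothetical, the entropy is the textbook per-locus definition for a biallelic marker (the paper's uniformity adjustment is not reproduced), and the frame placement uses the fixed special case Beta(0.5, 0.5), whose quantile function has a closed form, instead of the tunable empirical Beta distribution the paper describes.

```python
import math


def locus_entropy(p):
    """Shannon entropy, in bits, of a biallelic SNP with allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a monomorphic locus carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def locus_averaged_entropy(freqs):
    """Mean per-locus entropy over a set of SNP allele frequencies."""
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


def arcsine_frame_positions(n_frames, chrom_length):
    """Place n_frames frame locations at quantiles of Beta(0.5, 0.5).

    Assumption: Beta(0.5, 0.5) (the arcsine distribution) stands in for the
    tunable Beta distribution. Its quantile function is sin^2(pi * u / 2),
    so frames are denser toward both ends of the chromosome.
    """
    positions = []
    for k in range(n_frames):
        u = (k + 0.5) / n_frames  # midpoint of the k-th probability bin
        x = math.sin(math.pi * u / 2.0) ** 2
        positions.append(x * chrom_length)
    return positions


if __name__ == "__main__":
    # Hypothetical allele frequencies for four candidate SNPs.
    print(locus_averaged_entropy([0.5, 0.4, 0.1, 0.25]))
    # Ten frame locations on a hypothetical chromosome of length 100 (arbitrary units).
    print([round(x, 1) for x in arcsine_frame_positions(10, 100.0)])
```

In this sketch a SNP with allele frequency near 0.5 contributes the most entropy, so a selection step that maximizes average entropy favors common variants. A tunable version of the frame placement would need the general Beta quantile function, for example `scipy.stats.beta.ppf`, with shape parameters chosen by the user.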

Suggested Citation

  • Xiao-Lin Wu & Jiaqi Xu & Guofei Feng & George R Wiggans & Jeremy F Taylor & Jun He & Changsong Qian & Jiansheng Qiu & Barry Simpson & Jeremy Walker & Stewart Bauck, 2016. "Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications," PLOS ONE, Public Library of Science, vol. 11(9), pages 1-36, September.
  • Handle: RePEc:plo:pone00:0161719
    DOI: 10.1371/journal.pone.0161719

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0161719
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0161719&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Charles‐Elie Rabier & Simona Grusea, 2021. "Prediction in high‐dimensional linear models and application to genomic selection under imperfect linkage disequilibrium," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(4), pages 1001-1026, August.
    2. Ruihan Mao & Lei Zhou & Zhaojun Wang & Jianliang Wu & Jianfeng Liu, 2023. "A Comprehensive Strategy Combining Feature Selection and Local Optimization Algorithm to Optimize the Design of Low-Density Chip for Genomic Selection," Agriculture, MDPI, vol. 13(3), pages 1-11, March.
