
BRASS: Permutation methods for binary traits in genetic association studies with structured samples

Author

Listed:
  • Joelle Mbatchou
  • Mark Abney
  • Mary Sara McPeek

Abstract

In genetic association analysis of complex traits, permutation testing can be a valuable tool for assessing significance when the distribution of the test statistic is unknown or not well-approximated. This commonly arises, e.g., in tests of gene-set, pathway or genome-wide significance, or when the statistic is formed by machine learning or data adaptive methods. Existing applications include eQTL mapping, association testing with rare variants, inclusion of admixed individuals in genetic association analysis, and epistasis detection among many others. For genetic association testing in samples with population structure and/or relatedness, use of naive permutation can lead to inflated type 1 error. To address this in quantitative traits, the MVNpermute method was developed. However, for association mapping of a binary trait, the relationship between the mean and variance makes both naive permutation and the MVNpermute method invalid. We propose BRASS, a permutation method for binary traits, for use in association mapping in structured samples. In addition to modeling structure in the sample, BRASS allows for covariates, ascertainment and simultaneous testing of multiple markers, and it accommodates a wide range of test statistics. In simulation studies, we compare BRASS to other permutation and resampling-based methods in a range of scenarios that include population structure, familial relatedness, ascertainment and phenotype model misspecification. In these settings, we demonstrate the superior control of type 1 error by BRASS compared to the other 6 methods considered. We apply BRASS to assess genome-wide significance for association analyses in domestic dog for elbow dysplasia (ED) and idiopathic epilepsy (IE). 
For both traits we detect previously identified associations, and in addition, for ED, we detect significant association with a SNP on chromosome 35 that was not detected by previous analyses, demonstrating the potential of the method.

Author summary: To determine whether genetic association with a trait is significant, permutation methods are an attractive and popular approach when analytic methods based on distributional assumptions are not available, e.g., when applying machine learning or data adaptive methods, or when performing a multiple testing correction, e.g., to assess region-wide or genome-wide significance in association mapping studies. Existing applications include eQTL mapping, association testing with rare variants, inclusion of admixed individuals in genetic association analysis, and detection of genetic interaction among many others. However, when there is population structure in the sample, naive permutation of the data can lead to inflated significance of the association results. For continuous traits, linear mixed-model based approaches have been proposed for permutation-based tests that can also adjust for sample structure; however, these do not remain valid when applied to binary traits, as key features of binary data are not well accounted for. We propose BRASS, a permutation-based testing method for binary data that incorporates important characteristics of binary data in the trait model, can accommodate relevant covariates and ascertainment, and adjusts for the presence of structure in the sample. In simulations, we demonstrate the superior control of type 1 error by BRASS compared to other methods, and we apply BRASS in the context of correcting for multiple testing in two genome-wide association studies in domestic dog: one for elbow dysplasia and one for idiopathic epilepsy.
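
As a point of reference for the naive permutation baseline mentioned above, the following is a minimal sketch of a max-statistic permutation test for a binary trait across several markers. It is not BRASS and not the authors' code: the function name `naive_max_stat_permutation`, the squared-correlation statistic, and the 0/1/2 genotype coding are all assumptions made for illustration. The sketch also assumes unrelated, unstructured samples with no covariates, which is exactly the setting outside of which the abstract says naive permutation can inflate type 1 error.

```python
import numpy as np

def naive_max_stat_permutation(y, G, n_perm=1000, seed=0):
    """Naive (exchangeability-assuming) permutation test over all markers.

    y : (n,) binary trait coded 0/1 (hypothetical input)
    G : (n, m) genotype matrix, e.g. minor-allele counts 0/1/2
    Returns the observed per-marker statistics and p-values adjusted
    for testing all m markers, based on the permutation maximum.
    Assumes every marker is polymorphic and the trait is not constant.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    Gc = G - G.mean(axis=0)  # center each marker column

    def stats(trait):
        # Squared sample correlation between the trait and each marker.
        r = trait - trait.mean()
        num = (Gc.T @ r) ** 2
        den = (Gc ** 2).sum(axis=0) * (r ** 2).sum()
        return num / den

    observed = stats(y)

    # Null distribution of the maximum statistic: shuffle trait labels,
    # which breaks any trait-genotype link but ignores sample structure.
    max_null = np.empty(n_perm)
    for b in range(n_perm):
        max_null[b] = stats(rng.permutation(y)).max()

    # Adjusted p-value with the usual +1 correction so it is never zero.
    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    p_adj = (1 + exceed) / (n_perm + 1)
    return observed, p_adj
```

Comparing each observed statistic to the permutation distribution of the maximum is one common way to obtain region-wide or genome-wide adjusted p-values. Because the shuffle treats all individuals as exchangeable, it does not account for population structure, relatedness, covariates, or ascertainment.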

Suggested Citation

  • Joelle Mbatchou & Mark Abney & Mary Sara McPeek, 2023. "BRASS: Permutation methods for binary traits in genetic association studies with structured samples," PLOS Genetics, Public Library of Science, vol. 19(11), pages 1-21, November.
  • Handle: RePEc:plo:pgen00:1011020
    DOI: 10.1371/journal.pgen.1011020

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011020
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1011020&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1011020?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rabe-Hesketh, Sophia & Skrondal, Anders & Pickles, Andrew, 2005. "Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects," Journal of Econometrics, Elsevier, vol. 128(2), pages 301-323, October.
    2. repec:ebl:ecbull:v:3:y:2008:i:42:p:1-13 is not listed on IDEAS
    3. Jan Brenner, 2007. "Parental Impact on Attitude Formation - A Siblings Study on Worries about Immigration," Ruhr Economic Papers 0022, Rheinisch-Westfälisches Institut für Wirtschaftsforschung, Ruhr-Universität Bochum, Universität Dortmund, Universität Duisburg-Essen.
    4. Weible, Daniela & Salamon, Petra & Christoph-Schulz, Inken B. & Peter, Guenter, 2013. "How do political, individual and contextual factors affect school milk demand? Empirical evidence from primary schools in Germany," Food Policy, Elsevier, vol. 43(C), pages 148-158.
    5. Wenjia Zhang & Ming Zhang, 2018. "Incorporating land use and pricing policies for reducing car dependence: Analytical framework and empirical evidence," Urban Studies, Urban Studies Journal Limited, vol. 55(13), pages 3012-3033, October.
    6. Peter Sivey, 2012. "The effect of waiting time and distance on hospital choice for English cataract patients," Health Economics, John Wiley & Sons, Ltd., vol. 21(4), pages 444-456, April.
    7. Guillaume Horny & Dragana Djurdjevic & Bernhard Boockmann & François Laisney, 2008. "Bayesian Estimation of Cox Models with Non-nested Random Effects: an Application to the Ratification Of ILO Conventions by Developing Countries," Annals of Economics and Statistics, GENES, issue 89, pages 193-214.
    8. Marino, Maria Francesca & Alfó, Marco, 2016. "Gaussian quadrature approximations in mixed hidden Markov models for longitudinal data: A simulation study," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 193-209.
    9. Bambio, Yiriyibin & Bouayad Agha, Salima, 2018. "Land tenure security and investment: Does strength of land right really matter in rural Burkina Faso?," World Development, Elsevier, vol. 111(C), pages 130-147.
    10. Xia, Ye-Mao & Tang, Nian-Sheng & Gou, Jian-Wei, 2016. "Generalized linear latent models for multivariate longitudinal measurements mixed with hidden Markov models," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 259-275.
    11. Massimiliano Bratti & Alfonso Miranda, 2010. "Non‐pecuniary returns to higher education: the effect on smoking intensity in the UK," Health Economics, John Wiley & Sons, Ltd., vol. 19(8), pages 906-920, August.
    12. Björn Andersson & Tao Xin, 2021. "Estimation of Latent Regression Item Response Theory Models Using a Second-Order Laplace Approximation," Journal of Educational and Behavioral Statistics, , vol. 46(2), pages 244-265, April.
    13. Sun-Joo Cho & Paul Boeck & Susan Embretson & Sophia Rabe-Hesketh, 2014. "Additive Multilevel Item Structure Models with Random Residuals: Item Modeling for Explanation and Item Generation," Psychometrika, Springer;The Psychometric Society, vol. 79(1), pages 84-104, January.
    14. Stephen Schilling & R. Bock, 2005. "High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature," Psychometrika, Springer;The Psychometric Society, vol. 70(3), pages 533-555, September.
    15. Nicolette D. Manglos-Weber, 2017. "Religious Transformations and Generalized Trust in Sub-Saharan Africa," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 133(2), pages 579-599, September.
    16. Aksoy, Ozan, 2019. "Social identity and social value orientations," SocArXiv 83rzv, Center for Open Science.
    17. Stanislav Kolenikov, 2009. "Confirmatory factor analysis using confa," Stata Journal, StataCorp LLC, vol. 9(3), pages 329-373, September.
    18. Emran, M. Shahe & Shilpi, Forhad, 2015. "Gender, Geography, and Generations: Intergenerational Educational Mobility in Post-Reform India," World Development, Elsevier, vol. 72(C), pages 362-380.
    19. Jokela, Markus & Kivimäki, Mika & Elovainio, Marko & Viikari, Jorma & Raitakari, Olli T. & Keltikangas-Järvinen, Liisa, 2009. "Urban/rural differences in body weight: Evidence for social selection and causation hypotheses in Finland," Social Science & Medicine, Elsevier, vol. 68(5), pages 867-875, March.
    20. Getinet A. Haile, 2015. "Workplace Job Satisfaction in Britain: Evidence from Linked Employer–Employee Data," LABOUR, CEIS, vol. 29(3), pages 225-242, September.
    21. Lee, Yongwoong & Rösch, Daniel & Scheule, Harald, 2016. "Accuracy of mortgage portfolio risk forecasts during financial crises," European Journal of Operational Research, Elsevier, vol. 249(2), pages 440-456.

