Printed from https://ideas.repec.org/a/eee/thpobi/v164y2025icp1-11.html

Mathematical bounds on r² and the effect size in case-control genome-wide association studies

Authors
  • Paye, Sanjana M.
  • Edge, Michael D.

Abstract

Case-control genome-wide association studies (GWAS) are often used to find associations between genetic variants and diseases. When case-control GWAS are conducted, researchers must make decisions regarding how many cases and how many controls to include in the study. Connections between variants and diseases are made using association statistics, including χ². Previous work in population genetics has shown that LD statistics, including r², are bounded by the allele frequencies in the population being studied. Since varying the case fraction changes sample allele frequencies, we use the known bounds on r² to explore how the fraction of cases included in a study can affect statistical power to detect associations. We analyze a simple mathematical model and use simulations to study a quantity proportional to the χ² noncentrality parameter, which is closely related to r², under various conditions. Varying the case fraction changes the χ² noncentrality parameter, and by extension the statistical power, with effects depending on the dominance, penetrance, and frequency of the risk allele. Our framework explains previously observed results, such as asymmetries in power to detect risk vs. protective alleles, and the fact that a balanced sample of cases and controls does not always give the best power to detect associations, particularly for highly penetrant minor risk alleles that are either dominant or recessive. We show by simulation that our results can be used as a rough guide to statistical power for association tests other than χ² tests of independence.
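
The relationship the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' model, which also accounts for dominance and penetrance. It assumes a plain allelic 2×2 table (allele by case/control status), treats the case fraction `phi` as the "allele frequency" of the disease-status variable, and uses the standard result that the Pearson χ² statistic of a 2×2 table equals N·r². The function names and the grid search are hypothetical and chosen only for illustration.

```python
def r2_upper_bound(p_a: float, p_b: float) -> float:
    """Largest r^2 attainable between two binary variables with
    frequencies p_a and p_b (the classical frequency-based LD bound)."""
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    d_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)   # largest positive D
    d_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))   # size of most negative D
    d_max = max(d_pos, d_neg)
    return d_max ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


def r2_case_control(phi: float, p_case: float, p_control: float) -> float:
    """r^2 between allele and case status in a sample with case fraction phi.

    Assumption (illustrative only): p_case and p_control are the allele
    frequencies among cases and controls, and alleles are counted
    independently (allelic test).
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    p = phi * p_case + (1 - phi) * p_control        # sample allele frequency
    if p <= 0.0 or p >= 1.0:
        return 0.0                                  # allele is monomorphic
    d = phi * (1 - phi) * (p_case - p_control)      # covariance (the "D" term)
    return d ** 2 / (p * (1 - p) * phi * (1 - phi))


def best_case_fraction(p_case: float, p_control: float, steps: int = 1000) -> float:
    """Grid-search the case fraction that maximizes r^2 (hypothetical helper)."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda phi: r2_case_control(phi, p_case, p_control))


if __name__ == "__main__":
    # A rare allele that is very common among cases (made-up frequencies).
    print(best_case_fraction(p_case=0.9, p_control=0.01))
    # A symmetric configuration, for comparison (made-up frequencies).
    print(best_case_fraction(p_case=0.6, p_control=0.4))
```

Under these assumptions, changing `phi` shifts the sample allele frequency `p`, which moves both the attained r² and its frequency-based ceiling. This is one way to see why a balanced design need not maximize the quantity that drives χ² power.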

Suggested Citation

  • Paye, Sanjana M. & Edge, Michael D., 2025. "Mathematical bounds on r² and the effect size in case-control genome-wide association studies," Theoretical Population Biology, Elsevier, vol. 164(C), pages 1-11.
  • Handle: RePEc:eee:thpobi:v:164:y:2025:i:c:p:1-11
    DOI: 10.1016/j.tpb.2025.04.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0040580925000280
    Download Restriction: Full text for ScienceDirect subscribers only

