Increasing Power for Tests of Genetic Association in the Presence of Phenotype and/or Genotype Error by Use of Double-Sampling

Author

Listed:
  • Gordon Derek

    (Rockefeller University)

  • Yang Yaning

    (Rockefeller University)

  • Haynes Chad

    (Rockefeller University)

  • Finch Stephen J

    (Stony Brook University)

  • Mendell Nancy R

    (Stony Brook University)

  • Brown Abraham M

    (Burke Medical Research Institute)

  • Haroutunian Vahram

    (Mount Sinai School of Medicine)

Abstract

Phenotype and/or genotype misclassification can (i) significantly increase type II error probabilities in genetic case/control association tests, decreasing statistical power, and (ii) produce inaccurate estimates of population frequency parameters. We present a method, the likelihood ratio test allowing for errors (LRTae), that incorporates double-sample information for phenotypes and/or genotypes on a sub-sample of cases/controls. Population frequency parameters and misclassification probabilities are estimated using a double-sample procedure implemented via the Expectation-Maximization (EM) method. We perform null simulations assuming a SNP marker or a 4-allele (multi-allele) marker locus. To compare our method with the standard method that makes no adjustment for errors (LRTstd), we perform power simulations using a 2^k factorial design with high and low settings of: case/control samples, phenotype/genotype costs, double-sampled phenotype/genotype costs, phenotype/genotype error, and proportions of double-sampled individuals. All power simulations are performed with costs fixed to be equal for the LRTstd and LRTae methods. We also consider case/control ApoE genotype data from an actual Alzheimer's study.

The LRTae method maintains correct type I error proportions for all null simulations and all significance level thresholds (10%, 5%, 1%). LRTae average estimates of population frequencies and misclassification probabilities are equal to the true values, with variances of 10e-7 to 10e-8. For power simulations, the median power difference LRTae-LRTstd at the 5% significance level is 0.06 for multi-allele data and 0.01 for SNP data. For the ApoE data example, the LRTae and LRTstd p-values are 5.8 x 10e-5 and 1.6 x 10e-3, respectively. The increase in significance is due to the adjustment in the LRTae for misclassification of the most commonly reported risk allele. We have developed freely available software that computes our LRTae statistic.
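As a rough illustration of the unadjusted baseline the abstract calls LRTstd, the following sketch computes a generic likelihood ratio (G) statistic for independence between case/control status and allele counts in a contingency table. It is not the authors' software and does not implement the LRTae error adjustment or the EM step. The function names `lrt_statistic` and `p_value_df1` and the allele counts are hypothetical, chosen only for the example.

```python
import math


def lrt_statistic(table):
    """Likelihood ratio (G) statistic for independence in an r x c count table.

    Rows are groups (e.g. cases, controls); columns are allele categories.
    Returns the statistic and its degrees of freedom, (r - 1) * (c - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                g += observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(col_totals) - 1)
    return 2.0 * g, df


def p_value_df1(g):
    """Asymptotic chi-square p-value, valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(g / 2.0))


# Hypothetical SNP allele counts (assumed data, not from the paper):
# rows = cases, controls; columns = allele A, allele a.
counts = [[60, 40], [45, 55]]
g, df = lrt_statistic(counts)
p = p_value_df1(g)
print(f"G = {g:.3f}, df = {df}, p = {p:.4f}")
```

For a multi-allele marker the same statistic applies with more columns, but the p-value then needs a general chi-square tail function (for example from a statistics library) rather than the one-degree-of-freedom shortcut used here.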

Suggested Citation

  • Gordon Derek & Yang Yaning & Haynes Chad & Finch Stephen J & Mendell Nancy R & Brown Abraham M & Haroutunian Vahram, 2004. "Increasing Power for Tests of Genetic Association in the Presence of Phenotype and/or Genotype Error by Use of Double-Sampling," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 3(1), pages 1-35, October.
  • Handle: RePEc:bpj:sagmbi:v:3:y:2004:i:1:n:26
    DOI: 10.2202/1544-6115.1085

    Download full text from publisher

    File URL: https://doi.org/10.2202/1544-6115.1085
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    Citations


    Cited by:

    1. Attia John & Thakkinstian Ammarin & McElduff Patrick & Milne Elizabeth & Dawson Somer & Scott Rodney J & Klerk Nicholas de & Armstrong Bruce & Thompson John, 2010. "Detecting Genotyping Error Using Measures of Degree of Hardy-Weinberg Disequilibrium," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-17, January.
    2. Borchers Bryce & Brown Marshall & McLellan Brian & Bekmetjev Airat & Tintle Nathan L, 2009. "Incorporating Duplicate Genotype Data into Linear Trend Tests of Genetic Association: Methods and Cost-Effectiveness," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-20, May.
    3. Fridley, Brooke L. & Turner, Stephen T. & Chapman, Arlene B. & Rodin, Andrei S. & Boerwinkle, Eric & Bailey, Kent R., 2008. "Reproducibility of genotypes as measured by the affymetrix GeneChip® 100K Human Mapping Array set," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5367-5374, August.
