
Detecting high-order interactions of single nucleotide polymorphisms using genetic programming

Authors

  • Nunkesser, Robin
  • Bernholt, Thorsten
  • Schwender, Holger
  • Ickstadt, Katja
  • Wegener, Ing

Abstract

Motivation: Complex diseases such as cancer are assumed to be caused not by individual single nucleotide polymorphisms (SNPs) but by high-order interactions of SNPs. One of the major goals of genetic association studies concerned with such genotype data is therefore the identification of these high-order interactions. The search is made harder by the fact that these interactions often explain the disease for only a relatively small subgroup of patients. Most of the feature selection methods proposed in the literature fail at this task, because they either identify only individual variables or low-order interactions, or try to find rules that explain a high percentage of the observations. In this paper, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method, called GPAS (Genetic Programming for Association Studies), can be used not only for feature selection but also for discrimination.

Results: In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS identifies high-order interactions of SNPs that lead to a considerably increased breast cancer risk for different subsets of patients and that are not found by other feature selection methods. An application to a subset of the HapMap data shows that GPAS is not restricted to association studies comprising several tens of SNPs, but can also be employed to analyze whole-genome data.
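
The abstract does not spell out how GPAS represents an interaction, so the following is only an illustrative sketch, not the authors' implementation. It assumes that genotypes are coded as 0, 1, or 2, that an interaction is a conjunction of literals of the form "SNP equals (or does not equal) a genotype", and that a candidate rule is a disjunction of such conjunctions. The names `literal_holds`, `expression_holds`, and `coverage`, and the toy data, are hypothetical. The sketch shows how a rule can be specific to a small subgroup of cases while covering no controls.

```python
from typing import List, Sequence, Tuple

# Assumed encoding (not taken from the paper): each SNP is 0, 1, or 2.
# A literal is (snp_index, genotype, negated); negated=True means "SNP != genotype".
Literal = Tuple[int, int, bool]
Monomial = List[Literal]      # conjunction (AND) of literals = one interaction
Expression = List[Monomial]   # disjunction (OR) of monomials = one candidate rule


def literal_holds(sample: Sequence[int], lit: Literal) -> bool:
    """Check a single multi-valued literal against one sample."""
    idx, genotype, negated = lit
    return (sample[idx] != genotype) if negated else (sample[idx] == genotype)


def expression_holds(sample: Sequence[int], expr: Expression) -> bool:
    """True if at least one monomial has all of its literals satisfied."""
    return any(all(literal_holds(sample, lit) for lit in mono) for mono in expr)


def coverage(samples: Sequence[Sequence[int]],
             labels: Sequence[int],
             expr: Expression) -> Tuple[int, int]:
    """Count (cases, controls) for which the expression holds; label 1 = case."""
    hits = [y for s, y in zip(samples, labels) if expression_holds(s, expr)]
    cases = sum(1 for y in hits if y == 1)
    return cases, len(hits) - cases


# Toy data: five subjects, three SNPs each (hypothetical values).
samples = [[2, 0, 1], [2, 0, 0], [0, 1, 1], [2, 1, 1], [1, 0, 2]]
labels = [1, 1, 0, 0, 1]

# Second-order interaction: SNP0 == 2 AND SNP1 == 0.
interaction = [[(0, 2, False), (1, 0, False)]]
print(coverage(samples, labels, interaction))

# Rule with two monomials: the interaction above OR SNP2 == 2.
rule = interaction + [[(2, 2, False)]]
print(coverage(samples, labels, rule))
```

In a genetic programming setting, a count pair like the one returned by `coverage` could feed a fitness function that rewards rules covering many cases and few controls; how GPAS actually scores candidates is described in the full text, not in this abstract.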

Suggested Citation

  • Nunkesser, Robin & Bernholt, Thorsten & Schwender, Holger & Ickstadt, Katja & Wegener, Ing, 2007. "Detecting high-order interactions of single nucleotide polymorphisms using genetic programming," Technical Reports 2007,24, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
  • Handle: RePEc:zbw:sfb475:200724

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/36598/1/600071014.PDF
    Download Restriction: no

    References listed on IDEAS

    1. Boulesteix Anne-Laure & Strobl Carolin & Weidinger Stefan & Wichmann H.-Erich & Wagenpfeil Stefan, 2007. "Multiple Testing for SNP-SNP Interactions," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 6(1), pages 1-24, December.

    Citations



    Cited by:

    1. Zhong Wang & Tian Liu & Zhenwu Lin & John Hegarty & Walter A Koltun & Rongling Wu, 2010. "A General Model for Multilocus Epistatic Interactions in Case-Control Studies," PLOS ONE, Public Library of Science, vol. 5(8), pages 1-9, August.
    2. Nunkesser, Robin & Morell, Oliver, 2008. "Evolutionary algorithms for robust methods," Technical Reports 2008,29, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    3. Rocco, Claudio M. & Hernandez-Perdomo, Elvis & Mun, Johnathan, 2021. "Application of logic regression to assess the importance of interactions between components in a network," Reliability Engineering and System Safety, Elsevier, vol. 205(C).
    4. Nunkesser, Robin, 2008. "RFreak-An R-package for evolutionary computation," Technical Reports 2008,12, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    5. Schwender, Holger & Ickstadt, Katja, 2008. "Imputing missing genotypes with weighted k nearest neighbors," Technical Reports 2008,03, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Adler, Werner & Lausen, Berthold, 2009. "Bootstrap estimated true and false positive rates and ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 718-729, January.
    2. Elsäßer Amelie & Victor Anja & Hommel Gerhard, 2011. "Multiple Testing in Candidate Gene Situations: A Comparison of Classical, Discrete, and Resampling-Based Procedures," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-21, November.
    3. Malina Magdalena & Posch Martin & Ickstadt Katja & Schwender Holger & Bogdan Małgorzata, 2014. "Detection of epistatic effects with logic regression and a classical linear regression model," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(1), pages 83-104, February.
