IDEAS home Printed from https://ideas.repec.org/a/taf/jnlasa/v112y2017i517p64-76.html

The Generalized Higher Criticism for Testing SNP-Set Effects in Genetic Association Studies

Author

Listed:
  • Ian Barnett
  • Rajarshi Mukherjee
  • Xihong Lin

Abstract

It is of substantial interest to study the effects of genes, genetic pathways, and networks on the risk of complex diseases. These genetic constructs each contain multiple SNPs, which are often correlated and function jointly, and might be large in number. However, only a sparse subset of SNPs in a genetic construct is generally associated with the disease of interest. In this article, we propose the generalized higher criticism (GHC) to test for the association between an SNP set and a disease outcome. The higher criticism is a test traditionally used in high-dimensional signal detection settings when marginal test statistics are independent and the number of parameters is very large. However, these assumptions do not always hold in genetic association studies, due to linkage disequilibrium among SNPs and the finite number of SNPs in the SNP set of each genetic construct. The proposed GHC overcomes the limitations of the higher criticism by allowing for arbitrary correlation structures among the SNPs in an SNP set, while performing accurate analytic p-value calculations for any finite number of SNPs in the SNP set. We obtain the detection boundary of the GHC test. Using simulations, we empirically compare the power of the GHC test with that of existing SNP-set tests over a range of genetic regions with varied correlation structures and signal sparsity. We apply the proposed methods to analyze the CGEM breast cancer genome-wide association study. Supplementary materials for this article are available online.
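
For orientation, the classical higher criticism that the abstract takes as its starting point can be sketched in its standard p-value form: the maximum, over the ordered p-values, of the standardized gap between the observed fraction i/n and its null expectation p_(i). This is a minimal sketch that assumes independent marginal p-values; it is not the GHC, and it does not reproduce the correlation adjustment or the analytic p-value calculation described above. The function name `higher_criticism` and the example p-values are illustrative assumptions, not taken from the article.

```python
import math


def higher_criticism(pvalues):
    """Classical higher criticism statistic for independent p-values.

    Illustrative helper (hypothetical name), not the GHC of the article.
    """
    p_sorted = sorted(pvalues)
    n = len(p_sorted)
    best = -math.inf
    for i, p_i in enumerate(p_sorted, start=1):
        if p_i <= 0.0 or p_i >= 1.0:
            continue  # the variance term p(1 - p) is zero here, so skip
        # Standardized difference between the empirical fraction i/n of
        # p-values at or below p_(i) and its expectation p_(i) under the null.
        z = math.sqrt(n) * (i / n - p_i) / math.sqrt(p_i * (1.0 - p_i))
        best = max(best, z)
    return best


# Evenly spread p-values mimic a null SNP set (made-up numbers).
null_like = [(i - 0.5) / 20 for i in range(1, 21)]
# Replacing two of them with very small p-values mimics a sparse signal.
sparse_signal = [1e-6, 1e-5] + null_like[2:]

print(higher_criticism(null_like))
print(higher_criticism(sparse_signal))
```

The statistic is driven by the few smallest p-values, which is why it suits the sparse-signal setting the abstract describes. Under linkage disequilibrium the independence assumption behind the standardization no longer holds, which is the limitation the GHC is proposed to address.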

Suggested Citation

  • Ian Barnett & Rajarshi Mukherjee & Xihong Lin, 2017. "The Generalized Higher Criticism for Testing SNP-Set Effects in Genetic Association Studies," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(517), pages 64-76, January.
  • Handle: RePEc:taf:jnlasa:v:112:y:2017:i:517:p:64-76
    DOI: 10.1080/01621459.2016.1192039

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2016.1192039
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. Armin Schwartzman & Xihong Lin, 2011. "The effect of correlation in false discovery rate estimation," Biometrika, Biometrika Trust, vol. 98(1), pages 199-214.
    2. Zhang, Yu & Liu, Jun S., 2011. "Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 846-857.
    3. Yu I. Ingster & Alexandre B. Tsybakov & N. Verzelen, 2010. "Detection Boundary in Sparse Regression," Working Papers 2010-28, Center for Research in Economics and Statistics.
    4. Ian J. Barnett & Xihong Lin, 2014. "Analytical p-value calculation for the higher criticism test in finite-d problems," Biometrika, Biometrika Trust, vol. 101(4), pages 964-970.
    5. Teri A. Manolio & Francis S. Collins & Nancy J. Cox & David B. Goldstein & Lucia A. Hindorff & David J. Hunter & Mark I. McCarthy & Erin M. Ramos & Lon R. Cardon & Aravinda Chakravarti & Judy H. Cho &, 2009. "Finding the missing heritability of complex diseases," Nature, Nature, vol. 461(7265), pages 747-753, October.

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Zhao, Sihai Dave & Cai, T. Tony & Li, Hongzhe, 2017. "Optimal detection of weak positive latent dependence between two sequences of multiple tests," Journal of Multivariate Analysis, Elsevier, vol. 160(C), pages 169-184.
    2. Samhita Pal & Xinge Jessie Jeng, 2026. "Discovering Candidate Genes Regulated by GWAS Signals in Cis and Trans," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 18(1), pages 131-149, March.
    3. Zhang, Hong & Wu, Zheyang, 2022. "The general goodness-of-fit tests for correlated data," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    4. Hong Zhang & Zheyang Wu, 2023. "The generalized Fisher's combination and accurate p‐value calculation under dependence," Biometrics, The International Biometric Society, vol. 79(2), pages 1159-1172, June.
    5. Zihan Zhao & Jianjun Zhang & Qiuying Sha & Han Hao, 2020. "Testing gene-environment interactions for rare and/or common variants in sequencing association studies," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-15, March.
    6. Haque Md Rejuan & Kubatko Laura, 2024. "A global test of hybrid ancestry from genome-scale data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 23(1), pages 1-18, January.
    7. Hébert, Florian & Causeur, David & Emily, Mathieu, 2021. "An adaptive decorrelation procedure for signal detection," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hélène Tonnelé & Denghui Chen & Felipe Morillo & Jorge Garcia-Calleja & Apurva S. Chitre & Benjamin B. Johnson & Thiago Missfeldt Sanches & Riyan Cheng & Marc Jan Bonder & Antonio Gonzalez & Tomasz Ko, 2025. "Genetic architecture and mechanisms of host-microbiome interactions from a multi-cohort analysis of outbred laboratory rats," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    2. Jianqing Fan & Xu Han, 2017. "Estimation of the false discovery proportion with unknown dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1143-1164, September.
    3. repec:plo:pgen00:1003258 is not listed on IDEAS
    4. Ilias Georgakopoulos-Soares & Chengyu Deng & Vikram Agarwal & Candace S. Y. Chan & Jingjing Zhao & Fumitaka Inoue & Nadav Ahituv, 2023. "Transcription factor binding site orientation and order are major drivers of gene regulatory activity," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Meida Wang & Shuanglin Zhang & Qiuying Sha, 2022. "A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS," PLOS ONE, Public Library of Science, vol. 17(4), pages 1-13, April.
    6. Ghosh Debashis, 2012. "Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-21, July.
    7. repec:plo:pone00:0083057 is not listed on IDEAS
    8. Arnab Maity & Xihong Lin, 2011. "Powerful Tests for Detecting a Gene Effect in the Presence of Possible Gene–Gene Interactions Using Garrote Kernel Machines," Biometrics, The International Biometric Society, vol. 67(4), pages 1271-1284, December.
    9. Ian W. McKeague & Min Qian, 2015. "An Adaptive Resampling Test for Detecting the Presence of Significant Predictors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1422-1433, December.
    10. Vidya S. Farook & Feroz Akhtar & Rector Arya & Alice Yau & Srinivas Mummidi & Juan C. Lopez-Alvarenga & Alvaro Diaz-Badillo & Roy Resendez & Sharon P. Fowler & Hemant Kulkarni & Vijay Golla & Mahua Ch, 2025. "Early-Life Exposure to Organic Chemical Pollutants as Assessed in Primary Teeth and Cardiometabolic Risk in Mexican American Children: A Pilot Study," IJERPH, MDPI, vol. 22(10), pages 1-16, September.
    11. repec:plo:pone00:0070774 is not listed on IDEAS
    12. Noirrit Kiran Chandra & Sourabh Bhattacharya, 2021. "Asymptotic theory of dependent Bayesian multiple testing procedures under possible model misspecification," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(5), pages 891-920, October.
    13. Lin Yuan & Chang-An Yuan & De-Shuang Huang, 2017. "FAACOSE: A Fast Adaptive Ant Colony Optimization Algorithm for Detecting SNP Epistasis," Complexity, Hindawi, vol. 2017, pages 1-10, September.
    14. Lap Sum Chan & Gen Li & Eric B. Fauman & Xianyong Yin & Markku Laakso & Michael Boehnke & Peter X. K. Song, 2025. "DrFARM: identification of pleiotropic genetic variants in genome-wide association studies," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    15. Chang Lu & Jan Zaucha & Rihab Gam & Hai Fang & Smithers & Matt E. Oates & Miguel Bernabe-Rubio & James Williams & Natalie Zelenka & Arun Prasad Pandurangan & Himani Tandon & Hashem Shihab & Raju Kalai, 2023. "Hypothesis-free phenotype prediction within a genetics-first framework," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    16. repec:plo:pone00:0118701 is not listed on IDEAS
    17. Jiangzhou Wang & Pengfei Wang, 2024. "Large-scale dependent multiple testing via hidden semi-Markov models," Computational Statistics, Springer, vol. 39(3), pages 1093-1126, May.
    18. Bingxin Zhao & Fei Zou, 2022. "On polygenic risk scores for complex traits prediction," Biometrics, The International Biometric Society, vol. 78(2), pages 499-511, June.
    19. von Stumm, Sophie & Kandaswamy, Radhika & Maxwell, Jessye, 2023. "Gene-environment interplay in early life cognitive development," Intelligence, Elsevier, vol. 98(C).
    20. Xavier Castellanos-Girouard & Adrian W. R. Serohijos & Stephen W. Michnick, 2026. "Protein-protein interactions are a major source of epistasis in genetic interaction networks," Nature Communications, Nature, vol. 17(1), pages 1-14, December.
    21. Chen, Xiongzhi, 2020. "A strong law of large numbers for simultaneously testing parameters of Lancaster bivariate distributions," Statistics & Probability Letters, Elsevier, vol. 167(C).
    22. repec:plo:pgen00:1006573 is not listed on IDEAS
    23. Celine A. Manigbas & Bharati Jadhav & Paras Garg & Mariya Shadrina & William Lee & Gabrielle Altman & Alejandro Martin-Trujillo & Andrew J. Sharp, 2024. "A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    24. repec:plo:pone00:0188566 is not listed on IDEAS
    25. Gareth Hawkes & Harrison I. W. Wright & Robin N. Beaumont & Kartik Chundru & Aimee Hanson & Leigh Jackson & Anna Murray & Kashyap Patel & Timothy M. Frayling & Caroline F. Wright & Andrew R. Wood & Mi, 2026. "Whole-genome sequencing analysis of anthropometric traits in 672,976 individuals reveals convergence between rare and common genetic associations," Nature Communications, Nature, vol. 17(1), pages 1-11, December.
    26. Anders Bredahl Kock & David Preinerstorfer, 2021. "Superconsistency of Tests in High Dimensions," Papers 2106.03700, arXiv.org, revised Jan 2022.
