BONuS: Multiple Multivariate Testing with a Data-Adaptive Test Statistic

Author

  • Yang, Chiao-Yu (UC Berkeley)
  • Lei, Lihua (Stanford U)
  • Ho, Nhat (UT Austin)
  • Fithian, William (UC Berkeley)

Abstract

We propose a new adaptive empirical Bayes framework, the Bag-Of-Null-Statistics (BONuS) procedure, for multiple testing where each hypothesis testing problem is itself multivariate or nonparametric. BONuS is an adaptive and interactive knockoff-type method that improves testing power while controlling the false discovery rate (FDR), and is closely connected to the "counting knockoffs" procedure analyzed in Weinstein et al. (2017). Unlike procedures that start with a p-value for each hypothesis, our method analyzes the entire data set to adaptively estimate an optimal p-value transform based on an empirical Bayes model. Despite the extra adaptivity, our method controls FDR in finite samples even if the empirical Bayes model is incorrect or the estimation is poor. An extension, the Double BONuS procedure, validates the empirical Bayes model to guard against power loss due to model misspecification.
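
As a rough illustration of the knockoff-type thresholding the abstract refers to, the Python sketch below compares a set of observed test statistics against a "bag" of statistics drawn under the null and picks the most liberal threshold whose estimated false discovery proportion stays below a target level. It is not the BONuS procedure itself: the test statistic here is fixed in advance rather than estimated adaptively, and the form of the estimate, (m / (n + 1)) * (1 + number of null statistics above t) / (number of test statistics above t), is an assumption based on the usual counting-knockoffs construction rather than something stated on this page. The function names `counting_threshold` and `reject` are hypothetical.

```python
def counting_threshold(test_stats, null_stats, alpha):
    """Smallest observed test statistic t whose estimated FDP is <= alpha.

    Assumed estimate (not taken from this page):
        fdp_hat(t) = m / (n + 1) * (1 + #{null >= t}) / max(1, #{test >= t})
    where m = number of hypotheses and n = number of null statistics.
    Returns None when no threshold meets the target level.
    """
    m = len(test_stats)
    n_null = len(null_stats)
    best = None
    # Scan candidate thresholds from most to least stringent and keep the
    # last (smallest) one that satisfies the bound.
    for t in sorted(set(test_stats), reverse=True):
        rejections = sum(1 for s in test_stats if s >= t)
        nulls_above = sum(1 for s in null_stats if s >= t)
        fdp_hat = (m / (n_null + 1)) * (1 + nulls_above) / max(rejections, 1)
        if fdp_hat <= alpha:
            best = t
    return best


def reject(test_stats, null_stats, alpha):
    """Indices of hypotheses whose statistic is at or above the threshold."""
    t = counting_threshold(test_stats, null_stats, alpha)
    if t is None:
        return []
    return [i for i, s in enumerate(test_stats) if s >= t]


# Toy data (made up for illustration): three large statistics, two small ones,
# and nine statistics simulated under the null.
test_stats = [5.0, 4.0, 3.0, 0.5, 0.2]
null_stats = [0.1, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
print(reject(test_stats, null_stats, alpha=0.2))
```

In this toy setup, larger statistics are treated as stronger evidence against the null. An adaptive method in the spirit of the abstract would additionally learn which statistic to compute from the pooled data before thresholding.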

Suggested Citation

  • Yang, Chiao-Yu & Lei, Lihua & Ho, Nhat & Fithian, William, 2022. "BONuS: Multiple Multivariate Testing with a Data-Adaptive Test Statistic," Research Papers 4031, Stanford University, Graduate School of Business.
  • Handle: RePEc:ecl:stabus:4031
    DOI: 10.48550/arXiv.2106.15743

    Download full text from publisher

    File URL: https://doi.org/10.48550/arXiv.2106.15743
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cecilia Pessoa Rodrigues & Aindrila Chatterjee & Meike Wiese & Thomas Stehle & Witold Szymanski & Maria Shvedunova & Asifa Akhtar, 2021. "Histone H4 lysine 16 acetylation controls central carbon metabolism and diet-induced obesity in mice," Nature Communications, Nature, vol. 12(1), pages 1-21, December.
    2. Jordi Manuello & Joosung Min & Paul McCarthy & Fidel Alfaro-Almagro & Soojin Lee & Stephen Smith & Lloyd T. Elliott & Anderson M. Winkler & Gwenaëlle Douaud, 2024. "The effects of genetic and modifiable risk factors on brain regions vulnerable to ageing and disease," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    3. Parsa Akbari & Olukayode A. Sosina & Jonas Bovijn & Karl Landheer & Jonas B. Nielsen & Minhee Kim & Senem Aykul & Tanima De & Mary E. Haas & George Hindy & Nan Lin & Ian R. Dinsmore & Jonathan Z. Luo , 2022. "Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    4. Sonali Pechlivanis & Susanne Moebus & Nils Lehmann & Raimund Erbel & Amir A Mahabadi & Per Hoffmann & Karl-Heinz Jöckel & Markus M Nöthen & Hagen S Bachmann & on behalf of the Heinz Nixdorf Recall Stu, 2020. "Genetic risk scores for coronary artery disease and its traditional risk factors: Their role in the progression of coronary artery calcification—Results of the Heinz Nixdorf Recall study," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-18, May.
    5. Hazewinkel, Audinga-Dea & Richmond, Rebecca C. & Wade, Kaitlin H. & Dixon, Padraig, 2022. "Mendelian randomization analysis of the causal impact of body mass index and waist-hip ratio on rates of hospital admission," Economics & Human Biology, Elsevier, vol. 44(C).
    6. Wang, Jiangzhou & Cui, Tingting & Zhu, Wensheng & Wang, Pengfei, 2023. "Covariate-modulated large-scale multiple testing under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    7. Jaakko Pehkonen & Jutta Viinikainen & Jaana T. Kari & Petri Böckerman & Terho Lehtimäki & Olli Raitakari, 2021. "Birth weight and adult income: An examination of mediation through adult height and body mass," Health Economics, John Wiley & Sons, Ltd., vol. 30(10), pages 2383-2398, September.
    8. Saaket Agrawal & Minxian Wang & Marcus D. R. Klarqvist & Kirk Smith & Joseph Shin & Hesam Dashti & Nathaniel Diamant & Seung Hoan Choi & Sean J. Jurgens & Patrick T. Ellinor & Anthony Philippakis & Me, 2022. "Inherited basis of visceral, abdominal subcutaneous and gluteofemoral fat depots," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    9. Jeng, X. Jessie & Chen, Xiongzhi, 2019. "Predictor ranking and false discovery proportion control in high-dimensional regression," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 163-175.
    10. Tingting Cui & Pengfei Wang & Wensheng Zhu, 2021. "Covariate-adjusted multiple testing in genome-wide association studies via factorial hidden Markov models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 737-757, September.
    11. Shiyun Chen & Ery Arias-Castro, 2021. "On the power of some sequential multiple testing procedures," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(2), pages 311-336, April.
    12. Jacob Joseph & Chang Liu & Qin Hui & Krishna Aragam & Zeyuan Wang & Brian Charest & Jennifer E. Huffman & Jacob M. Keaton & Todd L. Edwards & Serkalem Demissie & Luc Djousse & Juan P. Casas & J. Micha, 2022. "Genetic architecture of heart failure with preserved versus reduced ejection fraction," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    13. Vikesh Amin & Jere R. Behrman & Jason M. Fletcher & Carlos A. Flores & Alfonso Flores‐Lagunes & Hans‐Peter Kohler, 2021. "Genetic risks, adolescent health, and schooling attainment," Health Economics, John Wiley & Sons, Ltd., vol. 30(11), pages 2905-2920, November.
    14. Saaket Agrawal & Marcus D. R. Klarqvist & Nathaniel Diamant & Takara L. Stanley & Patrick T. Ellinor & Nehal N. Mehta & Anthony Philippakis & Kenney Ng & Melina Claussnitzer & Steven K. Grinspoon & Pu, 2023. "BMI-adjusted adipose tissue volumes exhibit depot-specific and divergent associations with cardiometabolic diseases," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    15. Lili Liu & Atlas Khan & Elena Sanchez-Rodriguez & Francesca Zanoni & Yifu Li & Nicholas Steers & Olivia Balderes & Junying Zhang & Priya Krithivasan & Robert A. LeDesma & Clara Fischman & Scott J. Heb, 2022. "Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    16. A Bottle & P Aylin, 2011. "Predicting the false alarm rate in multi-institution mortality monitoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 62(9), pages 1711-1718, September.
    17. Ebrahimi, Nader, 2008. "Simultaneous control of false positives and false negatives in multiple hypotheses testing," Journal of Multivariate Analysis, Elsevier, vol. 99(3), pages 437-450, March.
    18. X. Jessie Jeng & Huimin Peng & Wenbin Lu, 2021. "Model Selection With Mixed Variables on the Lasso Path," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(1), pages 170-184, May.
    19. Kai Wang, 2014. "Testing Genetic Association by Regressing Genotype over Multiple Phenotypes," PLOS ONE, Public Library of Science, vol. 9(9), pages 1-9, September.
    20. B. Moerkerke & E. Goetghebeur & J. De Riek & I. Roldán‐Ruiz, 2006. "Significance and impotence: towards a balanced view of the null and the alternative hypotheses in marker selection for plant breeding," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(1), pages 61-79, January.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
