A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the expected number of false positives does not exceed a user-supplied threshold. Among such multiple testing procedures, we derive the most powerful method, meaning the test statistic cutoffs that maximize the expected number of true positives. Unfortunately, these optimal cutoffs depend on the true unknown data generating distribution, so they cannot be used in a practical setting. We instead consider splitting the sample so that the optimal cutoffs are estimated from a portion of the data, and then testing on the remaining data using these estimated cutoffs. When the null distributions for all test statistics are the same, the obvious way to control the expected number of false positives is to use a common cutoff for all tests. In this work, we consider the common cutoff method as a benchmark multiple testing procedure. We show that in certain circumstances the use of estimated optimal cutoffs via sample splitting can dramatically outperform this benchmark method, resulting in more true discoveries while retaining Type-I error control. This paper is an updated version of the work presented in Rubin et al. (2005), later expanded upon by Wasserman and Roeder (2006).
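The sample-splitting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes one-sided tests whose statistics are standard normal under the null and normal with unit variance and a positive mean shift under the alternative. All function names (`common_cutoff`, `shifted_cutoffs`, `split_and_test`) are hypothetical, and the per-test cutoff formula is the stationary point of a Lagrangian under that normal-shift assumption, not a formula quoted from the source.

```python
import math
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # assumed common null distribution of every test statistic


def common_cutoff(m, q):
    """Benchmark: one cutoff c for all m tests with m * P0(T > c) = q."""
    return STD_NORMAL.inv_cdf(1.0 - q / m)


def null_tail_sum(cutoffs):
    """Expected number of false positives if every null hypothesis were true."""
    return sum(1.0 - STD_NORMAL.cdf(c) for c in cutoffs)


def shifted_cutoffs(shifts, q, iters=200):
    """Per-test cutoffs for statistics assumed N(shift_j, 1) under the alternative.

    Maximizing the summed power subject to null_tail_sum(cutoffs) = q gives,
    under this assumed model, c_j = shift_j / 2 + log(lam) / shift_j for
    shift_j > 0. Tests with shift_j <= 0 get an infinite cutoff (never reject).
    log(lam) is found by bisection.
    """
    def cutoffs(log_lam):
        return [s / 2.0 + log_lam / s if s > 0 else math.inf for s in shifts]

    lo, hi = -50.0, 50.0  # the tail sum decreases as log_lam increases
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if null_tail_sum(cutoffs(mid)) > q:
            lo = mid
        else:
            hi = mid
    return cutoffs(hi)  # hi always satisfies null_tail_sum <= q


def split_and_test(data, q, n_est):
    """Estimate shifts on the first n_est observations, test on the rest.

    data[j] holds the observations for hypothesis j (unit variance assumed).
    The cutoffs depend only on the estimation portion, so they are independent
    of the test statistics computed from the held-out portion.
    """
    n_test = len(data[0]) - n_est
    scale = math.sqrt(n_test)
    est_shifts = [scale * fmean(row[:n_est]) for row in data]
    cutoffs = shifted_cutoffs(est_shifts, q)
    stats = [scale * fmean(row[n_est:]) for row in data]
    return [j for j, (t, c) in enumerate(zip(stats, cutoffs)) if t > c]
```

When all estimated shifts are equal, `shifted_cutoffs` reduces to the common cutoff; the per-test cutoffs differ from the benchmark only when the estimated effects differ across tests. Under the stated assumptions, the null tail probabilities of the returned cutoffs sum to at most `q`, which is the sense in which the expected number of false positives stays controlled.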
Volume (Year): 5 (2006)
Issue (Month): 1 (August)
Contact details of provider: Web page: https://www.degruyter.com
Order Information: Web: https://www.degruyter.com/view/j/sagmb