Estimation of empirical null using a mixture of normals and its use in local false discovery rate
Abstract
When high-dimensional microarray data are given, it is of interest to select significant genes while controlling the Type I error at a given level. One popular error measure to control is the false discovery rate (FDR). This paper considers gene selection based on the local false discovery rate. In most previous studies, the null distribution of gene expression is assumed to be normal. However, if the null distribution has heavier tails than the normal, there may be too many false discoveries, so that the FDR is not controlled at the given level. We propose a novel procedure that enriches the class of null distributions using a mixture of normals. We present simulation studies showing that the proposed procedure is less sensitive to variation in the null distribution than the local false discovery rate with a single normal null. We also provide a real example of gene expression profiles of antigen-specific human CD8+ T-lymphocytes treated with the cytokines Interleukin-2 (IL-2) and Interleukin-15 (IL-15) for comparison.
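The sensitivity described in the abstract can be illustrated with a minimal sketch. It uses the standard two-group definition of the local false discovery rate, fdr(z) = pi0 * f0(z) / f(z), and is not the estimation procedure proposed in the paper. All numbers below (the null proportion, the mixture weights and scales, and the alternative density) are hypothetical values chosen only for illustration, and the densities are treated as known rather than estimated from data.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, weights, means, sds):
    """Density of a finite mixture of normals evaluated at z."""
    return sum(w * normal_pdf(z, m, s) for w, m, s in zip(weights, means, sds))


def local_fdr(z, pi0, null_pdf, marginal_pdf):
    """Two-group local fdr: pi0 * f0(z) / f(z), capped at 1."""
    return min(1.0, pi0 * null_pdf(z) / marginal_pdf(z))


# Hypothetical setup (assumptions, not values from the paper).
PI0 = 0.9                      # proportion of null genes
NULL_WEIGHTS = [0.8, 0.2]      # heavy-tailed null: scale mixture of normals
NULL_MEANS = [0.0, 0.0]
NULL_SDS = [1.0, 3.0]


def true_null(z):
    return mixture_pdf(z, NULL_WEIGHTS, NULL_MEANS, NULL_SDS)


def alternative(z):
    return normal_pdf(z, mean=4.0, sd=1.0)


def marginal(z):
    # Marginal density of all z-values under the two-group model.
    return PI0 * true_null(z) + (1.0 - PI0) * alternative(z)


z = 3.5
# Misspecified null: a single standard normal.
fdr_single = local_fdr(z, PI0, normal_pdf, marginal)
# Null modelled as a mixture of normals (here equal to the true null).
fdr_mixture = local_fdr(z, PI0, true_null, marginal)

print(f"z = {z}: fdr with single normal null = {fdr_single:.4f}")
print(f"z = {z}: fdr with mixture null       = {fdr_mixture:.4f}")
```

In this toy setting the single-normal null understates the null density in the tail, so it reports a smaller local fdr at the same z and would declare more genes significant than the mixture null does.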
Bibliographic Info
Article provided by Elsevier in its journal Computational Statistics & Data Analysis.
Volume (Year): 55 (2011)
Issue (Month): 7 (July)
Contact details of provider:
Web page: http://www.elsevier.com/locate/csda
Keywords: Local false discovery rate; Normal mixture; Sparsity; Gene selection
References:
- Pounds, Stan & Rai, Shesh N., 2009. "Assumption adequacy averaging as a concept for developing more robust methods for differential gene expression analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1604-1612, March.
- Allison, David B. & Gadbury, Gary L. & Heo, Moonseong & Fernandez, Jose R. & Lee, Cheol-Koo & Prolla, Tomas A. & Weindruch, Richard, 2002. "A mixture model approach for the analysis of microarray gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 39(1), pages 1-20, March.
- van der Laan, Mark J. & Birkner, Merrill D. & Hubbard, Alan E., 2005. "Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 4(1), pages 1-32, October.
- Robin, Stephane & Bar-Hen, Avner & Daudin, Jean-Jacques & Pierre, Laurent, 2007. "A semi-parametric approach for mixture models: Application to local false discovery rate estimation," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 5483-5493, August.
- Efron, Bradley, 2004. "Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 96-104, January.
- Storey, John D., 2002. "A direct approach to false discovery rates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(3), pages 479-498.