The beta-binomial distribution for estimating the number of false rejections in microarray gene expression studies

Authors

  • Hunt, Daniel L.
  • Cheng, Cheng
  • Pounds, Stanley

Abstract

In differential expression analysis of microarray data, it is common to assume independence among the null hypotheses (and thus among gene expression levels). Under this assumption, the number of false rejections V follows a binomial distribution, which leads to an estimator of the empirical false discovery rate (eFDR). Here, V is instead modeled with the beta-binomial distribution, and an estimator of the beta-binomial false discovery rate (bbFDR) is derived. This approach accounts for how correlation among non-differentially expressed genes influences the distribution of V. Permutations are used to generate observed values of V under the null hypotheses, and a beta-binomial distribution is fit to these values. In simulation studies of correlated non-differentially expressed genes, the bbFDR estimator is compared with the eFDR estimator and is found to outperform it in certain scenarios. As an example, the method is also used to compare the gene expression of soft-tissue sarcoma samples with that of normal-tissue samples.
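
The step of fitting a beta-binomial distribution to permutation values of V can be illustrated with a minimal sketch. The abstract does not say how the fit is carried out or how the bbFDR estimator is formed, so everything below is an assumption made for illustration: the fit uses the method of moments, `n_tests` (the number of hypotheses tested) is taken as the binomial size, and the function names and the toy counts are hypothetical. The sketch relies on the standard beta-binomial variance, n p (1 - p) (1 + (n - 1) rho), where rho = 1 / (alpha + beta + 1) is the intra-class correlation and rho = 0 recovers the binomial case implied by independence.

```python
from statistics import mean, variance


def fit_beta_binomial_from_moments(m, s2, n_tests):
    """Method-of-moments beta-binomial fit (an illustrative choice,
    not necessarily the fitting procedure used in the paper).

    m, s2   : mean and variance of the permutation values of V
    n_tests : number of hypotheses tested, used as the binomial size
              (an assumption of this sketch)
    Returns (alpha, beta, rho).
    """
    p = m / n_tests                          # per-test false rejection rate
    binom_var = n_tests * p * (1.0 - p)      # variance under independence
    if binom_var <= 0.0:
        raise ValueError("mean of V must lie strictly between 0 and n_tests")
    # Solve s2 = binom_var * (1 + (n_tests - 1) * rho) for rho.
    rho = (s2 / binom_var - 1.0) / (n_tests - 1)
    if not 0.0 < rho < 1.0:
        raise ValueError("no overdispersion detected; a binomial model suffices")
    scale = 1.0 / rho - 1.0                  # equals alpha + beta
    return p * scale, (1.0 - p) * scale, rho


def fit_beta_binomial(v_values, n_tests):
    """Fit from raw permutation counts using the sample mean and variance."""
    return fit_beta_binomial_from_moments(mean(v_values), variance(v_values), n_tests)


if __name__ == "__main__":
    # Hypothetical false rejection counts from five permutations of 100 tests.
    v_perm = [10, 30, 20, 5, 35]
    alpha, beta, rho = fit_beta_binomial(v_perm, n_tests=100)
    inflation = 1.0 + (100 - 1) * rho        # variance relative to binomial
    print(f"alpha={alpha:.3f} beta={beta:.3f} rho={rho:.4f} inflation={inflation:.2f}")
```

A fitted rho well above zero indicates that V varies more across permutations than the binomial model allows, which is the situation the beta-binomial model is meant to capture. A moment fit is used here only because it has a closed form; a likelihood-based fit would be an equally reasonable choice.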

Suggested Citation

  • Hunt, Daniel L. & Cheng, Cheng & Pounds, Stanley, 2009. "The beta-binomial distribution for estimating the number of false rejections in microarray gene expression studies," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1688-1700, March.
  • Handle: RePEc:eee:csdana:v:53:y:2009:i:5:p:1688-1700

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(08)00025-X
    Download Restriction: Full text for ScienceDirect subscribers only.


    Cited by:

    1. Lyra, M. & Paha, J. & Paterlini, S. & Winker, P., 2010. "Optimization heuristics for determining internal rating grading scales," Computational Statistics & Data Analysis, Elsevier, vol. 54(11), pages 2693-2706, November.
    2. David E. Giles, 2012. "Exact Asymptotic Goodness-of-Fit Testing For Discrete Circular Data, With Applications," Econometrics Working Papers 1201, Department of Economics, University of Victoria.
