Printed from https://ideas.repec.org/a/plo/pone00/0314014.html

A novel family of beta mixture models for the differential analysis of DNA methylation data: An application to prostate cancer

Author

Listed:
  • Koyel Majumdar
  • Romina Silva
  • Antoinette Sabrina Perry
  • Ronald William Watson
  • Andrea Rau
  • Florence Jaffrezic
  • Thomas Brendan Murphy
  • Isobel Claire Gormley

Abstract

Identifying differentially methylated cytosine-guanine dinucleotide (CpG) sites between benign and tumour samples can assist in understanding disease. However, differential analysis of bounded DNA methylation data often requires data transformation, reducing biological interpretability. To address this, a family of beta mixture models (BMMs) is proposed that (i) objectively infers methylation state thresholds and (ii) identifies differentially methylated CpG sites (DMCs) given untransformed, beta-valued methylation data. The BMMs achieve this through model-based clustering of CpG sites and by employing parameter constraints, facilitating application to different study settings. Inference proceeds via an expectation-maximisation algorithm, with an approximate maximisation step providing tractability and computational feasibility. Performance of the BMMs is assessed through thorough simulation studies, and the BMMs are used for differential analyses of DNA methylation data from a prostate cancer study. Intuitive and biologically interpretable methylation state thresholds are inferred and DMCs are identified, including those related to genes such as GSTP1, RASSF1 and RARB, known for their role in prostate cancer development. Gene ontology analysis of the DMCs revealed significant enrichment in cancer-related pathways, demonstrating the utility of BMMs to reveal biologically relevant insights. An R package betaclust facilitates widespread use of BMMs.
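
The general idea of clustering beta-valued data with an expectation-maximisation algorithm can be illustrated with a minimal Python sketch. It rests on several assumptions and is not the betaclust API or the authors' implementation: the function names `beta_logpdf` and `fit_beta_mixture` are hypothetical, the data are a single vector of methylation values with no parameter constraints across samples, and weighted moment matching stands in for the approximate maximisation step, whose details the abstract does not give.

```python
import math
import numpy as np

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at each element of x, with x in (0, 1)."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1.0) * np.log(x) + (b - 1.0) * np.log1p(-x) - log_norm

def fit_beta_mixture(x, k=3, n_iter=200, tol=1e-8):
    """EM-style fit of a k-component beta mixture (illustrative only).

    The M-step uses weighted method-of-moments estimates, an assumption
    made for this sketch rather than the paper's approximation.
    """
    # Keep values strictly inside (0, 1) so the log terms stay finite.
    x = np.clip(np.asarray(x, dtype=float), 1e-6, 1.0 - 1e-6)

    # Initialise responsibilities by splitting the sorted values into k groups.
    resp = np.zeros((x.size, k))
    for j, idx in enumerate(np.array_split(np.argsort(x), k)):
        resp[idx, j] = 1.0

    prev_ll = -np.inf
    for _ in range(n_iter):
        # Approximate M-step: match each component's weighted mean and variance.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / x.size
        mean = resp.T @ x / nk
        var = (resp * (x[:, None] - mean) ** 2).sum(axis=0) / nk
        # A beta variance must lie below mean * (1 - mean).
        var = np.clip(var, 1e-8, mean * (1.0 - mean) * 0.999)
        common = mean * (1.0 - mean) / var - 1.0
        alpha, beta = mean * common, (1.0 - mean) * common

        # E-step: posterior probability of each component for each value.
        log_joint = np.column_stack([
            math.log(weights[j]) + beta_logpdf(x, alpha[j], beta[j])
            for j in range(k)
        ])
        log_mix = np.logaddexp.reduce(log_joint, axis=1)
        resp = np.exp(log_joint - log_mix[:, None])

        # Stop once the log-likelihood no longer changes appreciably.
        ll = log_mix.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, alpha, beta, resp
```

Calling `fit_beta_mixture(values, k=3)` on an array of beta values returns the mixing weights, the two shape parameters per component, and the responsibility matrix; taking `resp.argmax(axis=1)` gives a hard cluster label per value. Because moment matching does not maximise the expected log-likelihood exactly, the log-likelihood is not guaranteed to increase at every iteration, which is why the stopping rule checks the absolute change.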

Suggested Citation

  • Koyel Majumdar & Romina Silva & Antoinette Sabrina Perry & Ronald William Watson & Andrea Rau & Florence Jaffrezic & Thomas Brendan Murphy & Isobel Claire Gormley, 2024. "A novel family of beta mixture models for the differential analysis of DNA methylation data: An application to prostate cancer," PLOS ONE, Public Library of Science, vol. 19(12), pages 1-21, December.
  • Handle: RePEc:plo:pone00:0314014
    DOI: 10.1371/journal.pone.0314014

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314014
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0314014&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Konstantin Schildknecht & Sven Olek & Thorsten Dickhaus, 2015. "Simultaneous Statistical Inference for Epigenetic Data," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-15, May.
    2. Ernst R. Berndt & Bronwyn H. Hall & Robert E. Hall & Jerry A. Hausman, 1974. "Estimation and Inference in Nonlinear Structural Models," NBER Chapters, in: Annals of Economic and Social Measurement, Volume 3, number 4, pages 653-665, National Bureau of Economic Research, Inc.

