IDEAS home Printed from https://ideas.repec.org/a/plo/pgen00/1011022.html

Methods for mediation analysis with high-dimensional DNA methylation data: Possible choices and comparisons

Author

Listed:
  • Dylan Clark-Boucher
  • Xiang Zhou
  • Jiacong Du
  • Yongmei Liu
  • Belinda L Needham
  • Jennifer A Smith
  • Bhramar Mukherjee

Abstract

Epigenetic researchers often evaluate DNA methylation (DNAm) as a potential mediator of the effect of social/environmental exposures on a health outcome. Modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large multi-ethnic cohort in the United States, while providing an R package for their seamless implementation and adoption. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model (BSLMM) and high-dimensional mediation analysis (HDMA), while the preferred methods for estimating the global mediation effect are high-dimensional linear mediation analysis (HILMA) and principal component mediation analysis (PCMA). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.

Author summary: DNA methylation is an epigenetic mechanism that regulates the expression of genes, turning them “on” or “off” to meet the needs of the cell. Changes in methylation activity are associated with both health conditions and socioeconomic factors like education and access to healthcare. Recently, researchers have been interested in whether DNA methylation may act as a link between socioeconomic disadvantage and health. Standard methods to investigate whether DNA methylation is a link, or a mediator, between disadvantage and health do not work well when there are multiple mediators (in this case, DNA methylation sites) under consideration. Our study reviews seven statistical methods for mediation analysis that can be used to analyze many methylation sites simultaneously. We compare the methods on simulated data and provide guidelines and software for their implementation. We then demonstrate how the methods can be applied to real methylation data by testing whether DNA methylation sites across the genome mediate the effect of lower educational attainment on HbA1c, an important marker of type II diabetes.
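
To make the notion of a "global mediation effect" concrete, here is a minimal sketch of the classical product-of-coefficients calculation with several mediators. It is not one of the seven methods compared in the article and does not use the authors' R package. It uses plain ordinary least squares on simulated data, which is only feasible when there are fewer mediators than samples. The function name `global_mediation_effect`, the variable names, and all simulation settings are assumptions chosen for illustration.

```python
import numpy as np


def global_mediation_effect(a, m, y):
    """Product-of-coefficients estimates via OLS (illustrative only).

    a: exposure, shape (n,); m: mediators, shape (n, p) with p < n;
    y: continuous outcome, shape (n,).
    """
    ones = np.ones(a.shape[0])

    # Mediator models: regress every mediator on the exposure (with intercept).
    x_a = np.column_stack([ones, a])
    alpha = np.linalg.lstsq(x_a, m, rcond=None)[0][1]  # one slope per mediator

    # Outcome model: regress the outcome on the exposure and all mediators jointly.
    x_full = np.column_stack([ones, a, m])
    coef = np.linalg.lstsq(x_full, y, rcond=None)[0]
    direct, beta = coef[1], coef[2:]

    # Global indirect effect: sum over mediators of alpha_j * beta_j.
    indirect = float(alpha @ beta)

    # Total effect: regress the outcome on the exposure alone.
    total = float(np.linalg.lstsq(x_a, y, rcond=None)[0][1])
    return {"direct": float(direct), "indirect": indirect, "total": total}


# Simulated example (assumed settings): 3 of 10 mediators are active,
# so the true global indirect effect is 3 * 0.5 * 0.4 = 0.6.
rng = np.random.default_rng(0)
n, p = 500, 10
alpha_true = np.zeros(p)
beta_true = np.zeros(p)
alpha_true[:3] = 0.5
beta_true[:3] = 0.4

a = rng.normal(size=n)
m = np.outer(a, alpha_true) + rng.normal(size=(n, p))
y = 0.3 * a + m @ beta_true + rng.normal(size=n)

est = global_mediation_effect(a, m, y)
print(est)
```

With OLS and a shared intercept, the estimated total effect equals the estimated direct effect plus the estimated global indirect effect. Once the number of methylation sites exceeds the sample size the outcome regression is no longer identifiable, which is the high-dimensional setting the compared methods are designed for.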

Suggested Citation

  • Dylan Clark-Boucher & Xiang Zhou & Jiacong Du & Yongmei Liu & Belinda L Needham & Jennifer A Smith & Bhramar Mukherjee, 2023. "Methods for mediation analysis with high-dimensional DNA methylation data: Possible choices and comparisons," PLOS Genetics, Public Library of Science, vol. 19(11), pages 1-26, November.
  • Handle: RePEc:plo:pgen00:1011022
    DOI: 10.1371/journal.pgen.1011022

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1011022
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1011022&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1011022?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Meng An & Haixiang Zhang, 2023. "High-Dimensional Mediation Analysis for Time-to-Event Outcomes with Additive Hazards Model," Mathematics, MDPI, vol. 11(24), pages 1-11, December.
    2. Caubet, Miguel & Samoilenko, Mariia & Drouin, Simon & Sinnett, Daniel & Krajinovic, Maja & Laverdière, Caroline & Marcil, Valérie & Lefebvre, Geneviève, 2023. "Bayesian joint modeling for causal mediation analysis with a binary outcome and a binary mediator: Exploring the role of obesity in the association between cranial radiation therapy for childhood acut," Computational Statistics & Data Analysis, Elsevier, vol. 177(C).
    3. Yi Zhao & Lexin Li & Brian S. Caffo, 2021. "Multimodal neuroimaging data integration and pathway analysis," Biometrics, The International Biometric Society, vol. 77(3), pages 879-889, September.
    4. T. Tony Cai & Zijian Guo & Yin Xia, 2023. "Statistical inference and large-scale multiple testing for high-dimensional regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1135-1171, December.
    5. Jade Xiaoqing Wang & Yimei Li & Wilburn E. Reddick & Heather M. Conklin & John O. Glass & Arzu Onar‐Thomas & Amar Gajjar & Cheng Cheng & Zhao‐Hua Lu, 2023. "A high‐dimensional mediation model for a neuroimaging mediator: Integrating clinical, neuroimaging, and neurocognitive data to mitigate late effects in pediatric cancer," Biometrics, The International Biometric Society, vol. 79(3), pages 2430-2443, September.
    6. Lulu Shang & Wei Zhao & Yi Zhe Wang & Zheng Li & Jerome J. Choi & Minjung Kho & Thomas H. Mosley & Sharon L. R. Kardia & Jennifer A. Smith & Xiang Zhou, 2023. "meQTL mapping in the GENOA study reveals genetic determinants of DNA methylation in African Americans," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    7. Qi Zhang, 2022. "High-Dimensional Mediation Analysis with Applications to Causal Gene Identification," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 14(3), pages 432-451, December.
    8. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    9. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    10. Xu, Yang & Zhao, Shishun & Hu, Tao & Sun, Jianguo, 2021. "Variable selection for generalized odds rate mixture cure models with interval-censored failure time data," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    11. Emmanouil Androulakis & Christos Koukouvinos & Kalliopi Mylona & Filia Vonta, 2010. "A real survival analysis application via variable selection methods for Cox's proportional hazards model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(8), pages 1399-1406.
    12. Li, Chunyu & Lou, Chenxin & Luo, Dan & Xing, Kai, 2021. "Chinese corporate distress prediction using LASSO: The role of earnings management," International Review of Financial Analysis, Elsevier, vol. 76(C).
    13. Ying Huang & Shibasish Dasgupta, 2019. "Likelihood-Based Methods for Assessing Principal Surrogate Endpoints in Vaccine Trials," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(3), pages 504-523, December.
    14. Sophie Brana & Dalila Chenaf-Nicet & Delphine Lahet, 2023. "Drivers of cross-border bank claims: The role of foreign-owned banks in emerging countries," Working Papers 2023.06, International Network for Economic Research - INFER.
    15. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    16. Ni, Xiao & Zhang, Hao Helen & Zhang, Daowen, 2009. "Automatic model selection for partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 2100-2111, October.
    17. Avagyan, Vahe & Alonso Fernández, Andrés Modesto & Nogales, Francisco J., 2015. "D-trace Precision Matrix Estimation Using Adaptive Lasso Penalties," DES - Working Papers. Statistics and Econometrics. WS 21775, Universidad Carlos III de Madrid. Departamento de Estadística.
    18. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    19. Yanlin Tang & Xinyuan Song & Zhongyi Zhu, 2015. "Variable selection via composite quantile regression with dependent errors," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 69(1), pages 1-20, February.
    20. Gustavo Peralta, 2016. "The Nature of Volatility Spillovers across the International Capital Markets," CNMV Working Papers CNMV Working Papers no. 6, CNMV- Spanish Securities Markets Commission - Research and Statistics Department.
