Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations


Author

Jenna Wong
(Harvard Medical School & Harvard Pilgrim Health Care Institute)
Daniel Prieto-Alhambra
(NDORMS, University of Oxford & Erasmus University Medical Center)
Peter R. Rijnbeek
(Erasmus University Medical Center)
Rishi J. Desai
(Harvard Medical School)
Jenna M. Reps
(Janssen Research & Development, LLC)
Sengwee Toh
(Harvard Medical School & Harvard Pilgrim Health Care Institute)

Abstract

Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.

Suggested Citation

Jenna Wong & Daniel Prieto-Alhambra & Peter R. Rijnbeek & Rishi J. Desai & Jenna M. Reps & Sengwee Toh, 2022. "Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations," Drug Safety, Springer, vol. 45(5), pages 493-510, May.

Handle: RePEc:spr:drugsa:v:45:y:2022:i:5:d:10.1007_s40264-022-01158-3
DOI: 10.1007/s40264-022-01158-3



Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
- Susan Athey & Guido W. Imbens & Stefan Wager, 2016. "Approximate Residual Balancing: De-Biased Inference of Average Treatment Effects in High Dimensions," Papers 1604.07125, arXiv.org, revised Jan 2018.
S Ariane Christie & Amanda S Conroy & Rachael A Callcut & Alan E Hubbard & Mitchell J Cohen, 2019. "Dynamic multi-outcome prediction after injury: Applying adaptive machine learning for precision medicine in trauma," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-13, April.
Waverly Wei & Maya Petersen & Mark J van der Laan & Zeyu Zheng & Chong Wu & Jingshen Wang, 2023. "Efficient targeted learning of heterogeneous treatment effects for multiple subgroups," Biometrics, The International Biometric Society, vol. 79(3), pages 1934-1946, September.
Michael Rosenblum & Nicholas P. Jewell & Mark van der Laan & Stephen Shiboski & Ariane van der Straten & Nancy Padian, 2009. "Analysing direct effects in randomized trials with secondary interventions: an application to human immunodeficiency virus prevention trials," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 172(2), pages 443-465, April.
Victor Chernozhukov & Whitney K. Newey & Victor Quintas-Martinez & Vasilis Syrgkanis, 2021. "Automatic Debiased Machine Learning via Riesz Regression," Papers 2104.14737, arXiv.org, revised Mar 2024.
Paul Frédéric Blanche & Anders Holt & Thomas Scheike, 2023. "On logistic regression with right censored data, with or without competing risks, and its use for estimating treatment effects," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 29(2), pages 441-482, April.
Jun Wang & Yahe Yu, 2024. "Improved estimation of average treatment effects under covariate‐adaptive randomization methods," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 78(2), pages 310-333, May.
Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
Stitelman Ori M & van der Laan Mark J., 2010. "Collaborative Targeted Maximum Likelihood for Time to Event Data," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-46, June.
Martin Huber & Michael Lechner & Giovanni Mellace, 2016. "The Finite Sample Performance of Estimators for Mediation Analysis Under Sequential Conditional Independence," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(1), pages 139-160, January.
- Huber, Martin & Mellace, Giovanni & Lechner, Michael, 2014. "The finite sample performance of estimators for mediation analysis under sequential conditional independence," Economics Working Paper Series 1415, University of St. Gallen, School of Economics and Political Science, revised Nov 2014.
Gruber Susan & van der Laan Mark J., 2010. "A Targeted Maximum Likelihood Estimator of a Causal Effect on a Bounded Continuous Outcome," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-18, August.
Kara E. Rudolph & Jonathan Levy & Mark J. van der Laan, 2021. "Transporting stochastic direct and indirect effects to new populations," Biometrics, The International Biometric Society, vol. 77(1), pages 197-211, March.
Gruber Susan & van der Laan Mark J., 2010. "An Application of Collaborative Targeted Maximum Likelihood Estimation in Causal Inference and Genomics," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-31, May.
Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
- Knaus, Michael C., 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Economics Working Paper Series 2004, University of St. Gallen, School of Economics and Political Science.
- Knaus, Michael C., 2020. "Double Machine Learning Based Program Evaluation under Unconfoundedness," IZA Discussion Papers 13051, Institute of Labor Economics (IZA).
- Michael C. Knaus, 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Papers 2003.03191, arXiv.org, revised Jun 2022.
Antonelli Joseph & Cefalu Matthew, 2020. "Averaging causal estimators in high dimensions," Journal of Causal Inference, De Gruyter, vol. 8(1), pages 92-107, January.
Tuglus Catherine & van der Laan Mark J., 2011. "Repeated Measures Semiparametric Regression Using Targeted Maximum Likelihood Methodology with Application to Transcription Factor Activity Discovery," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-31, January.
Yuya Sasaki & Takuya Ura & Yichong Zhang, 2022. "Unconditional quantile regression with high‐dimensional data," Quantitative Economics, Econometric Society, vol. 13(3), pages 955-978, July.
- Yuya Sasaki & Takuya Ura & Yichong Zhang, 2020. "Unconditional Quantile Regression with High Dimensional Data," Papers 2007.13659, arXiv.org, revised Feb 2022.
Zhang, Yingheng & Li, Haojie & Ren, Gang, 2025. "Analysing the role of traffic volume as mediator in transport policy evaluation with causal mediation analysis and targeted learning," Transportation Research Part A: Policy and Practice, Elsevier, vol. 192(C).
Iván Díaz & Elizabeth Colantuoni & Daniel F. Hanley & Michael Rosenblum, 2019. "Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 439-468, July.

