Optimising precision and power by machine learning in randomised trials with ordinal and time‐to‐event outcomes with an application to COVID‐19

Author

Nicholas Williams
Michael Rosenblum
Iván Díaz

Abstract

The rapid finding of effective therapeutics requires efficient use of available resources in clinical trials. Covariate adjustment can yield statistical estimates with improved precision, resulting in a reduction in the number of participants required to draw futility or efficacy conclusions. We focus on time‐to‐event and ordinal outcomes. When more than a few baseline covariates are available, a key question for covariate adjustment in randomised studies is how to fit a model relating the outcome and the baseline covariates to maximise precision. We present a novel theoretical result establishing conditions for asymptotic normality of a variety of covariate‐adjusted estimators that rely on machine learning (e.g., ℓ1‐regularisation, Random Forests, XGBoost, and Multivariate Adaptive Regression Splines [MARS]), under the assumption that outcome data are missing completely at random. We further present a consistent estimator of the asymptotic variance. Importantly, the conditions do not require the machine learning methods to converge to the true outcome distribution conditional on baseline variables, as long as they converge to some (possibly incorrect) limit. We conducted a simulation study to evaluate the performance of the aforementioned prediction methods in COVID‐19 trials. Our simulation is based on resampling longitudinal data from over 1500 patients hospitalised with COVID‐19 at Weill Cornell Medicine New York Presbyterian Hospital. We found that using ℓ1‐regularisation led to estimators and corresponding hypothesis tests that control type 1 error and are more precise than an unadjusted estimator across all sample sizes tested. We also show that when covariates are not prognostic of the outcome, ℓ1‐regularisation remains as precise as the unadjusted estimator, even at small sample sizes (n = 100). We give an R package adjrct that performs model‐robust covariate adjustment for ordinal and time‐to‐event outcomes.
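
As a rough illustration of why covariate adjustment can improve precision, the following Python sketch simulates a two‐arm randomised trial with a continuous outcome and a single prognostic baseline covariate, then compares the unadjusted difference in means with a regression‐standardised estimator. It is a deliberately simplified stand‐in and not the paper's method: the paper concerns ordinal and time‐to‐event outcomes with machine learning fits, and the adjrct package is not used here. All function names, the data‐generating model, and the parameter values (n, effect, beta) are assumptions made for illustration.

```python
import random
import statistics


def simulate_trial(n, effect, beta, rng):
    """Simulate one trial: covariate x, 1:1 randomised arm a, outcome y.

    Assumed (hypothetical) model: y = effect * a + beta * x + noise.
    """
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        a = 1 if rng.random() < 0.5 else 0
        y = effect * a + beta * x + rng.gauss(0.0, 1.0)
        rows.append((x, a, y))
    return rows


def unadjusted(rows):
    """Difference in mean outcome between the treated and control arms."""
    y1 = [y for _, a, y in rows if a == 1]
    y0 = [y for _, a, y in rows if a == 0]
    return statistics.fmean(y1) - statistics.fmean(y0)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def adjusted(rows):
    """Standardisation: fit a line within each arm, predict the outcome for
    every participant under each arm, and difference the average predictions."""
    all_x = [x for x, _, _ in rows]
    arm_means = {}
    for arm in (0, 1):
        xs = [x for x, a, _ in rows if a == arm]
        ys = [y for _, a, y in rows if a == arm]
        intercept, slope = fit_line(xs, ys)
        arm_means[arm] = statistics.fmean([intercept + slope * x for x in all_x])
    return arm_means[1] - arm_means[0]


def compare(n=200, effect=0.5, beta=1.0, reps=500, seed=1):
    """Monte Carlo comparison of the two estimators over repeated trials."""
    rng = random.Random(seed)
    unadj, adj = [], []
    for _ in range(reps):
        rows = simulate_trial(n, effect, beta, rng)
        unadj.append(unadjusted(rows))
        adj.append(adjusted(rows))
    return {
        "unadjusted_mean": statistics.fmean(unadj),
        "adjusted_mean": statistics.fmean(adj),
        "unadjusted_sd": statistics.stdev(unadj),
        "adjusted_sd": statistics.stdev(adj),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.3f}")
```

Under this assumed model, both estimators should centre near the simulated effect, with a smaller standard deviation for the adjusted one. Setting beta=0 makes the covariate non‐prognostic, in which case the two standard deviations should be close.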

Suggested Citation

Nicholas Williams & Michael Rosenblum & Iván Díaz, 2022. "Optimising precision and power by machine learning in randomised trials with ordinal and time‐to‐event outcomes with an application to COVID‐19," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2156-2178, October.

Handle: RePEc:bla:jorssa:v:185:y:2022:i:4:p:2156-2178
DOI: 10.1111/rssa.12915
