Printed from https://ideas.repec.org/p/arx/papers/2002.08536.html

Debiased Off-Policy Evaluation for Recommendation Systems

Authors
  • Yusuke Narita
  • Shota Yasui
  • Kohei Yata

Abstract

Efficient methods to evaluate new algorithms are critical for improving interactive bandit and reinforcement learning systems such as recommendation systems. A/B tests are reliable but costly in time and money, and they entail a risk of failure. In this paper, we develop an alternative method that predicts the performance of an algorithm given historical data that may have been generated by a different algorithm. Our estimator has the property that its prediction converges in probability to the true performance of a counterfactual algorithm at a rate of $\sqrt{N}$ as the sample size $N$ increases. We also show how to correctly estimate the variance of our prediction, allowing the analyst to quantify the uncertainty in the prediction. These properties hold even when the analyst does not know which of a large number of potentially important state variables are actually important. We validate our method with a simulation experiment on reinforcement learning, and we then apply it to improve advertisement design at a major advertising company. We find that our method produces smaller mean squared errors than state-of-the-art methods.
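As context for the abstract: estimators of this debiased, $\sqrt{N}$-consistent type build on the doubly robust idea, which combines a reward model with importance weighting and also yields a simple plug-in standard error. The sketch below is an illustration of that general recipe for one-step (bandit) feedback under a known logging policy, not the authors' exact estimator; the data, the `doubly_robust_ope` helper, and the policies are all synthetic assumptions for the example.

```python
import numpy as np

def doubly_robust_ope(actions, rewards, logging_probs, target_probs, q_hat):
    """Doubly robust off-policy value estimate from logged bandit feedback.

    actions:       (N,) logged action indices
    rewards:       (N,) observed rewards
    logging_probs: (N,) probability the logging policy gave the logged action
    target_probs:  (N, K) target policy's action probabilities per context
    q_hat:         (N, K) estimated mean rewards q(x, a) per context/action
    Returns (value estimate, standard error).
    """
    n = len(rewards)
    idx = np.arange(n)
    # Direct-method term: model-based value of the target policy.
    dm = (target_probs * q_hat).sum(axis=1)
    # Importance-weighted residual: debiases the reward model's error.
    w = target_probs[idx, actions] / logging_probs
    scores = dm + w * (rewards - q_hat[idx, actions])
    # The sample variance of the per-observation scores gives the
    # standard error used to quantify uncertainty in the prediction.
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n)

# Synthetic check: two actions with known mean rewards 0.2 and 0.5,
# a uniform logging policy, and a target policy picking action 1 w.p. 0.9.
rng = np.random.default_rng(0)
n, k = 5000, 2
actions = rng.integers(0, k, size=n)
true_q = np.tile([0.2, 0.5], (n, 1))
rewards = rng.binomial(1, true_q[np.arange(n), actions]).astype(float)
logging_probs = np.full(n, 0.5)
target_probs = np.tile([0.1, 0.9], (n, 1))
v, se = doubly_robust_ope(actions, rewards, logging_probs, target_probs, true_q)
# True target-policy value is 0.1 * 0.2 + 0.9 * 0.5 = 0.47.
print(f"estimate: {v:.3f}, 95% CI half-width: {1.96 * se:.3f}")
```

In this toy run the estimate lands near the true value 0.47 with a narrow confidence interval; the paper's contribution concerns making this kind of estimator remain valid when the reward model is learned by machine learning over many potentially important state variables.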

Suggested Citation

  • Yusuke Narita & Shota Yasui & Kohei Yata, 2020. "Debiased Off-Policy Evaluation for Recommendation Systems," Papers 2002.08536, arXiv.org, revised Aug 2021.
  • Handle: RePEc:arx:papers:2002.08536

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2002.08536
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Victor Chernozhukov & Whitney K. Newey & Victor Quintas-Martinez & Vasilis Syrgkanis, 2021. "Automatic Debiased Machine Learning via Riesz Regression," Papers 2104.14737, arXiv.org, revised Mar 2024.
    4. Zhengyuan Zhou & Susan Athey & Stefan Wager, 2023. "Offline Multi-Action Policy Learning: Generalization and Optimization," Operations Research, INFORMS, vol. 71(1), pages 148-183, January.
    5. Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust decision-making under risk and ambiguity," Papers 2104.12573, arXiv.org, revised Oct 2021.
    6. Stéphane Bonhomme & Martin Weidner, 2020. "Minimizing Sensitivity to Model Misspecification," CeMMAP working papers CWP37/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    8. Stéphane Bonhomme & Martin Weidner, 2022. "Minimizing sensitivity to model misspecification," Quantitative Economics, Econometric Society, vol. 13(3), pages 907-954, July.
    9. Karun Adusumilli & Dita Eckardt, 2019. "Temporal-Difference estimation of dynamic discrete choice models," Papers 1912.09509, arXiv.org, revised Dec 2022.
    10. Thomas H. Jørgensen, 2023. "Sensitivity to Calibrated Parameters," The Review of Economics and Statistics, MIT Press, vol. 105(2), pages 474-481, March.
    11. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    12. Pereda-Fernández, Santiago, 2023. "Identification and estimation of triangular models with a binary treatment," Journal of Econometrics, Elsevier, vol. 234(2), pages 585-623.
    13. Ben Deaner, 2021. "Many Proxy Controls," Papers 2110.03973, arXiv.org.
    14. Semenova, Vira, 2023. "Debiased machine learning of set-identified linear models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1725-1746.
    15. Victor Chernozhukov & Vira Semenova, 2018. "Simultaneous inference for Best Linear Predictor of the Conditional Average Treatment Effect and other structural functions," CeMMAP working papers CWP40/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Sokbae Lee & Ryo Okui & Yoon-Jae Whang, 2017. "Doubly robust uniform confidence band for the conditional average treatment effect function," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 32(7), pages 1207-1225, November.
    17. Victor Chernozhukov & Carlos Cinelli & Whitney Newey & Amit Sharma & Vasilis Syrgkanis, 2021. "Long Story Short: Omitted Variable Bias in Causal Machine Learning," Papers 2112.13398, arXiv.org, revised Nov 2023.
    18. Kline, Patrick & Walters, Christopher, 2019. "Audits as Evidence: Experiments, Ensembles, and Enforcement," Institute for Research on Labor and Employment, Working Paper Series qt3z72m9kn, Institute of Industrial Relations, UC Berkeley.
    19. Kai Feng & Han Hong, 2024. "Statistical Inference of Optimal Allocations I: Regularities and their Implications," Papers 2403.18248, arXiv.org, revised Apr 2024.
    20. Vasilis Syrgkanis & Ruohan Zhan, 2023. "Post-Episodic Reinforcement Learning Inference," Papers 2302.08854, arXiv.org, revised Jul 2023.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2002.08536. See general information about how to correct material in RePEc.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.