Inverse Reinforcement Learning Using Just Classification and a Few Regressions

Author

Listed:

Lars van der Laan
Nathan Kallus
Aurélien Bibaut

Abstract

Inverse reinforcement learning (IRL) aims to explain observed behavior by uncovering an underlying reward. In the maximum-entropy or Gumbel-shocks-to-reward frameworks, this amounts to fitting a reward function and a soft value function that together satisfy the soft Bellman consistency condition and maximize the likelihood of observed actions. While this perspective has had enormous impact in imitation learning for robotics and understanding dynamic choices in economics, practical learning algorithms often involve delicate inner-loop optimization, repeated dynamic programming, or adversarial training, all of which complicate the use of modern, highly expressive function approximators like neural nets and boosting. We revisit softmax IRL and show that the population maximum-likelihood solution is characterized by a linear fixed-point equation involving the behavior policy. This observation reduces IRL to two off-the-shelf supervised learning problems: probabilistic classification to estimate the behavior policy, and iterative regression to solve the fixed point. The resulting method is simple and modular across function approximation classes and algorithms. We provide a precise characterization of the optimal solution, a generic oracle-based algorithm, finite-sample error bounds, and empirical results showing competitive or superior performance to MaxEnt IRL.
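
To illustrate how a linear fixed point in the behavior policy can arise, here is a minimal tabular sketch. It is not taken from the paper: it assumes the anchor-action normalization r(s, a0) = 0 familiar from the dynamic discrete choice literature, and the function names, the toy MDP, and the use of exact population quantities are all hypothetical choices made for illustration. With a softmax policy pi(a|s) = exp(Q(s,a) - V(s)), that normalization gives V(s) = -log pi(a0|s) + gamma E[V(s') | s, a0], which is linear in V, and the reward then follows from r(s,a) = log pi(a|s) + V(s) - gamma E[V(s') | s, a].

```python
import numpy as np

def soft_value_iteration(r, P, gamma, iters=2000):
    """Forward problem: soft-optimal policy for reward r[s, a] and transitions P[s, a, s']."""
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        Q = r + gamma * (P @ V)              # Q[s, a] = r[s, a] + gamma * E[V(s') | s, a]
        V = np.log(np.exp(Q).sum(axis=1))    # soft (log-sum-exp) value
    Q = r + gamma * (P @ V)
    # Softmax policy pi[s, a], normalized exactly over actions.
    return np.exp(Q - np.log(np.exp(Q).sum(axis=1, keepdims=True)))

def recover_reward(pi, P, gamma, anchor=0, iters=2000):
    """Inverse problem under the ASSUMED normalization r[s, anchor] = 0."""
    V = np.zeros(pi.shape[0])
    for _ in range(iters):
        # Linear fixed point: V(s) = -log pi(anchor|s) + gamma * E[V(s') | s, anchor].
        # With data, the expectation would be a regression of V(s') on s
        # among transitions where the anchor action was taken.
        V = -np.log(pi[:, anchor]) + gamma * (P[:, anchor, :] @ V)
    # r(s, a) = log pi(a|s) + V(s) - gamma * E[V(s') | s, a]
    return np.log(pi) + V[:, None] - gamma * (P @ V)

# Toy MDP (hypothetical): 4 states, 3 actions, random transitions and rewards.
rng = np.random.default_rng(0)
S, A, gamma = 4, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a, :] is a distribution over s'
r_true = rng.normal(size=(S, A))
r_true[:, 0] = 0.0                           # impose the anchor normalization

pi = soft_value_iteration(r_true, P, gamma)  # stands in for a fitted classifier
r_hat = recover_reward(pi, P, gamma)
print(np.max(np.abs(r_hat - r_true)))        # recovery error; should be near zero
```

In this sketch the true policy and transition matrix stand in for the two supervised steps the abstract mentions: a probabilistic classifier would supply `pi`, and each application of `P` to `V` would be replaced by a regression of next-state values on the current state and action.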

Suggested Citation

Lars van der Laan & Nathan Kallus & Aurélien Bibaut, 2025. "Inverse Reinforcement Learning Using Just Classification and a Few Regressions," Papers 2509.21172, arXiv.org.

Handle: RePEc:arx:papers:2509.21172

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Hu, Yingyao & Xin, Yi, 2024. "Identification and estimation of dynamic structural models with unobserved choices," Journal of Econometrics, Elsevier, vol. 242(2).
Sebastian Galiani & Juan Pantano, 2021. "Structural Models: Inception and Frontier," NBER Working Papers 28698, National Bureau of Economic Research, Inc.
Haoying Wang & Guohui Wu, 2022. "Modeling discrete choices with large fine-scale spatial data: opportunities and challenges," Journal of Geographical Systems, Springer, vol. 24(3), pages 325-351, July.
Joao Macieira, 2010. "Oblivious Equilibrium in Dynamic Discrete Games," 2010 Meeting Papers 680, Society for Economic Dynamics.
Olivier De Groote, 2025. "Dynamic Effort Choice in High School: Costs and Benefits of an Academic Track," Journal of Labor Economics, University of Chicago Press, vol. 43(2), pages 467-502.
- De Groote, Olivier, 2019. "Dynamic Effort Choice in High School: Costs and Benefits of an Academic Track," TSE Working Papers 19-1002, Toulouse School of Economics (TSE), revised Jun 2023.
- Olivier de Groote, 2025. "Dynamic effort choice in high school: costs and benefits of an academic track," Post-Print hal-05027246, HAL.
- Olivier de Groote, 2023. "Dynamic Effort Choice in High School: Costs and Benefits of an Academic Track," Working Papers hal-04953656, HAL.
Arcidiacono, Peter & Miller, Robert A., 2020. "Identifying dynamic discrete choice models off short panels," Journal of Econometrics, Elsevier, vol. 215(2), pages 473-485.
Cheng Chou & Geert Ridder & Ruoyao Shi, 2024. "Identification and Estimation of Nonstationary Dynamic Binary Choice Models," Working Papers 202402, University of California at Riverside, Department of Economics.
Myrto Kalouptsidi & Paul T. Scott & Eduardo Souza-Rodrigues, 2018. "Linear IV Regression Estimators for Structural Dynamic Discrete Choice Models," NBER Working Papers 25134, National Bureau of Economic Research, Inc.
Schiraldi, Pasquale & Levy, Matthew R., 2021. "Identification of Dynamic Discrete-Continuous Choice Models, with an Application to Consumption-Savings-Retirement," CEPR Discussion Papers 15719, C.E.P.R. Discussion Papers.
Rambha, Tarun & Nozick, Linda K. & Davidson, Rachel, 2021. "Modeling hurricane evacuation behavior using a dynamic discrete choice framework," Transportation Research Part B: Methodological, Elsevier, vol. 150(C), pages 75-100.
Pamela Giustinelli & Matthew D. Shapiro, 2024. "SeaTE: Subjective Ex Ante Treatment Effect of Health on Retirement," American Economic Journal: Applied Economics, American Economic Association, vol. 16(2), pages 278-317, April.
- Pamela Giustinelli & Matthew D. Shapiro, 2018. "SeaTE: Subjective ex ante Treatment Effect of Health on Retirement," Working Papers wp382, University of Michigan, Michigan Retirement Research Center.
- Pamela Giustinelli & Matthew D. Shapiro, 2019. "SeaTE: Subjective ex ante Treatment Effect of Health on Retirement," NBER Working Papers 26087, National Bureau of Economic Research, Inc.
Kalouptsidi, Myrto & Scott, Paul T. & Souza-Rodrigues, Eduardo, 2021. "Linear IV regression estimators for structural dynamic discrete choice models," Journal of Econometrics, Elsevier, vol. 222(1), pages 778-804.
Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," SciencePo Working papers Main hal-03568184, HAL.
Bruneel-Zupanc, Christophe Alain, 2021. "Discrete-Continuous Dynamic Choice Models: Identification and Conditional Choice Probability Estimation," TSE Working Papers 21-1185, Toulouse School of Economics (TSE).
An, Yonghong & Hu, Yingyao & Xiao, Ruli, 2021. "Dynamic decisions under subjective expectations: A structural analysis," Journal of Econometrics, Elsevier, vol. 222(1), pages 645-675.
- Yonghong An & Yingyao Hu & Ruli Xiao, 2018. "Dynamic Decisions under Subjective Expectations: A Structural Analysis," CAEPR Working Papers 2018-001, Center for Applied Economics and Policy Research, Department of Economics, Indiana University Bloomington.
- Yonghong An & Yingyao Hu & Ruli Xiao, 2018. "Dynamic decisions under subjective expectations: a structural analysis," CeMMAP working papers CWP11/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Khai Xiang Chiong & Alfred Galichon & Matt Shum, 2021. "Duality in dynamic discrete-choice models," Papers 2102.06076, arXiv.org, revised Feb 2021.
Fabio A. Miessi Sanches & Daniel Silva Junior & Sorawoot Srisuma, 2014. "Ordinary Least Squares Estimation for a Dynamic Game," Working Papers, Department of Economics 2014_19, University of São Paulo (FEA-USP), revised 23 Feb 2015.
Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," SciencePo Working papers hal-03568184, HAL.
Kalouptsidi, Myrto & Scott, Paul T. & Souza-Rodrigues, Eduardo, 2018. "Linear IV Regression Estimators for Structural Dynamic Discrete Choice Models," CEPR Discussion Papers 13240, C.E.P.R. Discussion Papers.
Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," Post-Print hal-03568184, HAL.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2025-10-20 (Computational Economics)
NEP-INV-2025-10-20 (Investment)
