Inverse Reinforcement Learning Using Just Classification and a Few Regressions

Author

Listed:
  • Lars van der Laan
  • Nathan Kallus
  • Aurélien Bibaut

Abstract

Inverse reinforcement learning (IRL) aims to explain observed behavior by uncovering an underlying reward. In the maximum-entropy or Gumbel-shocks-to-reward frameworks, this amounts to fitting a reward function and a soft value function that together satisfy the soft Bellman consistency condition and maximize the likelihood of observed actions. While this perspective has had enormous impact in imitation learning for robotics and understanding dynamic choices in economics, practical learning algorithms often involve delicate inner-loop optimization, repeated dynamic programming, or adversarial training, all of which complicate the use of modern, highly expressive function approximators like neural nets and boosting. We revisit softmax IRL and show that the population maximum-likelihood solution is characterized by a linear fixed-point equation involving the behavior policy. This observation reduces IRL to two off-the-shelf supervised learning problems: probabilistic classification to estimate the behavior policy, and iterative regression to solve the fixed point. The resulting method is simple and modular across function approximation classes and algorithms. We provide a precise characterization of the optimal solution, a generic oracle-based algorithm, finite-sample error bounds, and empirical results showing competitive or superior performance to MaxEnt IRL.
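
To make the reduction described in the abstract concrete, here is a minimal tabular sketch in Python. It is not the authors' algorithm. It assumes one common identifying normalization (a hypothetical "anchor" action whose reward is zero in every state), which may differ from the normalization used in the paper. Under that assumption, the soft Bellman equations give a fixed point that is linear in the value function, V(s) = -log pi(anchor | s) + gamma * E[V(s') | s, anchor], and the reward follows as r(s, a) = log pi(a | s) + V(s) - gamma * E[V(s') | s, a]. For simplicity the behavior policy and transitions are treated as known; with data, the policy would come from a probabilistic classifier and each fixed-point update would be a regression. All function names and numbers below are illustrative.

```python
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x))) over a list of floats.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def expected(P, s, a, V):
    # E[V(s') | s, a] under the tabular transition kernel P[s][a][s'].
    return sum(p * v for p, v in zip(P[s][a], V))

def soft_value_iteration(r, P, gamma, iters=500):
    # Forward problem, used only to generate a soft-optimal behavior policy.
    n_s, n_a = len(r), len(r[0])
    V = [0.0] * n_s
    for _ in range(iters):
        Q = [[r[s][a] + gamma * expected(P, s, a, V) for a in range(n_a)]
             for s in range(n_s)]
        V = [logsumexp(Q[s]) for s in range(n_s)]
    pi = [[math.exp(Q[s][a] - V[s]) for a in range(n_a)] for s in range(n_s)]
    return pi, V

def recover_reward(pi, P, gamma, anchor=0, iters=500):
    # Assumption: r(s, anchor) = 0 for every state s.
    n_s, n_a = len(pi), len(pi[0])
    V = [0.0] * n_s
    for _ in range(iters):
        # Linear fixed-point update; a regression step when working from data.
        V = [-math.log(pi[s][anchor]) + gamma * expected(P, s, anchor, V)
             for s in range(n_s)]
    r = [[math.log(pi[s][a]) + V[s] - gamma * expected(P, s, a, V)
          for a in range(n_a)] for s in range(n_s)]
    return r, V

# Toy problem with two states and two actions; action 0 is the anchor.
r_true = [[0.0, 1.0], [0.0, -0.5]]
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.7, 0.3]]]
gamma = 0.9

pi, _ = soft_value_iteration(r_true, P, gamma)
r_hat, V_hat = recover_reward(pi, P, gamma)
```

In this toy setting `r_hat` should agree with `r_true` up to numerical error, because the generating reward satisfies the anchor normalization. With a different normalization the recovered reward would differ while implying the same behavior policy.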

Suggested Citation

  • Lars van der Laan & Nathan Kallus & Aurélien Bibaut, 2025. "Inverse Reinforcement Learning Using Just Classification and a Few Regressions," Papers 2509.21172, arXiv.org.
  • Handle: RePEc:arx:papers:2509.21172

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2509.21172
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hu, Yingyao & Xin, Yi, 2024. "Identification and estimation of dynamic structural models with unobserved choices," Journal of Econometrics, Elsevier, vol. 242(2).
    2. Sebastian Galiani & Juan Pantano, 2021. "Structural Models: Inception and Frontier," NBER Working Papers 28698, National Bureau of Economic Research, Inc.
    3. Haoying Wang & Guohui Wu, 2022. "Modeling discrete choices with large fine-scale spatial data: opportunities and challenges," Journal of Geographical Systems, Springer, vol. 24(3), pages 325-351, July.
    4. Joao Macieira, 2010. "Oblivious Equilibrium in Dynamic Discrete Games," 2010 Meeting Papers 680, Society for Economic Dynamics.
    5. Olivier De Groote, 2025. "Dynamic Effort Choice in High School: Costs and Benefits of an Academic Track," Journal of Labor Economics, University of Chicago Press, vol. 43(2), pages 467-502.
    6. Arcidiacono, Peter & Miller, Robert A., 2020. "Identifying dynamic discrete choice models off short panels," Journal of Econometrics, Elsevier, vol. 215(2), pages 473-485.
    7. Cheng Chou & Geert Ridder & Ruoyao Shi, 2024. "Identification and Estimation of Nonstationary Dynamic Binary Choice Models," Working Papers 202402, University of California at Riverside, Department of Economics.
    8. Myrto Kalouptsidi & Paul T. Scott & Eduardo Souza-Rodrigues, 2018. "Linear IV Regression Estimators for Structural Dynamic Discrete Choice Models," NBER Working Papers 25134, National Bureau of Economic Research, Inc.
    9. Schiraldi, Pasquale & Levy, Matthew R., 2021. "Identification of Dynamic Discrete-Continuous Choice Models, with an Application to Consumption-Savings-Retirement," CEPR Discussion Papers 15719, C.E.P.R. Discussion Papers.
    10. Rambha, Tarun & Nozick, Linda K. & Davidson, Rachel, 2021. "Modeling hurricane evacuation behavior using a dynamic discrete choice framework," Transportation Research Part B: Methodological, Elsevier, vol. 150(C), pages 75-100.
    11. Pamela Giustinelli & Matthew D. Shapiro, 2024. "SeaTE: Subjective Ex Ante Treatment Effect of Health on Retirement," American Economic Journal: Applied Economics, American Economic Association, vol. 16(2), pages 278-317, April.
    12. Kalouptsidi, Myrto & Scott, Paul T. & Souza-Rodrigues, Eduardo, 2021. "Linear IV regression estimators for structural dynamic discrete choice models," Journal of Econometrics, Elsevier, vol. 222(1), pages 778-804.
    13. repec:spo:wpmain:info:hdl:2441/7svo6civd6959qvmn4965cth1d is not listed on IDEAS
    14. Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," SciencePo Working papers Main hal-03568184, HAL.
    15. Bruneel-Zupanc, Christophe Alain, 2021. "Discrete-Continuous Dynamic Choice Models: Identification and Conditional Choice Probability Estimation," TSE Working Papers 21-1185, Toulouse School of Economics (TSE).
    16. An, Yonghong & Hu, Yingyao & Xiao, Ruli, 2021. "Dynamic decisions under subjective expectations: A structural analysis," Journal of Econometrics, Elsevier, vol. 222(1), pages 645-675.
    17. Khai Xiang Chiong & Alfred Galichon & Matt Shum, 2021. "Duality in dynamic discrete-choice models," Papers 2102.06076, arXiv.org, revised Feb 2021.
    18. Fabio A. Miessi Sanches & Daniel Silva Junior & Sorawoot Srisuma, 2014. "Ordinary Least Squares Estimation for a Dynamic Game," Working Papers, Department of Economics 2014_19, University of São Paulo (FEA-USP), revised 23 Feb 2015.
    19. Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," SciencePo Working papers hal-03568184, HAL.
    20. Kalouptsidi, Myrto & Scott, Paul T. & Souza-Rodrigues, Eduardo, 2018. "Linear IV Regression Estimators for Structural Dynamic Discrete Choice Models," CEPR Discussion Papers 13240, C.E.P.R. Discussion Papers.
    21. Khai Chiong & Alfred Galichon & Matt Shum, 2015. "Duality in Dynamic Discrete Choice Models," Post-Print hal-03568184, HAL.
