Bayesian Learning of Noisy Markov Decision Processes

Author

Sumeetpal S. Singh (Crest)
Nicolas Chopin (Crest)
Nick Whiteley (Crest)

Abstract

This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
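
To make the setting concrete, here is a minimal sketch of the kind of data the abstract describes: state-action pairs produced by a controller that acts on the optimal value function with noise. It is not the authors' model or sampler. The toy two-state MDP, the additive Gaussian noise on the action values, and the names `value_iteration` and `noisy_action` are all assumptions made for illustration, and the Bayesian inference step itself is not shown.

```python
import random

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Return optimal action values Q[s][a] for a finite MDP.

    P[s][a] is a list of next-state probabilities, R[s][a] a reward.
    """
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states
    while True:
        # Bellman optimality backup for every state-action pair.
        Q = [[R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
              for a in range(n_actions)] for s in range(n_states)]
        V_new = [max(q) for q in Q]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return Q
        V = V_new

def noisy_action(q_row, sigma, rng):
    """Pick the argmax of action values perturbed by Gaussian noise.

    The noise model is an illustrative assumption, not taken from the paper.
    """
    noisy = [q + rng.gauss(0.0, sigma) for q in q_row]
    return max(range(len(noisy)), key=noisy.__getitem__)

# Hypothetical two-state, two-action MDP (all numbers are made up).
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.7, 0.3], [0.1, 0.9]]]
R = [[0.0, 1.0],
     [0.5, 2.0]]

Q = value_iteration(P, R)
rng = random.Random(0)
state = 0
pairs = []  # observed (state, action) pairs
for _ in range(200):
    action = noisy_action(Q[state], sigma=0.5, rng=rng)
    pairs.append((state, action))
    state = rng.choices([0, 1], weights=P[state][action])[0]
```

In the problem the abstract poses, only `pairs` would be observed, and the goal would be a posterior distribution over the unknown optimal value function rather than a single estimate.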

Suggested Citation

Sumeetpal S. Singh & Nicolas Chopin & Nick Whiteley, 2010. "Bayesian Learning of Noisy Markov Decision Processes," Working Papers 2010-36, Center for Research in Economics and Statistics.

Handle: RePEc:crs:wpaper:2010-36

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Andrew Ching & Susumu Imai & Masakazu Ishihara & Neelam Jain, 2012. "A practitioner’s guide to Bayesian estimation of discrete choice dynamic programming models," Quantitative Marketing and Economics (QME), Springer, vol. 10(2), pages 151-196, June.
- Andrew Ching & Susumu Imai & Masakazu Ishihara & Neelam Jain, 2009. "A Practitioner's Guide To Bayesian Estimation Of Discrete Choice Dynamic Programming Models," Working Paper 1201, Economics Department, Queen's University.
Kasahara, Hiroyuki & Shimotsu, Katsumi, 2008. "Pseudo-likelihood estimation and bootstrap inference for structural discrete Markov decision models," Journal of Econometrics, Elsevier, vol. 146(1), pages 92-106, September.
Aguirregabiria, Victor & Mira, Pedro, 2010. "Dynamic discrete choice structural models: A survey," Journal of Econometrics, Elsevier, vol. 156(1), pages 38-67, May.
- Victor Aguirregabiria & Pedro Mira, 2007. "Dynamic Discrete Choice Structural Models: A Survey," Working Papers tecipa-297, University of Toronto, Department of Economics.
- Víctor Aguirregabiria & Pedro Mira, 2007. "Dynamic Discrete Choice Structural Models: A Survey," Working Papers wp2007_0711, CEMFI.
Hiroyuki Kasahara & Katsumi Shimotsu, 2018. "Estimation of Discrete Choice Dynamic Programming Models," The Japanese Economic Review, Japanese Economic Association, vol. 69(1), pages 28-58, March.
- Hiroyuki Kasahara & Katsumi Shimotsu, 2018. "Estimation of Discrete Choice Dynamic Programming Models," The Japanese Economic Review, Springer, vol. 69(1), pages 28-58, March.
Daniel Ackerberg, 2009. "A new use of importance sampling to reduce computational burden in simulation estimation," Quantitative Marketing and Economics (QME), Springer, vol. 7(4), pages 343-376, December.
- Daniel A. Ackerberg, 2001. "A New Use of Importance Sampling to Reduce Computational Burden in Simulation Estimation," NBER Technical Working Papers 0273, National Bureau of Economic Research, Inc.
Ricardo A. Daziano & Martin Achtnicht, 2014. "Forecasting Adoption of Ultra-Low-Emission Vehicles Using Bayes Estimates of a Multinomial Probit Model and the GHK Simulator," Transportation Science, INFORMS, vol. 48(4), pages 671-683, November.
Christopher Ferrall, 2023. "Object Oriented (Dynamic) Programming: Closing the “Structural” Estimation Coding Gap," Computational Economics, Springer;Society for Computational Economics, vol. 62(3), pages 761-816, October.
Aguirregabiria, Victor & Magesan, Arvind, 2013. "Euler Equations for the Estimation of Dynamic Discrete Choice Structural," MPRA Paper 46056, University Library of Munich, Germany.
Andriy Norets, 2010. "Continuity and differentiability of expected value functions in dynamic discrete choice models," Quantitative Economics, Econometric Society, vol. 1(2), pages 305-322, November.
Peter Arcidiacono & Robert A. Miller, 2011. "Conditional Choice Probability Estimation of Dynamic Discrete Choice Models With Unobserved Heterogeneity," Econometrica, Econometric Society, vol. 79(6), pages 1823-1867, November.
George‐Levi Gayle & Limor Golan & Mehmet A. Soytas, 2018. "Estimation of dynastic life‐cycle discrete choice models," Quantitative Economics, Econometric Society, vol. 9(3), pages 1195-1241, November.
- George-Levi Gayle & Limor Golan & Mehmet A. Soytas, 2015. "Estimation of Dynastic Life-Cycle Discrete Choice Models," Working Papers 2015-20, Federal Reserve Bank of St. Louis.
Ji, Yongjie & Rabotyagov, Sergey & Kling, Catherine L., 2014. "Crop Choice and Rotational Effects: A Dynamic Model of Land Use in Iowa in Recent Years," 2014 Annual Meeting, July 27-29, 2014, Minneapolis, Minnesota 170366, Agricultural and Applied Economics Association.
Hanming Fang & Yang Wang, 2015. "Estimating Dynamic Discrete Choice Models With Hyperbolic Discounting, With An Application To Mammography Decisions," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 56(2), pages 565-596, May.
- Hanming Fang & Yang Wang, 2010. "Estimating Dynamic Discrete Choice Models with Hyperbolic Discounting, with an Application to Mammography Decisions," PIER Working Paper Archive 10-033, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
- Hanming Fang & Yang Wang, 2010. "Estimating Dynamic Discrete Choice Models with Hyperbolic Discounting, with an Application to Mammography Decisions," NBER Working Papers 16438, National Bureau of Economic Research, Inc.
Amoroso, S., 2013. "Heterogeneity of innovative, collaborative, and productive firm-level processes," Other publications TiSEM f5784a49-7053-401d-855d-1, Tilburg University, School of Economics and Management.
Arthur Charpentier & Romuald Élie & Carl Remlinger, 2023. "Reinforcement Learning in Economics and Finance," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 425-462, June.
Nikhil Agarwal & Itai Ashlagi & Michael A. Rees & Paulo Somaini & Daniel Waldinger, 2021. "Equilibrium Allocations Under Alternative Waitlist Designs: Evidence From Deceased Donor Kidneys," Econometrica, Econometric Society, vol. 89(1), pages 37-76, January.
Patrick Bajari & C. Lanier Benkard & Jonathan Levin, 2007. "Estimating Dynamic Models of Imperfect Competition," Econometrica, Econometric Society, vol. 75(5), pages 1331-1370, September.
- Jonathan Levin (Stanford University) & Pat Bajari & Lanier Benkard, 2004. "Estimating Dynamic Models of Imperfect Competition," Econometric Society 2004 North American Winter Meetings 627, Econometric Society.
- J. Levin & P. Bajari, 2004. "Estimating Dynamic Models of Imperfect Competition," 2004 Meeting Papers 579, Society for Economic Dynamics.
- Bajari, Patrick & Benkard, C. Lanier & Levin, Jonathan, 2007. "Estimating Dynamic Models of Imperfect Competition," Research Papers 1852r1, Stanford University, Graduate School of Business.
- Patrick Bajari & C. Lanier Benkard & Jonathan Levin, 2004. "Estimating Dynamic Models of Imperfect Competition," NBER Working Papers 10450, National Bureau of Economic Research, Inc.
Houser, Daniel, 2003. "Bayesian analysis of a dynamic stochastic model of labor supply and saving," Journal of Econometrics, Elsevier, vol. 113(2), pages 289-335, April.
Raja Chakir & Olivier Parent, 2009. "Determinants of land use changes: A spatial multinomial probit approach," Papers in Regional Science, Wiley Blackwell, vol. 88(2), pages 327-344, June.
- Olivier Parent & Raja Chakir, 2008. "Determinants of land use changes: a spatial multinomial probit approach," University of Cincinnati, Economics Working Papers Series 2008-06, University of Cincinnati, Department of Economics.
Blevins, Jason R. & Kim, Minhae, 2024. "Nested Pseudo likelihood estimation of continuous-time dynamic discrete games," Journal of Econometrics, Elsevier, vol. 238(2).
- Jason R. Blevins & Minhae Kim, 2021. "Nested Pseudo Likelihood Estimation of Continuous-Time Dynamic Discrete Games," Papers 2108.02182, arXiv.org, revised Jan 2023.
