
A Relation Analysis of Markov Decision Process Frameworks

Author

Listed:
  • Tien Mai
  • Patrick Jaillet

Abstract

We study the relations among several Markov Decision Process (MDP) frameworks from the machine learning and econometrics literatures: the standard MDP, the entropy-regularized and general regularized MDPs, and the stochastic MDP, the last of which assumes that the reward function is stochastic and follows a given distribution. We show that the entropy-regularized MDP is equivalent to a stochastic MDP model and is strictly subsumed by the general regularized MDP. Moreover, we propose a distributional stochastic MDP framework by assuming that the distribution of the reward function is ambiguous. We further show that the distributional stochastic MDP is equivalent to the regularized MDP, in the sense that they always yield the same optimal policies. We also provide a connection between stochastic/regularized MDPs and constrained MDPs. Our work gives a unified view of several important MDP frameworks, which could lead to new ways of interpreting the (entropy/general) regularized MDP frameworks through the lens of stochastic rewards and vice versa. Given the recent popularity of regularized MDPs in (deep) reinforcement learning, our work offers new insight into how such algorithmic schemes work and suggests ideas for developing new ones.
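
The equivalence between the entropy-regularized MDP and a stochastic MDP can be illustrated with a small numerical sketch. The abstract does not name the reward distribution, so the sketch assumes the classic special case of i.i.d. zero-mean Gumbel reward shocks with unit scale, for which the expected maximum of the perturbed action values equals the log-sum-exp (soft) Bellman backup of the entropy-regularized MDP. The function names, the unit temperature, and the discount factor are illustrative choices and are not taken from the paper.

```python
import numpy as np


def soft_value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Entropy-regularized (unit temperature) value iteration.

    P: transition probabilities, shape (S, A, S).
    R: expected rewards, shape (S, A).
    Returns the soft values V, soft action values Q, and softmax policy.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)                      # shape (S, A)
        m = Q.max(axis=1)
        # Numerically stable log-sum-exp over actions.
        V_new = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    m = Q.max(axis=1, keepdims=True)
    policy = np.exp(Q - m)
    policy /= policy.sum(axis=1, keepdims=True)      # softmax over actions
    return V, Q, policy


def gumbel_expected_max(q, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[max_a (q_a + eps_a)].

    Assumption: eps_a are i.i.d. Gumbel with unit scale, shifted by the
    Euler-Mascheroni constant so that each shock has mean zero.
    """
    rng = np.random.default_rng(seed)
    eps = rng.gumbel(loc=-np.euler_gamma, scale=1.0, size=(n_samples, q.size))
    return float(np.mean(np.max(q + eps, axis=1)))
```

Under this Gumbel assumption, `gumbel_expected_max(q)` should agree with `np.log(np.exp(q).sum())` up to Monte Carlo error, so the stochastic-reward backup and the entropy-regularized backup coincide state by state. Other noise distributions would lead to other regularizers, which is the direction of the general regularized and distributional results summarized in the abstract.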

Suggested Citation

  • Tien Mai & Patrick Jaillet, 2020. "A Relation Analysis of Markov Decision Process Frameworks," Papers 2008.07820, arXiv.org.
  • Handle: RePEc:arx:papers:2008.07820

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2008.07820
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Anas, Alex, 1983. "Discrete choice theory, information theory and the multinomial logit and gravity models," Transportation Research Part B: Methodological, Elsevier, vol. 17(1), pages 13-23, February.
    2. Garud N. Iyengar, 2005. "Robust Dynamic Programming," Mathematics of Operations Research, INFORMS, vol. 30(2), pages 257-280, May.
    3. Vinit Kumar Mishra & Karthik Natarajan & Dhanesh Padmanabhan & Chung-Piaw Teo & Xiaobo Li, 2014. "On Theoretical and Empirical Aspects of Marginal Distribution Choice Models," Management Science, INFORMS, vol. 60(6), pages 1511-1531, June.
    4. Karthik Natarajan & Miao Song & Chung-Piaw Teo, 2009. "Persistency Model and Its Applications in Choice Modeling," Management Science, INFORMS, vol. 55(3), pages 453-469, March.

    Citations

    Cited by:

    1. Mai, Tien & Yu, Xinlian & Gao, Song & Frejinger, Emma, 2021. "Routing policy choice prediction in a stochastic network: Recursive model and solution algorithm," Transportation Research Part B: Methodological, Elsevier, vol. 151(C), pages 42-58.
    2. Siliang Zeng & Mingyi Hong & Alfredo Garcia, 2022. "Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees," Papers 2210.01282, arXiv.org, revised Mar 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhenzhen Yan & Karthik Natarajan & Chung Piaw Teo & Cong Cheng, 2022. "A Representative Consumer Model in Data-Driven Multiproduct Pricing Optimization," Management Science, INFORMS, vol. 68(8), pages 5798-5827, August.
    2. Aydın Alptekinoğlu & John H. Semple, 2016. "The Exponomial Choice Model: A New Alternative for Assortment and Price Optimization," Operations Research, INFORMS, vol. 64(1), pages 79-93, February.
    3. Wei Qi & Xinggang Luo & Xuwang Liu & Yang Yu & Zhongliang Zhang, 2019. "Product Line Pricing under Marginal Moment Model with Network Effect," Complexity, Hindawi, vol. 2019, pages 1-13, February.
    4. Guiyun Feng & Xiaobo Li & Zizhuo Wang, 2017. "Technical Note—On the Relation Between Several Discrete Choice Models," Operations Research, INFORMS, vol. 65(6), pages 1516-1525, December.
    5. David Muller & Emerson Melo & Ruben Schlotter, 2023. "A Distributionally Robust Random Utility Model," Papers 2303.05888, arXiv.org.
    6. Lin, Xiaogang & Zhou, Yong-Wu & Xie, Wei & Zhong, Yuanguang & Cao, Bin, 2020. "Pricing and Product-bundling Strategies for E-commerce Platforms with Competition," European Journal of Operational Research, Elsevier, vol. 283(3), pages 1026-1039.
    7. Qi Feng & J. George Shanthikumar & Mengying Xue, 2022. "Consumer Choice Models and Estimation: A Review and Extension," Production and Operations Management, Production and Operations Management Society, vol. 31(2), pages 847-867, February.
    8. Chikaraishi, Makoto & Nakayama, Shoichiro, 2016. "Discrete choice models with q-product random utilities," Transportation Research Part B: Methodological, Elsevier, vol. 93(PA), pages 576-595.
    9. Meng Qi & Ho‐Yin Mak & Zuo‐Jun Max Shen, 2020. "Data‐driven research in retail operations—A review," Naval Research Logistics (NRL), John Wiley & Sons, vol. 67(8), pages 595-616, December.
    10. Claudia Castaldi & Paolo Delle Site & Francesco Filippi, 2019. "Stochastic user equilibrium in the presence of state dependence," EURO Journal on Transportation and Logistics, Springer;EURO - The Association of European Operational Research Societies, vol. 8(5), pages 535-559, December.
    11. Damla Ahipaşaoğlu, Selin & Arıkan, Uğur & Natarajan, Karthik, 2016. "On the flexibility of using marginal distribution choice models in traffic equilibrium," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 130-158.
    12. Mengshi Lu & Zuo‐Jun Max Shen, 2021. "A Review of Robust Operations Management under Model Uncertainty," Production and Operations Management, Production and Operations Management Society, vol. 30(6), pages 1927-1943, June.
    13. Yi-Chun Chen & Velibor V. Mišić, 2022. "Decision Forest: A Nonparametric Approach to Modeling Irrational Choice," Management Science, INFORMS, vol. 68(10), pages 7090-7111, October.
    14. Juan Martín & Gustavo Nombela, 2007. "Microeconomic impacts of investments in high speed trains in Spain," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 41(3), pages 715-733, September.
    15. Miren Lafourcade & Jacques-François Thisse, 2011. "New Economic Geography: The Role of Transport Costs," Chapters, in: André de Palma & Robin Lindsey & Emile Quinet & Roger Vickerman (ed.), A Handbook of Transport Economics, chapter 4, Edward Elgar Publishing.
    16. Eliasson, Jonas & Mattsson, Lars-Göran, 2000. "A model for integrated analysis of household location and travel choices," Transportation Research Part A: Policy and Practice, Elsevier, vol. 34(5), pages 375-394, June.
    17. Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust decision-making under risk and ambiguity," Papers 2104.12573, arXiv.org, revised Oct 2021.
    18. Mercure, Jean-François, 2018. "Fashion, fads and the popularity of choices: Micro-foundations for diffusion consumer theory," Structural Change and Economic Dynamics, Elsevier, vol. 46(C), pages 194-207.
    19. Ubøe, Jan & Andersson, Jonas & Jörnsten, Kurt & Lillestøl, Jostein & Sandal, Leif, 2017. "Statistical testing of bounded rationality with applications to the newsvendor model," European Journal of Operational Research, Elsevier, vol. 259(1), pages 251-261.
    20. Anas, Alex & Chang, Huibin, 2023. "Productivity benefits of urban transportation megaprojects: A general equilibrium analysis of «Grand Paris Express»," Transportation Research Part B: Methodological, Elsevier, vol. 174(C).

