Printed from https://ideas.repec.org/a/spr/annopr/v235y2015i1p103-12810.1007-s10479-015-1935-0.html

A leader–follower partially observed, multiobjective Markov game

Author

Listed:
  • Yanling Chang
  • Alan Erera
  • Chelsea White

Abstract

The intent of this research is to generate a set of non-dominated finite-memory policies from which one of two agents (the leader) can select a most preferred policy to control a dynamic system that is also affected by the control decisions of the other agent (the follower). The problem is described by an infinite horizon total discounted reward, partially observed Markov game (POMG). For each candidate finite-memory leader policy, we assume the follower, fully aware of the leader policy, determines a (perfect memory) policy that optimizes the follower’s (scalar) criterion. The leader–follower assumption allows the POMG to be transformed into a specially structured, partially observed Markov decision process that we use to determine the follower’s best response policy for a given leader policy. We then approximate the follower’s policy by a finite-memory policy. Each agent’s policy assumes that the agent knows its current and recent state values, its recent actions, and the current and recent possibly inaccurate observations of the other agent’s state. For each leader/follower policy pair, we determine the values of the leader’s criteria. We use a multi-objective genetic algorithm to create the next generation of leader policies based on the values of the leader criteria for each leader/follower policy pair in the current generation. Based on this information for the final generation of policies, we determine the set of non-dominated leader policies. We present an example that illustrates how these results can be used to support a manager of a liquid egg production process (the leader) in selecting a sequence of actions to maximize expected process productivity while mitigating the risk due to an attacker (the follower) who seeks to contaminate the process with a chemical or biological toxin.

Copyright Springer Science+Business Media New York 2015
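
The outer loop the abstract describes, namely scoring each candidate finite-memory leader policy against the follower's best response, keeping the Pareto non-dominated leader policies, and breeding the next generation with a genetic algorithm, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `evaluate` (which in the paper would wrap the follower's POMDP best-response solve and return the leader's criterion vector) and `mutate` are hypothetical placeholders.

```python
import random

def dominates(u, v):
    # u Pareto-dominates v (maximization): u is no worse in every
    # criterion and strictly better in at least one.
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def non_dominated(pairs):
    # pairs: list of (policy, criterion_vector); keep each pair whose
    # vector is not dominated by any other vector in the list.
    return [(p, v) for p, v in pairs
            if not any(dominates(w, v) for _, w in pairs if w != v)]

def evolve(population, evaluate, mutate, generations=30):
    # Generational loop: score every leader policy (evaluate is assumed
    # to embed the follower's best response), carry the non-dominated
    # policies forward, and refill the population by mutating them.
    for _ in range(generations):
        scored = [(p, evaluate(p)) for p in population]
        parents = [p for p, _ in non_dominated(scored)]
        children = [mutate(random.choice(parents))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
    return non_dominated([(p, evaluate(p)) for p in population])
```

A real instance would encode each finite-memory policy as a mapping from recent observations and actions to an action, and `evaluate` would return the leader's expected total discounted criteria under the follower's reply; crossover and diversity-preserving selection, which standard multi-objective GAs also use, are omitted for brevity.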

Suggested Citation

  • Yanling Chang & Alan Erera & Chelsea White, 2015. "A leader–follower partially observed, multiobjective Markov game," Annals of Operations Research, Springer, vol. 235(1), pages 103-128, December.
  • Handle: RePEc:spr:annopr:v:235:y:2015:i:1:p:103-128:10.1007/s10479-015-1935-0
    DOI: 10.1007/s10479-015-1935-0

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s10479-015-1935-0
    Download Restriction: Access to full text is restricted to subscribers.

    File URL: https://libkey.io/10.1007/s10479-015-1935-0?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a copy you can access through your library subscription.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Vicki Bier & Santiago Oliveros & Larry Samuelson, 2007. "Choosing What to Protect: Strategic Defensive Allocation against an Unknown Attacker," Journal of Public Economic Theory, Association for Public Economic Theory, vol. 9(4), pages 563-587, August.
    2. Chen Wang & Vicki M. Bier, 2011. "Target-Hardening Decisions Based on Uncertain Multiattribute Terrorist Utility," Decision Analysis, INFORMS, vol. 8(4), pages 286-302, December.
    3. Vicki M. Bier & Naraphorn Haphuriwat & Jaime Menoyo & Rae Zimmerman & Alison M. Culpen, 2008. "Optimal Resource Allocation for Defense of Targets Based on Differing Measures of Attractiveness," Risk Analysis, John Wiley & Sons, vol. 28(3), pages 763-770, June.
    4. Hao Zhang, 2010. "Partially Observable Markov Decision Processes: A Geometric Technique and Analysis," Operations Research, INFORMS, vol. 58(1), pages 214-228, February.
    5. Casey Rothschild & Laura McLay & Seth Guikema, 2012. "Adversarial Risk Analysis with Incomplete Information: A Level‐k Approach," Risk Analysis, John Wiley & Sons, vol. 32(7), pages 1219-1231, July.
    6. White, Chelsea C. & White, Douglas J., 1989. "Markov decision processes," European Journal of Operational Research, Elsevier, vol. 39(1), pages 1-16, March.
    7. Jun Zhuang & Vicki M. Bier, 2007. "Balancing Terrorism and Natural Disasters---Defensive Strategy with Endogenous Attacker Effort," Operations Research, INFORMS, vol. 55(5), pages 976-991, October.
    8. Laura McLay & Casey Rothschild & Seth Guikema, 2012. "Robust Adversarial Risk Analysis: A Level-k Approach," Decision Analysis, INFORMS, vol. 9(1), pages 41-54, March.
    9. K Deb, 2001. "Nonlinear goal programming using multi-objective genetic algorithms," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 52(3), pages 291-302, March.
    10. Richard D. Smallwood & Edward J. Sondik, 1973. "The Optimal Control of Partially Observable Markov Processes over a Finite Horizon," Operations Research, INFORMS, vol. 21(5), pages 1071-1088, October.
    11. Chelsea C. White & William T. Scherer, 1989. "Solution Procedures for Partially Observed Markov Decision Processes," Operations Research, INFORMS, vol. 37(5), pages 791-797, October.
    12. Niyazi Bakır, 2011. "A Stackelberg game model for resource allocation in cargo container security," Annals of Operations Research, Springer, vol. 187(1), pages 5-22, July.
    13. Daniel S. Bernstein & Robert Givan & Neil Immerman & Shlomo Zilberstein, 2002. "The Complexity of Decentralized Control of Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 27(4), pages 819-840, November.
    14. Yanling Chang & Alan Erera & Chelsea White, 2015. "Value of information for a leader–follower partially observed Markov game," Annals of Operations Research, Springer, vol. 235(1), pages 129-153, December.
    15. Konak, Abdullah & Coit, David W. & Smith, Alice E., 2006. "Multi-objective optimization using genetic algorithms: A tutorial," Reliability Engineering and System Safety, Elsevier, vol. 91(9), pages 992-1007.
    16. M. K. Ghosh & D. McDonald & S. Sinha, 2004. "Zero-Sum Stochastic Games with Partial Information," Journal of Optimization Theory and Applications, Springer, vol. 121(1), pages 99-118, April.
    17. Keeney,Ralph L. & Raiffa,Howard, 1993. "Decisions with Multiple Objectives," Cambridge Books, Cambridge University Press, number 9780521438834.
    18. Huseyin Cavusoglu & Young Kwark & Bin Mai & Srinivasan Raghunathan, 2013. "Passenger Profiling and Screening for Aviation Security in the Presence of Strategic Attackers," Decision Analysis, INFORMS, vol. 10(1), pages 63-81, March.
    19. George E. Monahan, 1982. "State of the Art---A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms," Management Science, INFORMS, vol. 28(1), pages 1-16, January.
    20. Edward J. Sondik, 1978. "The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs," Operations Research, INFORMS, vol. 26(2), pages 282-304, April.
    21. Kjell Hausken & Jun Zhuang, 2011. "Governments' and Terrorists' Defense and Attack in a T-Period Game," Decision Analysis, INFORMS, vol. 8(1), pages 46-70, March.
    22. K Hausken & J Zhuang, 2012. "The timing and deterrence of terrorist attacks due to exogenous dynamics," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 63(6), pages 726-735, June.
    23. James N. Eagle, 1984. "The Optimal Search for a Moving Target When the Search Path Is Constrained," Operations Research, INFORMS, vol. 32(5), pages 1107-1115, October.
    24. Hamid Mohtadi & Antu Panini Murshid, 2009. "Risk Analysis of Chemical, Biological, or Radionuclear Threats: Implications for Food Security," Risk Analysis, John Wiley & Sons, vol. 29(9), pages 1317-1335, September.
    25. James C. Bean, 1994. "Genetic Algorithms and Random Keys for Sequencing and Optimization," INFORMS Journal on Computing, INFORMS, vol. 6(2), pages 154-160, May.
    26. Zong-Zhi Lin & James C. Bean & Chelsea C. White, 2004. "A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes," INFORMS Journal on Computing, INFORMS, vol. 16(1), pages 27-38, February.
    27. Chelsea C. White & William T. Scherer, 1994. "Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes," Operations Research, INFORMS, vol. 42(3), pages 439-455, June.

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Satya S. Malladi & Alan L. Erera & Chelsea C. White, 2023. "Inventory control with modulated demand and a partially observed modulation process," Annals of Operations Research, Springer, vol. 321(1), pages 343-369, February.
    2. Denizalp Goktas & Jiayi Zhao & Amy Greenwald, 2022. "Zero-Sum Stochastic Stackelberg Games," Papers 2211.13847, arXiv.org.
    3. Julio B. Clempner, 2018. "Computing multiobjective Markov chains handled by the extraproximal method," Annals of Operations Research, Springer, vol. 271(2), pages 469-486, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yanling Chang & Alan Erera & Chelsea White, 2015. "Value of information for a leader–follower partially observed Markov game," Annals of Operations Research, Springer, vol. 235(1), pages 129-153, December.
    2. Mohammad E. Nikoofal & Mehmet Gümüs, 2015. "On the value of terrorist’s private information in a government’s defensive resource allocation problem," IISE Transactions, Taylor & Francis Journals, vol. 47(6), pages 533-555, June.
    3. Hao Zhang, 2010. "Partially Observable Markov Decision Processes: A Geometric Technique and Analysis," Operations Research, INFORMS, vol. 58(1), pages 214-228, February.
    4. Xiaojun (Gene) Shan & Jun Zhuang, 2014. "Modeling Credible Retaliation Threats in Deterring the Smuggling of Nuclear Weapons Using Partial Inspection---A Three-Stage Game," Decision Analysis, INFORMS, vol. 11(1), pages 43-62, March.
    5. Zong-Zhi Lin & James C. Bean & Chelsea C. White, 2004. "A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes," INFORMS Journal on Computing, INFORMS, vol. 16(1), pages 27-38, February.
    6. Abhijit Gosavi, 2009. "Reinforcement Learning: A Tutorial Survey and Recent Advances," INFORMS Journal on Computing, INFORMS, vol. 21(2), pages 178-192, May.
    7. Zhiheng Xu & Jun Zhuang, 2019. "A Study on a Sequential One‐Defender‐N‐Attacker Game," Risk Analysis, John Wiley & Sons, vol. 39(6), pages 1414-1432, June.
    8. Peiqiu Guan & Jun Zhuang, 2016. "Modeling Resources Allocation in Attacker‐Defender Games with “Warm Up” CSF," Risk Analysis, John Wiley & Sons, vol. 36(4), pages 776-791, April.
    9. Vineet M. Payyappalli & Jun Zhuang & Victor Richmond R. Jose, 2017. "Deterrence and Risk Preferences in Sequential Attacker–Defender Games with Continuous Efforts," Risk Analysis, John Wiley & Sons, vol. 37(11), pages 2229-2245, November.
    10. Xing Gao & Weijun Zhong & Shue Mei, 2013. "Information Security Investment When Hackers Disseminate Knowledge," Decision Analysis, INFORMS, vol. 10(4), pages 352-368, December.
    11. Shan, Xiaojun & Zhuang, Jun, 2018. "Modeling cumulative defensive resource allocation against a strategic attacker in a multi-period multi-target sequential game," Reliability Engineering and System Safety, Elsevier, vol. 179(C), pages 12-26.
    12. James T. Treharne & Charles R. Sox, 2002. "Adaptive Inventory Control for Nonstationary Demand and Partial Information," Management Science, INFORMS, vol. 48(5), pages 607-624, May.
    13. Szidarovszky, Ferenc & Luo, Yi, 2014. "Incorporating risk seeking attitude into defense strategy," Reliability Engineering and System Safety, Elsevier, vol. 123(C), pages 104-109.
    14. Chernonog, Tatyana & Avinadav, Tal & Ben-Zvi, Tal, 2016. "A two-state partially observable Markov decision process with three actions," European Journal of Operational Research, Elsevier, vol. 254(3), pages 957-967.
    15. Hunt, Kyle & Agarwal, Puneet & Zhuang, Jun, 2022. "On the adoption of new technology to enhance counterterrorism measures: An attacker–defender game with risk preferences," Reliability Engineering and System Safety, Elsevier, vol. 218(PB).
    16. Wei Wang & Francesco Di Maio & Enrico Zio, 2019. "Adversarial Risk Analysis to Allocate Optimal Defense Resources for Protecting Cyber–Physical Systems from Cyber Attacks," Risk Analysis, John Wiley & Sons, vol. 39(12), pages 2766-2785, December.
    17. Abdolmajid Yolmeh & Melike Baykal-Gürsoy, 2019. "Two-Stage Invest–Defend Game: Balancing Strategic and Operational Decisions," Decision Analysis, INFORMS, vol. 16(1), pages 46-66, March.
    18. Serin, Yasemin, 1995. "A nonlinear programming model for partially observable Markov decision processes: Finite horizon case," European Journal of Operational Research, Elsevier, vol. 86(3), pages 549-564, November.
    19. Simon, Jay & Omar, Ayman, 2020. "Cybersecurity investments in the supply chain: Coordination and a strategic attacker," European Journal of Operational Research, Elsevier, vol. 282(1), pages 161-171.
    20. Yossi Aviv & Amit Pazgal, 2005. "A Partially Observed Markov Decision Process for Dynamic Pricing," Management Science, INFORMS, vol. 51(9), pages 1400-1416, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:annopr:v:235:y:2015:i:1:p:103-128:10.1007/s10479-015-1935-0. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.