A holistic matrix norm-based alternative solution method for Markov reward games

Author

İzgi, Burhaneddin
Özkaya, Murat
Kemal Üre, Nazım
Perc, Matjaž

Abstract

In this study, we examine single-agent stochastic games, in particular Markov reward games represented as decision trees, and propose an alternative solution method for them based on matrix norms. Existing methods such as value iteration, policy iteration, and dynamic programming are state-and-action-based. In contrast, the proposed matrix norm-based method treats each stage and its actions as a whole and solves the game holistically, stage by stage, without computing the effect of each action on each state's reward individually. The method transforms the decision tree into a payoff matrix for each stage and then uses the matrix norm of that payoff matrix. The concept of the moving matrix is also integrated into the method so that the impacts of all actions on a stage are incorporated simultaneously, which is what makes the method holistic. We present an explanatory algorithm for implementing the method and a comprehensive solution diagram that illustrates it. Owing to the simplicity of working with matrix norms, the method offers a new perspective on solving these games that complements the existing methods. To clarify the method, we demonstrate an illustrative application to a benchmark Markov reward game with two stages and two actions, and a comprehensive implementation on a game with three stages and three actions.
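For orientation, the following is a minimal sketch of the kind of state-and-action-based computation the abstract contrasts against: finite-horizon backward induction on a small Markov reward game with two stages and two actions. It is not the authors' matrix norm-based method, whose payoff-matrix transformation and moving-matrix step are not specified in this abstract. The transition probabilities `P`, the rewards `R`, and the helper names `backward_induction`, `norm_inf`, and `norm_1` are all assumptions made for illustration. The two norm helpers only show what a matrix norm of a stage payoff matrix looks like numerically.

```python
# Illustrative only: numbers and names below are assumptions, not taken from the paper.

def backward_induction(P, R, horizon):
    """State-and-action-based solution: evaluate every (state, action) pair per stage.

    P[a][s][s2] is the probability of moving from state s to s2 under action a.
    R[a][s] is the expected immediate reward of action a in state s.
    Returns the optimal values at the first stage and the per-stage policy.
    """
    n_actions = len(R)
    n_states = len(R[0])
    values = [0.0] * n_states          # terminal values after the last stage
    policy = []                        # policy[t][s] = best action at stage t in state s
    for _ in range(horizon):
        # q[s][a]: reward now plus expected value of the remaining stages
        q = [
            [
                R[a][s] + sum(P[a][s][s2] * values[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        stage_policy = [
            max(range(n_actions), key=lambda a, s=s: q[s][a])
            for s in range(n_states)
        ]
        values = [q[s][stage_policy[s]] for s in range(n_states)]
        policy.insert(0, stage_policy)  # stages are processed last to first
    return values, policy


def norm_inf(M):
    """Infinity norm: maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)


def norm_1(M):
    """1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))


# Hypothetical game: 2 states, 2 actions, 2 stages.
P = [
    [[0.5, 0.5], [0.4, 0.6]],  # action 0
    [[0.8, 0.2], [0.7, 0.3]],  # action 1
]
R = [
    [6.0, -3.0],  # action 0: expected reward in state 0, state 1
    [4.0, -5.0],  # action 1
]

values, policy = backward_induction(P, R, horizon=2)

# Last-stage payoffs arranged as a matrix (rows: states, columns: actions).
last_stage_payoff = [[R[a][s] for a in range(len(R))] for s in range(len(R[0]))]

print("first-stage values:", values)
print("policy per stage:", policy)
print("infinity norm:", norm_inf(last_stage_payoff))
print("1-norm:", norm_1(last_stage_payoff))
```

Note that the loop visits every state and every action at each stage, which is the per-action bookkeeping the abstract says the matrix norm-based method avoids.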

Suggested Citation

İzgi, Burhaneddin & Özkaya, Murat & Kemal Üre, Nazım & Perc, Matjaž, 2025. "A holistic matrix norm-based alternative solution method for Markov reward games," Applied Mathematics and Computation, Elsevier, vol. 488(C).

Handle: RePEc:eee:apmaco:v:488:y:2025:i:c:s009630032400585x
DOI: 10.1016/j.amc.2024.129124

Citations

Cited by:

Li, Dandan & Wu, Qiongzi & Han, Dun, 2025. "On evolution of agent behavior under limited gaming time with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 194(C).
