Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand

Author

Kshitija Taywade
Brent Harrison
Judy Goldsmith

Abstract

Many past attempts at modeling repeated Cournot games assume that demand is stationary. This does not align with real-world scenarios in which market demand can evolve over a product's lifetime for a myriad of reasons. In this paper, we model repeated Cournot games with non-stationary demand such that firms/agents face separate instances of a non-stationary multi-armed bandit problem. The set of arms/actions that an agent can choose from represents discrete production quantities; here, the action space is ordered. Agents are independent and autonomous, and cannot observe anything from the environment; they can only see their own rewards after taking an action, and only work towards maximizing these rewards. We propose a novel algorithm, 'Adaptive with Weighted Exploration (AWE) $\epsilon$-greedy', which is loosely based on the well-known $\epsilon$-greedy approach. This algorithm detects and quantifies changes in rewards due to varying market demand and adjusts the learning rate and exploration rate in proportion to the degree of change in demand, thus enabling agents to better identify new optimal actions. For efficient exploration, it also deploys a mechanism for weighing actions that takes advantage of the ordered action space. We use simulations to study the emergence of various equilibria in the market. In addition, we study the scalability of our approach in terms of the total number of agents in the system and the size of the action space. We consider both symmetric and asymmetric firms in our models. We found that, using our proposed method, agents are able to swiftly change their course of action according to changes in demand, and they also engage in collusive behavior in many simulations.
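
The setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' AWE $\epsilon$-greedy algorithm, whose details are not given in this record. The linear inverse demand, the unit cost, the "surprise" change detector, the distance-based exploration weights, and all names (`cournot_profit`, `AdaptiveEpsGreedyAgent`, `simulate`) are assumptions made for illustration only.

```python
import random


def cournot_profit(q_i, total_q, intercept, slope=1.0, unit_cost=1.0):
    """Profit under an ASSUMED linear inverse demand P = max(0, intercept - slope * Q)."""
    price = max(0.0, intercept - slope * total_q)
    return (price - unit_cost) * q_i


class AdaptiveEpsGreedyAgent:
    """Illustrative agent; every update rule here is an assumption, not the paper's AWE."""

    def __init__(self, n_actions, base_eps=0.05, base_lr=0.1, rng=None):
        self.n = n_actions
        self.q = [0.0] * n_actions  # running reward estimate per arm (quantity)
        self.base_eps = base_eps
        self.base_lr = base_lr
        self.eps = base_eps
        self.lr = base_lr
        self.rng = rng if rng is not None else random.Random()

    def greedy(self):
        return max(range(self.n), key=lambda a: self.q[a])

    def act(self):
        best = self.greedy()
        if self.rng.random() >= self.eps:
            return best
        # Weighted exploration (assumed form): because the arms are ordered,
        # arms near the current greedy arm get a higher sampling weight.
        weights = [1.0 / (1 + abs(a - best)) for a in range(self.n)]
        return self.rng.choices(range(self.n), weights=weights, k=1)[0]

    def update(self, action, reward):
        # Crude change detector (assumed): relative gap between the observed
        # reward and the current estimate, capped at 1.
        surprise = abs(reward - self.q[action]) / (abs(self.q[action]) + 1.0)
        shift = min(1.0, surprise)
        # Exploration and learning rates grow with the detected shift.
        self.eps = min(1.0, self.base_eps + 0.5 * shift)
        self.lr = min(1.0, self.base_lr + 0.5 * shift)
        self.q[action] += self.lr * (reward - self.q[action])


def simulate(n_agents=2, n_actions=11, steps=2000, seed=0):
    """Repeated game; each agent observes only its own reward."""
    rng = random.Random(seed)
    agents = [AdaptiveEpsGreedyAgent(n_actions, rng=rng) for _ in range(n_agents)]
    history = []
    for t in range(steps):
        # Assumed non-stationarity: a single upward demand shift at the midpoint.
        intercept = 20.0 if t < steps // 2 else 30.0
        actions = [ag.act() for ag in agents]  # action index = quantity produced
        total = float(sum(actions))
        for ag, a in zip(agents, actions):
            ag.update(a, cournot_profit(a, total, intercept))
        history.append(actions)
    return history


if __name__ == "__main__":
    hist = simulate()
    print("last joint action:", hist[-1])
```

In this sketch each agent treats the other firms' output as part of an unobserved environment, so a demand shift and a rival's change of quantity both show up only as a change in its own reward.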

Suggested Citation

Kshitija Taywade & Brent Harrison & Judy Goldsmith, 2022. "Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand," Papers 2201.00486, arXiv.org.

Handle: RePEc:arx:papers:2201.00486

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Kshitija Taywade & Brent Harrison & Adib Bagh, 2022. "Modelling Cournot Games as Multi-agent Multi-armed Bandits," Papers 2201.01182, arXiv.org.
Alós-Ferrer, Carlos & Buckenmaier, Johannes, 2017. "Cournot vs. Walras: A reappraisal through simulations," Journal of Economic Dynamics and Control, Elsevier, vol. 82(C), pages 257-272.
Junyi Xu, 2021. "Reinforcement Learning in a Cournot Oligopoly Model," Computational Economics, Springer;Society for Computational Economics, vol. 58(4), pages 1001-1024, December.
Anufriev, Mikhail & Kopányi, Dávid, 2018. "Oligopoly game: Price makers meet price takers," Journal of Economic Dynamics and Control, Elsevier, vol. 91(C), pages 84-103.
Waltman, Ludo & Kaymak, Uzay, 2008. "Q-learning agents in a Cournot oligopoly model," Journal of Economic Dynamics and Control, Elsevier, vol. 32(10), pages 3275-3293, October.
Thomas Riechmann, 2006. "Mixed motives in a Cournot game," Economics Bulletin, AccessEcon, vol. 4(29), pages 1-8.
Andreas Nicklisch, 2011. "Learning strategic environments: an experimental study of strategy formation and transfer," Theory and Decision, Springer, vol. 71(4), pages 539-558, October.
Arthur Charpentier & Romuald Élie & Carl Remlinger, 2023. "Reinforcement Learning in Economics and Finance," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 425-462, June.
Arifovic, Jasmina & Karaivanov, Alexander, 2010. "Learning by doing vs. learning from others in a principal-agent model," Journal of Economic Dynamics and Control, Elsevier, vol. 34(10), pages 1967-1992, October.
- Jasmina Arifovic & Alexander Karaivanov, 2007. "Learning by Doing vs. Learning from Others in a Principal-Agent Model," Discussion Papers dp07-24, Department of Economics, Simon Fraser University.
Gian Italo Bischi & Fabio Lamantia & Davide Radi, 2018. "Evolutionary oligopoly games with heterogeneous adaptive players," Chapters, in: Luis C. Corchón & Marco A. Marini (ed.), Handbook of Game Theory and Industrial Organization, Volume I, chapter 12, pages 343-370, Edward Elgar Publishing.
Floortje Alkemade & Han Poutré & Hans Amman, 2006. "Robust Evolutionary Algorithm Design for Socio-economic Simulation," Computational Economics, Springer;Society for Computational Economics, vol. 28(4), pages 355-370, November.
Vallée, Thomas & Yildizoglu, Murat, 2009. "Convergence in the finite Cournot oligopoly with social and individual learning," Journal of Economic Behavior & Organization, Elsevier, vol. 72(2), pages 670-690, November.
- Thomas Vallée & Murat Yildizoglu, 2007. "Convergence in Finite Cournot Oligopoly with Social and Individual Learning," Post-Print hal-00293948, HAL.
- Thomas Vallée & Murat Yildizoglu, 2009. "Convergence in the Finite Cournot Oligopoly with Social and Individual Learning," Working Papers halshs-00368274, HAL.
- Thomas Vallée & Murat Yildizoğlu, 2009. "Convergence in the finite Cournot oligopoly with social and individual learning," Post-Print hal-00722790, HAL.
- Thomas VALLEE & Murat YILDIZOGLU, 2007. "Convergence in Finite Cournot Oligopoly with Social and Individual Learning," Cahiers du GREThA (2007-2019) 2007-07, Groupe de Recherche en Economie Théorique et Appliquée (GREThA).
- Murat Yildizoglu & Thomas Vallée, 2007. "Convergence in Finite Cournot Oligopoly with Social and Individual Learning," Post-Print hal-00394413, HAL.
- Thomas Vallée & Murat Yildizoglu, 2007. "Convergence in Finite Cournot Oligopoly with Social and Individual Learning," Post-Print hal-00293929, HAL.
Peter Duersch & Albert Kolb & Jörg Oechssler & Burkhard Schipper, 2010. "Rage against the machines: how subjects play against learning algorithms," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 43(3), pages 407-430, June.
Bergin, James & Bernhardt, Dan, 2009. "Cooperation through imitation," Games and Economic Behavior, Elsevier, vol. 67(2), pages 376-388, November.
- James Bergin & Dan Bernhardt, 2006. "Cooperation Through Imitation," Working Paper 1042, Economics Department, Queen's University.
Ludo Waltman & Nees Eck & Rommert Dekker & Uzay Kaymak, 2011. "Economic modeling using evolutionary algorithms: the effect of a binary encoding of strategies," Journal of Evolutionary Economics, Springer, vol. 21(5), pages 737-756, December.
- Waltman, L. & van Eck, N.J.P. & Dekker, R. & Kaymak, U., 2009. "Economic Modeling Using Evolutionary Algorithms: The Effect of a Binary Encoding of Strategies," ERIM Report Series Research in Management ERS-2009-028-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
Arthur Charpentier & Romuald Elie & Carl Remlinger, 2020. "Reinforcement Learning in Economics and Finance," Papers 2003.10014, arXiv.org.
Alós-Ferrer, Carlos & Ritschel, Alexander, 2021. "Multiple behavioral rules in Cournot oligopolies," Journal of Economic Behavior & Organization, Elsevier, vol. 183(C), pages 250-267.
- Carlos Alós-Ferrer & Alexander Ritschel, 2019. "Multiple behavioral rules in Cournot oligopolies," ECON - Working Papers 331, Department of Economics - University of Zurich, revised Jul 2020.
Waltman, L. & van Eck, N.J.P., 2009. "A Mathematical Analysis of the Long-run Behavior of Genetic Algorithms for Social Modeling," ERIM Report Series Research in Management ERS-2009-011-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
Gian Italo Bischi & Fabio Lamantia, 2022. "Evolutionary oligopoly games with cooperative and aggressive behaviors," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 17(1), pages 3-27, January.
Davide Radi, 2017. "Walrasian versus Cournot behavior in an oligopoly of boundedly rational firms," Journal of Evolutionary Economics, Springer, vol. 27(5), pages 933-961, November.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2022-02-07 (Computational Economics)
NEP-GTH-2022-02-07 (Game Theory)
