
Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand

Author

Listed:
  • Kshitija Taywade
  • Brent Harrison
  • Judy Goldsmith

Abstract

Many past attempts at modeling repeated Cournot games assume that demand is stationary. This does not align with real-world scenarios, in which market demand can evolve over a product's lifetime for a myriad of reasons. In this paper, we model repeated Cournot games with non-stationary demand such that firms/agents face separate instances of a non-stationary multi-armed bandit problem. The set of arms/actions that an agent can choose from represents discrete production quantities; here, the action space is ordered. Agents are independent and autonomous, and cannot observe anything from the environment other than their own rewards after taking an action; they work only towards maximizing these rewards. We propose a novel algorithm, 'Adaptive with Weighted Exploration (AWE) $\epsilon$-greedy', which is loosely based on the well-known $\epsilon$-greedy approach. This algorithm detects and quantifies changes in rewards due to varying market demand and varies the learning rate and exploration rate in proportion to the degree of change in demand, thus enabling agents to better identify new optimal actions. For efficient exploration, it also deploys a mechanism for weighting actions that takes advantage of the ordered action space. We use simulations to study the emergence of various equilibria in the market. In addition, we study the scalability of our approach in terms of the total number of agents in the system and the size of the action space. We consider both symmetric and asymmetric firms in our models. We find that, using our proposed method, agents are able to swiftly change their course of action according to changes in demand, and that they also engage in collusive behavior in many simulations.
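
The abstract describes the mechanism only in outline, so the following Python sketch is an illustration of the general idea rather than the authors' AWE $\epsilon$-greedy algorithm. Everything specific in it is an assumption made for the example: the linear inverse demand and unit cost in `cournot_profit`, the "surprise" measure used to detect reward changes, the smoothing constants, the distance-based exploration weights, and the single demand shift in `simulate`. It shows independent agents that see only their own profit, raise their learning and exploration rates when rewards change, and explore preferentially near the currently best quantity in an ordered action space.

```python
import random


def cournot_profit(q_i, total_q, intercept, slope=1.0, unit_cost=1.0):
    """Profit of one firm under an assumed linear inverse demand
    P = max(0, intercept - slope * Q) and constant unit cost."""
    price = max(0.0, intercept - slope * total_q)
    return (price - unit_cost) * q_i


class AdaptiveEpsGreedy:
    """Illustrative adaptive epsilon-greedy learner (hypothetical design,
    not the paper's AWE algorithm). Arm index a means 'produce a units'."""

    def __init__(self, n_arms, base_eps=0.05, base_lr=0.1, rng=None):
        self.n_arms = n_arms
        self.base_eps = base_eps
        self.base_lr = base_lr
        self.values = [0.0] * n_arms  # running reward estimate per quantity
        self.change = 0.0             # smoothed relative surprise, in [0, 1]
        self.rng = rng or random.Random()

    def best_arm(self):
        return max(range(self.n_arms), key=lambda a: self.values[a])

    def exploration_weights(self):
        # Ordered action space: quantities near the current best get more weight.
        best = self.best_arm()
        return [1.0 / (1 + abs(a - best)) for a in range(self.n_arms)]

    def select(self):
        # Exploration rate grows with the detected degree of change.
        eps = min(1.0, self.base_eps + self.change)
        if self.rng.random() < eps:
            arms = range(self.n_arms)
            return self.rng.choices(arms, weights=self.exploration_weights())[0]
        return self.best_arm()

    def update(self, arm, reward):
        error = reward - self.values[arm]
        # Relative surprise; bounded by 1 because |r - v| <= |r| + |v|.
        surprise = abs(error) / (abs(self.values[arm]) + abs(reward) + 1e-9)
        self.change = 0.9 * self.change + 0.1 * surprise
        # Learning rate also grows with the detected degree of change.
        lr = min(1.0, self.base_lr + self.change)
        self.values[arm] += lr * error


def simulate(n_firms=2, n_arms=10, steps=2000, shift_at=1000, seed=0):
    """Repeated Cournot game in which the demand intercept shifts once."""
    master = random.Random(seed)
    agents = [
        AdaptiveEpsGreedy(n_arms, rng=random.Random(master.randrange(10**9)))
        for _ in range(n_firms)
    ]
    history = []
    for t in range(steps):
        intercept = 20.0 if t < shift_at else 30.0  # non-stationary demand
        quantities = [agent.select() for agent in agents]
        total = sum(quantities)
        # Each agent observes only its own profit, never rivals' actions.
        for agent, q in zip(agents, quantities):
            agent.update(q, cournot_profit(q, total, intercept))
        history.append(quantities)
    return history
```

Calling `simulate()` returns the per-round quantities chosen by each firm, which can then be compared before and after the assumed demand shift at round `shift_at`. Tying both rates to one smoothed change signal is a simplification chosen for brevity; the paper may quantify change and weight actions differently.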

Suggested Citation

  • Kshitija Taywade & Brent Harrison & Judy Goldsmith, 2022. "Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand," Papers 2201.00486, arXiv.org.
  • Handle: RePEc:arx:papers:2201.00486

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2201.00486
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kshitija Taywade & Brent Harrison & Adib Bagh, 2022. "Modelling Cournot Games as Multi-agent Multi-armed Bandits," Papers 2201.01182, arXiv.org.
    2. Alós-Ferrer, Carlos & Buckenmaier, Johannes, 2017. "Cournot vs. Walras: A reappraisal through simulations," Journal of Economic Dynamics and Control, Elsevier, vol. 82(C), pages 257-272.
    3. Junyi Xu, 2021. "Reinforcement Learning in a Cournot Oligopoly Model," Computational Economics, Springer;Society for Computational Economics, vol. 58(4), pages 1001-1024, December.
    4. Anufriev, Mikhail & Kopányi, Dávid, 2018. "Oligopoly game: Price makers meet price takers," Journal of Economic Dynamics and Control, Elsevier, vol. 91(C), pages 84-103.
    5. repec:ebl:ecbull:v:4:y:2006:i:29:p:1-8 is not listed on IDEAS
    6. Waltman, Ludo & Kaymak, Uzay, 2008. "Q-learning agents in a Cournot oligopoly model," Journal of Economic Dynamics and Control, Elsevier, vol. 32(10), pages 3275-3293, October.
    7. Thomas Riechmann, 2006. "Mixed motives in a Cournot game," Economics Bulletin, AccessEcon, vol. 4(29), pages 1-8.
    8. Andreas Nicklisch, 2011. "Learning strategic environments: an experimental study of strategy formation and transfer," Theory and Decision, Springer, vol. 71(4), pages 539-558, October.
    9. Arthur Charpentier & Romuald Élie & Carl Remlinger, 2023. "Reinforcement Learning in Economics and Finance," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 425-462, June.
    10. Arifovic, Jasmina & Karaivanov, Alexander, 2010. "Learning by doing vs. learning from others in a principal-agent model," Journal of Economic Dynamics and Control, Elsevier, vol. 34(10), pages 1967-1992, October.
    11. Gian Italo Bischi & Fabio Lamantia & Davide Radi, 2018. "Evolutionary oligopoly games with heterogeneous adaptive players," Chapters, in: Luis C. Corchón & Marco A. Marini (ed.), Handbook of Game Theory and Industrial Organization, Volume I, chapter 12, pages 343-370, Edward Elgar Publishing.
    12. Floortje Alkemade & Han Poutré & Hans Amman, 2006. "Robust Evolutionary Algorithm Design for Socio-economic Simulation," Computational Economics, Springer;Society for Computational Economics, vol. 28(4), pages 355-370, November.
    13. Vallée, Thomas & YIldIzoglu, Murat, 2009. "Convergence in the finite Cournot oligopoly with social and individual learning," Journal of Economic Behavior & Organization, Elsevier, vol. 72(2), pages 670-690, November.
    14. Peter Duersch & Albert Kolb & Jörg Oechssler & Burkhard Schipper, 2010. "Rage against the machines: how subjects play against learning algorithms," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 43(3), pages 407-430, June.
    15. Bergin, James & Bernhardt, Dan, 2009. "Cooperation through imitation," Games and Economic Behavior, Elsevier, vol. 67(2), pages 376-388, November.
    16. Ludo Waltman & Nees Eck & Rommert Dekker & Uzay Kaymak, 2011. "Economic modeling using evolutionary algorithms: the effect of a binary encoding of strategies," Journal of Evolutionary Economics, Springer, vol. 21(5), pages 737-756, December.
    17. Arthur Charpentier & Romuald Elie & Carl Remlinger, 2020. "Reinforcement Learning in Economics and Finance," Papers 2003.10014, arXiv.org.
    18. Alós-Ferrer, Carlos & Ritschel, Alexander, 2021. "Multiple behavioral rules in Cournot oligopolies," Journal of Economic Behavior & Organization, Elsevier, vol. 183(C), pages 250-267.
    19. Waltman, L. & van Eck, N.J.P., 2009. "A Mathematical Analysis of the Long-run Behavior of Genetic Algorithms for Social Modeling," ERIM Report Series Research in Management ERS-2009-011-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    20. Gian Italo Bischi & Fabio Lamantia, 2022. "Evolutionary oligopoly games with cooperative and aggressive behaviors," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 17(1), pages 3-27, January.
    21. Davide Radi, 2017. "Walrasian versus Cournot behavior in an oligopoly of boundedly rational firms," Journal of Evolutionary Economics, Springer, vol. 27(5), pages 933-961, November.
