
Policy search with rare significant events: Choosing the right partner to cooperate with

Author

  • Paul Ecoffet
  • Nicolas Fontbonne
  • Jean-Baptiste André
  • Nicolas Bredeche

Abstract

This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms have to play as a reinforcement learning method.
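
The claim that gradient information becomes scarce when significant events are rare can be illustrated with a toy calculation. The sketch below is not the authors' environment or method: the single-step task, the Gaussian policy, and every name and parameter in it (`episode`, `informative_fraction`, `p_coop`, `target`, `tol`, `sigma`) are assumptions made for illustration only. It counts how many episodes yield a nonzero REINFORCE-style gradient term as the share of cooperative partners shrinks, and it does not model the evolution strategy side of the comparison.

```python
import random


def episode(theta, p_coop, rng, sigma=0.5, target=1.0, tol=0.5):
    """One single-step episode of a hypothetical partner-choice task."""
    # Most partners are not interested, whatever the agent offers.
    cooperative = rng.random() < p_coop
    # Gaussian policy centred on the single parameter theta.
    action = rng.gauss(theta, sigma)
    # At most one positive reward per episode, only with a willing partner.
    reward = 1.0 if cooperative and abs(action - target) < tol else 0.0
    # Score-function (REINFORCE) term for a Gaussian mean:
    # reward * (action - theta) / sigma^2. It is zero whenever reward is zero.
    grad = reward * (action - theta) / sigma ** 2
    return cooperative, reward, grad


def informative_fraction(theta, p_coop, n_episodes=10_000, seed=0):
    """Return (share of cooperative partners, share of nonzero gradient terms)."""
    rng = random.Random(seed)
    results = [episode(theta, p_coop, rng) for _ in range(n_episodes)]
    n_coop = sum(1 for coop, _, _ in results if coop)
    n_signal = sum(1 for _, _, grad in results if grad != 0.0)
    return n_coop / n_episodes, n_signal / n_episodes


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.005):
        coop, signal = informative_fraction(theta=0.0, p_coop=p)
        print(f"p_coop={p}: cooperative={coop:.4f}, nonzero-gradient={signal:.4f}")
```

In this toy setting an episode contributes to the gradient estimate only if the partner is cooperative and the sampled action happens to be acceptable, so the share of informative episodes can never exceed the share of cooperative partners and shrinks along with it.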

Suggested Citation

  • Paul Ecoffet & Nicolas Fontbonne & Jean-Baptiste André & Nicolas Bredeche, 2022. "Policy search with rare significant events: Choosing the right partner to cooperate with," PLOS ONE, Public Library of Science, vol. 17(4), pages 1-18, April.
  • Handle: RePEc:plo:pone00:0266841
    DOI: 10.1371/journal.pone.0266841

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0266841
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0266841&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0266841?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. John M. McNamara & Zoltan Barta & Lutz Fromhage & Alasdair I. Houston, 2008. "The coevolution of choosiness and cooperation," Nature, Nature, vol. 451(7175), pages 189-192, January.
    2. Jorgen W. Weibull, 1997. "Evolutionary Game Theory," MIT Press Books, The MIT Press, edition 1, volume 1, number 0262731215, December.
    3. repec:fth:iniesr:487 is not listed on IDEAS
    4. Drew Fudenberg & David K. Levine, 1998. "The Theory of Learning in Games," MIT Press Books, The MIT Press, edition 1, volume 1, number 0262061945, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tom Johnston & Michael Savery & Alex Scott & Bassel Tarbush, 2023. "Game Connectivity and Adaptive Dynamics," Papers 2309.10609, arXiv.org, revised Jun 2026.
    2. Ingela Alger & Laurent Lehmann, 2023. "Evolution of Semi-Kantian Preferences in Two-Player Assortative Interactions with Complete and Incomplete Information and Plasticity," Dynamic Games and Applications, Springer, vol. 13(4), pages 1288-1319, December.
    3. Waters, George A., 2009. "Chaos in the cobweb model with a new learning dynamic," Journal of Economic Dynamics and Control, Elsevier, vol. 33(6), pages 1201-1216, June.
    4. Andreozzi, Luciano, 2013. "Learning to be fair," Journal of Economic Behavior & Organization, Elsevier, vol. 90(C), pages 181-195.
    5. Sandholm,W.H., 2003. "Excess payoff dynamics, potential dynamics, and stable games," Working papers 5, Wisconsin Madison - Social Systems.
    6. Berger, Ulrich, 2016. "Learning to trust, learning to be trustworthy," Department of Economics Working Paper Series 212, WU Vienna University of Economics and Business.
    7. Michel Benaïm & Jörgen W. Weibull, 2003. "Deterministic Approximation of Stochastic Evolution in Games," Econometrica, Econometric Society, vol. 71(3), pages 873-903, May.
    8. Nobuyuki Hanaki, 2007. "Individual and Social Learning," Computational Economics, Springer;Society for Computational Economics, vol. 29(3), pages 421-421, May.
    9. Sandholm, William H., 2007. "Evolution in Bayesian games II: Stability of purified equilibria," Journal of Economic Theory, Elsevier, vol. 136(1), pages 641-667, September.
    10. Xie, Fang & Horan, Richard D., 2008. "Disease and Behavioral Dynamics for Brucellosis in Elk and Cattle in the Greater Yellowstone Area," 2008 Annual Meeting, July 27-29, 2008, Orlando, Florida 6404, American Agricultural Economics Association (New Name 2008: Agricultural and Applied Economics Association).
    11. Erlei, Mathias, 2008. "Heterogeneous social preferences," Journal of Economic Behavior & Organization, Elsevier, vol. 65(3-4), pages 436-457, March.
    12. Alexander Aurell & Gustav Karreskog, 2020. "Stochastic Stability of a Recency Weighted Sampling Dynamic," Papers 2009.12910, arXiv.org, revised Jun 2021.
    13. Matsui, Akihiko & Oyama, Daisuke, 2006. "Rationalizable foresight dynamics," Games and Economic Behavior, Elsevier, vol. 56(2), pages 299-322, August.
    14. V. Bhaskar & Fernando Vega-Redondo, 1998. "Asynchronous Choice and Markov Equilibria:Theoretical Foundations and Applications," Game Theory and Information 9809003, University Library of Munich, Germany.
    15. Kyle Hyndman & Antoine Terracol & Jonathan Vaksmann, 2009. "Learning and sophistication in coordination games," Experimental Economics, Springer;Economic Science Association, vol. 12(4), pages 450-472, December.
    16. Francesco Squintani, 1999. "Moral Hazard," Discussion Papers 1269, Northwestern University, Center for Mathematical Studies in Economics and Management Science.
    17. Griffin, Christopher & Mummah, Riley & deForest, Russ, 2021. "A finite population destroys a traveling wave in spatial replicator dynamics," Chaos, Solitons & Fractals, Elsevier, vol. 146(C).
    18. John Conley & Myrna H. Wooders & Ali Toossi, 2001. "Evolution & Voting: How Nature Makes us Public Spirited," Economics Bulletin, AccessEcon, vol. 28(24), pages 1.
    19. repec:wvu:wpaper:10-18 is not listed on IDEAS
    20. Sergiu Hart & Andreu Mas-Colell, 2002. "Uncoupled dynamics cannot lead to Nash equilibrium," Discussion Paper Series dp299, The Federmann Center for the Study of Rationality, the Hebrew University, Jerusalem.
    21. Jean Paul Rabanal & Daniel Friedman, 2015. "How Moral Codes Evolve in a Trust Game," Games, MDPI, vol. 6(2), pages 1-11, June.

