Printed from https://ideas.repec.org/a/plo/pcbi00/1013302.html

Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour

Author

Listed:
  • Olivia Macmillan-Scott
  • Mirco Musolesi

Abstract

The coevolution of signalling is a complex problem within animal behaviour, and is also central to communication between artificial agents. The Sir Philip Sidney game was designed to model this dyadic interaction from an evolutionary biology perspective, and was formulated to demonstrate the emergence of honest signalling. We use Multi-Agent Reinforcement Learning (MARL) to show that in the majority of cases, the behaviour adopted by agents is not that shown in the original derivation of the model. This paper demonstrates that MARL can be a powerful tool to study evolutionary dynamics and understand the underlying mechanisms of learning over generations; particularly advantageous are the interpretability of this type of approach and the fact that it allows us to study emergent behaviour without constraining the strategy space from the outset. Although the game originally set out to exemplify honest signalling, we show that it provides no incentive for such behaviour. In the majority of cases, the optimal outcome is one that does not require a signal for the resource to be given. This type of interaction is observed within animal behaviour, and is sometimes denoted proactive prosociality. High learning rates and low discount rates in the reinforcement learning model are shown to be optimal for achieving the outcome that maximises both agents' reward, and proximity to the given threshold leads to suboptimal learning.

Author summary: When is it too costly for animals to signal that they are in need? Signalling is a crucial part of communication in animal behaviour, and it is also central to other types of interactions, such as those involving artificial agents. We study emergent dynamics in the Sir Philip Sidney game, a game designed to show the mechanisms of honest signalling amongst animals. Using multi-agent reinforcement learning (MARL), we replicate generational learning and show that in the majority of scenarios, the optimal outcome is one of proactive prosociality rather than honest signalling: an outcome where a resource is given without the need for a costly signal. Such behaviour is observed in animals, most notably among primates. Our results also establish the usefulness of reinforcement learning as a tool to study emergent behaviour and dynamics in animal behaviour, for instance, as shown here, to study behavioural changes and learning over generations.
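The kind of setup the abstract describes can be sketched as two independent tabular Q-learners playing a simplified signalling game: a signaller (needy or healthy) chooses whether to pay a cost to signal, and a donor, observing only the signal, chooses whether to transfer a resource. The payoff values, relatedness weighting, and hyperparameters below are illustrative assumptions for the sketch, not the paper's actual model or parameters.

```python
import random

# Illustrative parameters (assumptions, not the paper's values):
# the signaller is "needy" with probability P_NEEDY; signalling costs COST;
# receiving the resource is worth B_NEEDY or B_HEALTHY; giving costs the
# donor D_GIVE; K is a relatedness weight coupling the two payoffs.
P_NEEDY, COST, B_NEEDY, B_HEALTHY, D_GIVE, K = 0.5, 0.2, 1.0, 0.2, 0.5, 0.5
ALPHA, EPS = 0.5, 0.1  # high learning rate (cf. abstract); epsilon-greedy exploration

# Tabular Q-values: signaller keyed by (own state, action),
# donor keyed by (observed signal, action).
q_sig = {(s, a): 0.0 for s in ("needy", "healthy") for a in ("signal", "quiet")}
q_don = {(o, a): 0.0 for o in ("signal", "quiet") for a in ("give", "keep")}

def choose(q, keys):
    """Epsilon-greedy choice among the (context, action) keys given."""
    if random.random() < EPS:
        return random.choice(keys)
    return max(keys, key=lambda k: q[k])

def payoffs(state, a_sig, a_don):
    """Inclusive-fitness payoffs for one round (simplified sketch)."""
    benefit = B_NEEDY if state == "needy" else B_HEALTHY
    sig = (benefit if a_don == "give" else 0.0) - (COST if a_sig == "signal" else 0.0)
    don = -(D_GIVE if a_don == "give" else 0.0)
    # Hamilton-style weighting: each agent also values the other's payoff by K.
    return sig + K * don, don + K * sig

random.seed(0)
for _ in range(20_000):
    state = "needy" if random.random() < P_NEEDY else "healthy"
    a_sig = choose(q_sig, [(state, "signal"), (state, "quiet")])[1]
    a_don = choose(q_don, [(a_sig, "give"), (a_sig, "keep")])[1]
    r_sig, r_don = payoffs(state, a_sig, a_don)
    # One-shot interaction: Q-update with no successor state, so no
    # discount term appears inside the update itself.
    q_sig[(state, a_sig)] += ALPHA * (r_sig - q_sig[(state, a_sig)])
    q_don[(a_sig, a_don)] += ALPHA * (r_don - q_don[(a_sig, a_don)])

# Greedy donor policy learned for each observed signal
policy = {o: max(("give", "keep"), key=lambda a: q_don[(o, a)]) for o in ("signal", "quiet")}
print(policy)
```

Under a payoff configuration like this, one can check whether the learned donor policy conditions on the signal (honest signalling) or gives regardless of it (the proactive-prosociality outcome the paper reports in most scenarios); which one emerges depends on the chosen payoffs.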

Suggested Citation

  • Olivia Macmillan-Scott & Mirco Musolesi, 2025. "Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour," PLOS Computational Biology, Public Library of Science, vol. 21(8), pages 1-18, August.
  • Handle: RePEc:plo:pcbi00:1013302
    DOI: 10.1371/journal.pcbi.1013302

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013302
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013302&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1013302?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pcbi00:1013302. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

We have no bibliographic references for this item. You can help add them by using this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: ploscompbiol (email available below). General contact details of provider: https://journals.plos.org/ploscompbiol/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.