Printed from https://ideas.repec.org/a/plo/pcbi00/1013302.html

Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour

Author

Listed:
  • Olivia Macmillan-Scott
  • Mirco Musolesi

Abstract

The coevolution of signalling is a complex problem within animal behaviour, and is also central to communication between artificial agents. The Sir Philip Sidney game was designed to model this dyadic interaction from an evolutionary biology perspective, and was formulated to demonstrate the emergence of honest signalling. We use Multi-Agent Reinforcement Learning (MARL) to show that in the majority of cases, the behaviour adopted by agents is not that shown in the original derivation of the model. This paper demonstrates that MARL can be a powerful tool to study evolutionary dynamics and understand the underlying mechanisms of learning over generations; particularly advantageous are the interpretability of this type of approach and the fact that it allows us to study emergent behaviour without constraining the strategy space from the outset. Although the game originally set out to exemplify honest signalling, we show that it provides no incentive for such behaviour. In the majority of cases, the optimal outcome is one that does not require a signal for the resource to be given. This type of interaction is observed within animal behaviour, and is sometimes denoted proactive prosociality. High learning rates and low discount rates in the reinforcement learning model are shown to be optimal for achieving the outcome that maximises both agents' reward, and proximity to the given threshold leads to suboptimal learning.

Author summary: When is it too costly for animals to signal that they are in need? Signalling is a crucial part of communication in animal behaviour, and it is also central to other types of interactions, such as those involving artificial agents. We study emergent dynamics in the Sir Philip Sidney game, a game designed to show the mechanisms of honest signalling amongst animals. Using multi-agent reinforcement learning (MARL), we replicate generational learning and show that in the majority of scenarios, the optimal outcome is one of proactive prosociality rather than honest signalling: an outcome where a resource is given without the need for a costly signal. Such behaviour is observed in animals, most notably among primates. Our results also establish the usefulness of reinforcement learning as a tool to study emergent behaviour and dynamics in animal behaviour, for instance, as shown here, to study behavioural changes and learning over generations.
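The kind of setup the abstract describes can be sketched as two independent tabular Q-learners playing a simplified signalling game: a signaller (needy or healthy) chooses whether to pay a cost to signal, and a donor, observing only the signal, chooses whether to transfer a resource. The payoff values, relatedness weighting, and hyperparameters below are illustrative assumptions for the sketch, not the paper's actual model or parameters.

```python
import random

# Illustrative parameters (assumptions, not the paper's values):
# the signaller is "needy" with probability P_NEEDY; signalling costs COST;
# receiving the resource is worth B_NEEDY or B_HEALTHY; giving costs the
# donor D_GIVE; K is a relatedness weight coupling the two payoffs.
P_NEEDY, COST, B_NEEDY, B_HEALTHY, D_GIVE, K = 0.5, 0.2, 1.0, 0.2, 0.5, 0.5
ALPHA, EPS = 0.5, 0.1  # high learning rate (cf. abstract); epsilon-greedy exploration

# Tabular Q-values: signaller keyed by (own state, action),
# donor keyed by (observed signal, action).
q_sig = {(s, a): 0.0 for s in ("needy", "healthy") for a in ("signal", "quiet")}
q_don = {(o, a): 0.0 for o in ("signal", "quiet") for a in ("give", "keep")}

def choose(q, keys):
    """Epsilon-greedy choice among the (context, action) keys given."""
    if random.random() < EPS:
        return random.choice(keys)
    return max(keys, key=lambda k: q[k])

def payoffs(state, a_sig, a_don):
    """Inclusive-fitness payoffs for one round (simplified sketch)."""
    benefit = B_NEEDY if state == "needy" else B_HEALTHY
    sig = (benefit if a_don == "give" else 0.0) - (COST if a_sig == "signal" else 0.0)
    don = -(D_GIVE if a_don == "give" else 0.0)
    # Hamilton-style weighting: each agent also values the other's payoff by K.
    return sig + K * don, don + K * sig

random.seed(0)
for _ in range(20_000):
    state = "needy" if random.random() < P_NEEDY else "healthy"
    a_sig = choose(q_sig, [(state, "signal"), (state, "quiet")])[1]
    a_don = choose(q_don, [(a_sig, "give"), (a_sig, "keep")])[1]
    r_sig, r_don = payoffs(state, a_sig, a_don)
    # One-shot interaction: Q-update with no successor state, so no
    # discount term appears inside the update itself.
    q_sig[(state, a_sig)] += ALPHA * (r_sig - q_sig[(state, a_sig)])
    q_don[(a_sig, a_don)] += ALPHA * (r_don - q_don[(a_sig, a_don)])

# Greedy donor policy learned for each observed signal
policy = {o: max(("give", "keep"), key=lambda a: q_don[(o, a)]) for o in ("signal", "quiet")}
print(policy)
```

Under a payoff configuration like this, one can check whether the learned donor policy conditions on the signal (honest signalling) or gives regardless of it (the proactive-prosociality outcome the paper reports in most scenarios); which one emerges depends on the chosen payoffs.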

Suggested Citation

  • Olivia Macmillan-Scott & Mirco Musolesi, 2025. "Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour," PLOS Computational Biology, Public Library of Science, vol. 21(8), pages 1-18, August.
  • Handle: RePEc:plo:pcbi00:1013302
    DOI: 10.1371/journal.pcbi.1013302

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013302
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013302&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1013302?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pcbi00:1013302. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

We have no bibliographic references for this item. You can help add them by using this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: ploscompbiol (email available below). General contact details of provider: https://journals.plos.org/ploscompbiol/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.