Evolutionary Learning in Principal/Agent Models

Listed author(s):
  • Jasmina Arifovic
  • Alex Karaivanov

    (Simon Fraser University)

We introduce learning based on genetic algorithms in a principal-agent model of optimal contracting under moral hazard. Applications corresponding to this setting abound in finance (credit under moral hazard), public finance (optimal taxation, information-constrained insurance), development (sharecropping), mechanism design, etc. It is well known that optimal contracts in principal-agent problems with risk-averse agents, unobserved labor effort and stochastic technology can take complicated forms due to the trade-off between provision of incentives and insurance. The optimal contract typically depends on both parties' preferences, the properties of the technology and the stochastic properties of the endowment/income process. The existing literature typically assumes that actions undertaken by the agent are unobserved by the principal, while the principal has perfect knowledge of things that are realistically much harder (or at least as hard) to know or observe, such as the agent's preferences and decision-making process. To our knowledge, few models exist of how the principal acquires this information. A possible solution that we explore is to explicitly model the principal's learning process about the agent's preferences and/or the production technology based only on observable information such as output realizations (or, in general, messages about them). For simplicity we assume a repeated one-period contracting framework in an output-sharing model which can be thought of as a sharecropping or equity arrangement. An asset owner (principal) contracts with an agent to produce jointly. The principal supplies the asset while the agent supplies unobservable labor effort. Output is stochastic and the probability of a given realization depends on the agent's effort. The principal wants to design and implement an optimal compensation scheme for the agent that maximizes profits subject to a participation constraint for the agent.
Our primary goal is to investigate whether commonly used evolutionary algorithms converge to the underlying optimal contract under full rationality as studied by the mechanism design literature (e.g. Hart and Holmstrom, 1986) and, if so, how much time is needed. If, on the other hand, the optimal contract is never reached, we are interested in whether the learning process instead converges to some simple "rule of thumb" policy of the kind often observed in reality. The exercise we perform can be evaluated from two opposing points of view, depending on the reader's preferences. If commonly used learning algorithms fail to converge to the optimal contract in our simple framework, one can interpret this as raising serious concerns about their applicability; alternatively, if we believe that people use such algorithms to learn, it can be interpreted as theory getting too far ahead of reality. Evolutionary algorithms such as genetic algorithms, classifier systems, genetic programming, evolutionary programming, etc. have been widely used in economic applications (see Arifovic, 2000 for a survey of applications in macroeconomics; LeBaron, 1999 for applications in finance; and Dawid, 1999 for a general overview). Many of these applications focus on models of social learning where a population of agents (each represented by a single strategy) evolves over time such that the entire population jointly implements a behavioral algorithm. In other applications (e.g. Arifovic, 1994; Marimon, McGrattan, and Sargent, 1989; Vriend, 2000) genetic algorithms are used in models of individual learning, where evolution takes place on a set of strategies belonging to an individual agent. We investigate the implications of both social and individual learning. First, we study a social learning model where agents update their strategies by imitating the strategies of agents who performed better in the past and by occasionally experimenting with new strategies.
Evidence for such behavior in learning about new technologies exists, for example, in the development literature (Udry, 1994). We also study individual evolutionary learning (Arifovic and Ledyard, 2003) where agents learn only from their own experience. In each time period the agent chooses probabilistically one of the strategies from her set and implements it. The foregone payoffs of all strategies are updated based on the observed outcomes. The strategy set is then updated by reinforcing the frequencies of strategies with relatively high payoffs and by adding new strategies. We also perform various robustness checks, varying the parameters of the learning algorithms, and study the speed of convergence. The results show that social learning converges to the optimal contract under full rationality while individual learning fails. The intuition for the failure of individual learning is that, when evaluating foregone payoffs of potential strategies that have not been tried, the principal assumes that the agent's action will remain constant (as if the two parties played Nash), while in reality the optimal contract involves an optimal response to the agent's best response function, as in a Stackelberg setting. The inability of individual learning to produce correct payoffs for the principal's strategies undermines its convergence to the optimal profit-maximizing contract. In contrast, social learning involves evaluating only payoffs of strategies that have actually been implemented by some principals in the economy, thus circumventing the above problem. This failure of the model of individual learning where foregone payoffs are taken into account is in stark contrast to the findings reported in the existing literature.
Various studies (see, for example, Camerer and Ho, 1998; Camerer, 2003; Arifovic and Ledyard, 2004) find that the performance of these models, when evaluated against evidence from experiments with human subjects, is superior to the performance of the learning models where only actual strategy payoffs are taken into account.
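
The contrast between evaluating untried contracts under a fixed-effort assumption and evaluating only contracts actually in use can be illustrated with a minimal toy sketch. It is not the authors' model or algorithm: the risk-neutral agent, the quadratic effort cost, the binary output, the linear share, the grid, the absence of a participation constraint, the use of expected rather than realized profits, and all function names and parameter values are assumptions made purely for illustration.

```python
import random

# Toy illustration only. Assumptions not taken from the paper: a risk-neutral
# agent, effort cost e**2 / 2, output 1 with probability e (else 0), a linear
# output share paid to the agent, and no participation constraint.

GRID = [i / 20 for i in range(21)]  # candidate shares 0.00, 0.05, ..., 1.00

def agent_effort(share):
    """Agent's best response: maximize share * e - e**2 / 2 over e in [0, 1]."""
    return min(max(share, 0.0), 1.0)

def true_profit(share):
    """Principal's expected profit when the agent best-responds to the share."""
    return (1.0 - share) * agent_effort(share)

def foregone_profit(share, assumed_effort):
    """Payoff imputed to an untried share while holding effort fixed."""
    return (1.0 - share) * assumed_effort

def stackelberg_share():
    """Share that maximizes profit once the agent's response is accounted for."""
    return max(GRID, key=true_profit)

def fixed_effort_update(current_share):
    """Share that looks best if effort is wrongly assumed to stay constant."""
    effort = agent_effort(current_share)
    return max(GRID, key=lambda s: foregone_profit(s, effort))

def social_learning(pop_size=30, periods=200, p_experiment=0.1, seed=0):
    """Imitation plus experimentation, scoring only shares actually in use."""
    rng = random.Random(seed)
    shares = [rng.choice(GRID) for _ in range(pop_size)]
    for _ in range(periods):
        new_shares = []
        for own in shares:
            other = rng.choice(shares)
            # Imitate a randomly met principal only if it earned more.
            chosen = other if true_profit(other) > true_profit(own) else own
            # Occasionally experiment with a random share from the grid.
            if rng.random() < p_experiment:
                chosen = rng.choice(GRID)
            new_shares.append(chosen)
        shares = new_shares
    return shares
```

In this toy setup `stackelberg_share()` selects a share of 0.5, whereas `fixed_effort_update` always selects a share of 0, because holding effort fixed makes every reduction in the agent's share look profitable; at that share the agent supplies no effort and true profit is zero. The `social_learning` loop only compares shares that some principal currently implements, so each comparison reflects the agent's actual response.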


Paper provided by Society for Computational Economics in its series Computing in Economics and Finance 2006 with number 140.

Date of creation: 04 Jul 2006
Handle: RePEc:sce:scecfa:140
