Printed from https://ideas.repec.org/p/arx/papers/2508.16245.html

Limit-Computable Grains of Truth for Arbitrary Computable Extensive-Form (Un)Known Games

Author

Listed:
  • Cole Wyeth
  • Marcus Hutter
  • Jan Leike
  • Jessica Taylor

Abstract

A Bayesian player acting in an infinite multi-player game learns to predict the other players' strategies if his prior assigns positive probability to their play (or contains a grain of truth). Kalai and Lehrer's classic grain of truth problem is to find a reasonably large class of strategies that contains the Bayes-optimal policies with respect to this class, allowing mutually-consistent beliefs about strategy choice that obey the rules of Bayesian inference. Only small classes are known to have a grain of truth and the literature contains several related impossibility results. In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of strategies wide enough to contain all computable strategies as well as Bayes-optimal strategies for every reasonable prior over the class. When the "environment" is a known repeated stage game, we show convergence in the sense of [KL93a] and [KL93b]. When the environment is unknown, agents using Thompson sampling converge to play $\varepsilon$-Nash equilibria in arbitrary unknown computable multi-agent environments. Finally, we include an application to self-predictive policies that avoid planning. While these results use computability theory only as a conceptual tool to solve a classic game theory problem, we show that our solution can naturally be computationally approximated arbitrarily closely.
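The role of the grain of truth can be illustrated with a deliberately tiny example. The sketch below is not the paper's construction (the class described in the abstract contains all computable strategies); it assumes a hypothetical finite class of three noisy opponent strategies, and the names `STRATEGIES`, `update`, and `predict_one` are invented for illustration. It shows only the basic mechanism: when the prior puts positive weight on the strategy actually being played, the Bayesian posterior concentrates on it, and when that weight is zero, no amount of data can recover it.

```python
from typing import Callable, Dict, List

History = List[int]
# A strategy maps the opponent's own past actions to the probability of playing action 1.
Strategy = Callable[[History], float]

# Hypothetical toy class of three slightly noisy strategies (illustrative assumption).
STRATEGIES: Dict[str, Strategy] = {
    "mostly_zero": lambda h: 0.05,
    "mostly_one": lambda h: 0.95,
    "alternate": lambda h: 0.95 if len(h) % 2 == 1 else 0.05,
}


def update(prior: Dict[str, float], observed: List[int]) -> Dict[str, float]:
    """Return the Bayesian posterior over STRATEGIES after the observed actions."""
    weights = dict(prior)
    history: History = []
    for action in observed:
        for name, strategy in STRATEGIES.items():
            p_one = strategy(history)
            # Multiply each weight by the likelihood of the observed action.
            weights[name] *= p_one if action == 1 else 1.0 - p_one
        total = sum(weights.values())
        weights = {name: w / total for name, w in weights.items()}
        history.append(action)
    return weights


def predict_one(weights: Dict[str, float], history: History) -> float:
    """Mixture probability that the opponent's next action is 1."""
    return sum(w * STRATEGIES[name](history) for name, w in weights.items())


# The opponent's most likely play under "alternate": 0, 1, 0, 1, ...
observed = [t % 2 for t in range(20)]

with_grain = update({"mostly_zero": 0.4, "mostly_one": 0.4, "alternate": 0.2}, observed)
without_grain = update({"mostly_zero": 0.5, "mostly_one": 0.5, "alternate": 0.0}, observed)

print("posterior with a grain of truth:   ", with_grain)
print("posterior without a grain of truth:", without_grain)
print("next-action prediction (with):     ", predict_one(with_grain, observed))
print("next-action prediction (without):  ", predict_one(without_grain, observed))
```

With a grain of truth the posterior weight on `alternate` approaches one and the prediction tracks the opponent's actual behaviour; without it, the prior weight of zero stays at zero and the prediction stays uninformative.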

Suggested Citation

  • Cole Wyeth & Marcus Hutter & Jan Leike & Jessica Taylor, 2025. "Limit-Computable Grains of Truth for Arbitrary Computable Extensive-Form (Un)Known Games," Papers 2508.16245, arXiv.org.
  • Handle: RePEc:arx:papers:2508.16245

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2508.16245
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Dean Foster & H Peyton Young, 1999. "On the Impossibility of Predicting the Behavior of Rational Agents," Economics Working Paper Archive 423, The Johns Hopkins University, Department of Economics, revised Jun 2001.
    2. John H. Nachbar, 2005. "Beliefs in Repeated Games," Econometrica, Econometric Society, vol. 73(2), pages 459-480, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Norman, Thomas W.L., 2022. "The possibility of Bayesian learning in repeated games," Games and Economic Behavior, Elsevier, vol. 136(C), pages 142-152.
    2. Burkhard C. Schipper, 2022. "Strategic Teaching and Learning in Games," American Economic Journal: Microeconomics, American Economic Association, vol. 14(3), pages 321-352, August.
    3. Sami Al-Suwailem, 2012. "Complexity and Endogenous Instability," ASSRU Discussion Papers 1203, ASSRU - Algorithmic Social Science Research Unit.
    4. Chernov, G. & Susin, I., 2019. "Models of learning in games: An overview," Journal of the New Economic Association, New Economic Association, vol. 44(4), pages 77-125.
    5. Norman, Thomas W.L., 2015. "Learning, hypothesis testing, and rational-expectations equilibrium," Games and Economic Behavior, Elsevier, vol. 90(C), pages 93-105.
    6. Al-Suwailem, Sami, 2014. "Complexity and endogenous instability," Research in International Business and Finance, Elsevier, vol. 30(C), pages 393-410.
    7. Dean P Foster & Peyton Young, 2006. "Regret Testing Leads to Nash Equilibrium," Levine's Working Paper Archive 784828000000000676, David K. Levine.
    8. Burkhard Schipper, 2015. "Strategic teaching and learning in games," Working Papers 151, University of California, Davis, Department of Economics.
    9. Pop Gabriel & Milencianu Mircea & Pop Alexandra, 2025. "The Second Axelrod Tournament: A Monte Carlo Exploration of Uncertainty About the Number of Rounds in Iterated Prisoner’s Dilemma," Studia Universitatis Babeș-Bolyai Oeconomica, Sciendo, vol. 70(1), pages 67-82.
    10. Thomas Norman, 2012. "Almost-Rational Learning of Nash Equilibrium without Absolute Continuity," Economics Series Working Papers 602, University of Oxford, Department of Economics.
    11. Yakov Babichenko, 2010. "Completely Uncoupled Dynamics and Nash Equilibria," Discussion Paper Series dp529, The Federmann Center for the Study of Rationality, the Hebrew University, Jerusalem.
    12. Leoni Patrick L, 2009. "A Constructive Proof that Learning in Repeated Games Leads to Nash Equilibria," The B.E. Journal of Theoretical Economics, De Gruyter, vol. 8(1), pages 1-20, January.
    13. Jonathan Newton, 2018. "Evolutionary Game Theory: A Renaissance," Games, MDPI, vol. 9(2), pages 1-67, May.
    14. Georges, Christophre, 2006. "Learning with misspecification in an artificial currency market," Journal of Economic Behavior & Organization, Elsevier, vol. 60(1), pages 70-84, May.
    15. Tsionas, Mike G., 2023. "Bayesian learning in performance. Is there any?," European Journal of Operational Research, Elsevier, vol. 311(1), pages 263-282.
    16. Mathevet, Laurent, 2018. "An axiomatization of plays in repeated games," Games and Economic Behavior, Elsevier, vol. 110(C), pages 19-31.
    17. Thomas Norman, 2012. "Learning Within Rational-Expectations Equilibrium," Economics Series Working Papers 591, University of Oxford, Department of Economics.
    18. Joshua M. Epstein & Ross A. Hammond, 2001. "Non-Explanatory Equilibria: An Extremely Simple Game With (Mostly) Unattainable Fixed Points," Working Papers 01-08-043, Santa Fe Institute.
    19. Jindani, Sam, 2022. "Learning efficient equilibria in repeated games," Journal of Economic Theory, Elsevier, vol. 205(C).
    20. Scott E. Page, 2008. "Uncertainty, Difficulty, and Complexity," Journal of Theoretical Politics, vol. 20(2), pages 115-149, April.
