Printed from https://ideas.repec.org/p/arx/papers/2202.10678.html

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

Author

Listed:
  • Jibang Wu
  • Zixuan Zhang
  • Zhe Feng
  • Zhaoran Wang
  • Zhuoran Yang
  • Michael I. Jordan
  • Haifeng Xu

Abstract

In today's economy, it has become important for Internet platforms to consider the sequential information design problem in order to align their long-term interests with the incentives of gig service providers. This paper proposes a novel model of sequential information design, the Markov persuasion process (MPP), in which a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying priors and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. Nevertheless, at the population level, where the model is known, we can efficiently determine the optimal (resp. $\epsilon$-optimal) policy with finite (resp. infinite) states and outcomes through a modified formulation of the Bellman equation. Our main technical contribution is to study the MPP in the online reinforcement learning (RL) setting, where the goal is to learn the optimal signaling policy by interacting with the underlying MPP, without knowledge of the sender's utility functions, prior distributions, or Markov transition kernels. We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of the optimism and pessimism principles. Our algorithm is sample-efficient, achieving a sublinear $\sqrt{T}$-regret upper bound. Furthermore, both our algorithm and theory can be applied to MPPs with large outcome and state spaces via function approximation, and we showcase this in the linear setting.
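To make the persuasiveness requirement in the abstract concrete, the sketch below works through a classic one-shot Bayesian persuasion instance with a binary outcome and a single myopic receiver. It is not the paper's MPP planner or the OP4 algorithm. The function names, the threshold-style receiver, and the numbers are all assumptions chosen for illustration: the receiver acts only if the posterior probability of the good outcome reaches `threshold`, and the sender always prefers that the receiver act.

```python
from fractions import Fraction


def optimal_recommendation(prior, threshold):
    """One-shot persuasion with a binary outcome (illustrative assumption).

    prior:     probability that the outcome is "good".
    threshold: the receiver acts only if the posterior probability of
               "good" is at least this value (ties broken toward acting).

    The sender recommends "act" whenever the outcome is good, and with
    probability q when it is bad. Returns (q, probability of "act").
    """
    if not (0 < prior < 1 and 0 < threshold <= 1):
        raise ValueError("need 0 < prior < 1 and 0 < threshold <= 1")
    if prior >= threshold:
        # The receiver already acts on the prior, so no information is needed.
        q = Fraction(1)
    else:
        # Largest q that keeps the recommendation persuasive:
        # prior / (prior + (1 - prior) * q) >= threshold.
        q = prior * (1 - threshold) / ((1 - prior) * threshold)
    prob_act = prior + (1 - prior) * q
    return q, prob_act


def posterior_after_recommendation(prior, q):
    """Receiver's posterior that the outcome is good after seeing "act"."""
    return prior / (prior + (1 - prior) * q)


# Hypothetical numbers: 30% prior, receiver needs at least 50% to act.
prior = Fraction(3, 10)
threshold = Fraction(1, 2)
q, prob_act = optimal_recommendation(prior, threshold)
print(q, prob_act, posterior_after_recommendation(prior, q))
```

In this instance the persuasiveness constraint binds: the posterior after an "act" recommendation lands exactly on the receiver's threshold. An MPP, as described in the abstract, would have to satisfy a constraint of this kind at every step while also accounting for how each step affects the sender's future utility.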

Suggested Citation

  • Jibang Wu & Zixuan Zhang & Zhe Feng & Zhaoran Wang & Zhuoran Yang & Michael I. Jordan & Haifeng Xu, 2022. "Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning," Papers 2202.10678, arXiv.org.
  • Handle: RePEc:arx:papers:2202.10678

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2202.10678
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Dirk Bergemann & Stephen Morris, 2019. "Information Design: A Unified Perspective," Journal of Economic Literature, American Economic Association, vol. 57(1), pages 44-95, March.
    2. Goldstein, Itay & Leitner, Yaron, 2018. "Stress tests and information disclosure," Journal of Economic Theory, Elsevier, vol. 177(C), pages 34-69.
    3. Emir Kamenica & Matthew Gentzkow, 2011. "Bayesian Persuasion," American Economic Review, American Economic Association, vol. 101(6), pages 2590-2615, October.
    4. Giannoccaro, Ilaria & Pontrandolfo, Pierpaolo, 2002. "Inventory management in supply chains: a reinforcement learning approach," International Journal of Production Economics, Elsevier, vol. 78(2), pages 153-161, July.
    5. Renault, Jérôme & Solan, Eilon & Vieille, Nicolas, 2017. "Optimal dynamic information provision," Games and Economic Behavior, Elsevier, vol. 104(C), pages 329-349.

    Citations



    Cited by:

    1. Krishnamurthy Iyer & Haifeng Xu & You Zu, 2023. "Markov Persuasion Processes with Endogenous Agent Beliefs," Papers 2307.03181, arXiv.org, revised Jul 2023.
    2. Siyu Chen & Jibang Wu & Yifan Wu & Zhuoran Yang, 2023. "Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model," Papers 2303.08613, arXiv.org, revised Aug 2023.
    3. Natalie Collina & Aaron Roth & Han Shao, 2023. "Efficient Prior-Free Mechanisms for No-Regret Agents," Papers 2311.07754, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Miltiadis Makris & Ludovic Renou, 2018. "Information design in multi-stage games," Working Papers 861, Queen Mary University of London, School of Economics and Finance.
    2. Escudé, Matteo & Sinander, Ludvig, 2023. "Slow persuasion," Theoretical Economics, Econometric Society, vol. 18(1), January.
      • Matteo Escudé & Ludvig Sinander, 2019. "Slow persuasion," Papers 1903.09055, arXiv.org, revised Apr 2022.
    3. Ozan Candogan & Philipp Strack, 2021. "Optimal Disclosure of Information to a Privately Informed Receiver," Papers 2101.10431, arXiv.org, revised Jan 2022.
    4. Gu, Jiadong, 2023. "Optimal stress tests and liquidation cost," Journal of Economic Dynamics and Control, Elsevier, vol. 146(C).
    5. Leitner, Yaron & Yilmaz, Bilge, 2019. "Regulating a model," Journal of Financial Economics, Elsevier, vol. 131(2), pages 251-268.
    6. Saed Alizamir & Francis de Véricourt & Shouqiang Wang, 2020. "Warning Against Recurring Risks: An Information Design Approach," Management Science, INFORMS, vol. 66(10), pages 4612-4629, October.
    7. Parakhonyak, Alexei & Vikander, Nick, 2023. "Information design through scarcity and social learning," Journal of Economic Theory, Elsevier, vol. 207(C).
    8. Farzaneh Farhadi & Demosthenis Teneketzis, 2022. "Dynamic Information Design: A Simple Problem on Optimal Sequential Information Disclosure," Dynamic Games and Applications, Springer, vol. 12(2), pages 443-484, June.
    9. Babichenko, Yakov & Talgam-Cohen, Inbal & Xu, Haifeng & Zabarnyi, Konstantin, 2022. "Regret-minimizing Bayesian persuasion," Games and Economic Behavior, Elsevier, vol. 136(C), pages 226-248.
    10. Emir Kamenica & Kyungmin Kim & Andriy Zapechelnyuk, 2021. "Bayesian persuasion and information design: perspectives and open issues," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 72(3), pages 701-704, October.
    11. Koessler, Frederic & Laclau, Marie & Renault, Jérôme & Tomala, Tristan, 2022. "Long information design," Theoretical Economics, Econometric Society, vol. 17(2), May.
    12. Li, Fei & Song, Yangbo & Zhao, Mofei, 2023. "Global manipulation by local obfuscation," Journal of Economic Theory, Elsevier, vol. 207(C).
    13. Aleksei Smirnov & Egor Starkov, 2019. "Timing of predictions in dynamic cheap talk: experts vs. quacks," ECON - Working Papers 334, Department of Economics - University of Zurich.
    14. Eduardo Perez‐Richet & Vasiliki Skreta, 2022. "Test Design Under Falsification," Econometrica, Econometric Society, vol. 90(3), pages 1109-1142, May.
    15. Zhao, Wei & Mezzetti, Claudio & Renou, Ludovic & Tomala, Tristan, 0. "Contracting over persistent information," Theoretical Economics, Econometric Society.
    16. Isaiah Andrews & Jesse M. Shapiro, 2021. "A Model of Scientific Communication," Econometrica, Econometric Society, vol. 89(5), pages 2117-2142, September.
    17. Goldstein, Itay & Leitner, Yaron, 2018. "Stress tests and information disclosure," Journal of Economic Theory, Elsevier, vol. 177(C), pages 34-69.
    18. Ding, Haina & Guembel, Alexander & Ozanne, Alessio, 2020. "Market Information in Banking Supervision: The Role of Stress Test Design," TSE Working Papers 20-1144, Toulouse School of Economics (TSE).
    19. Shih-Tang Su & Vijay G. Subramanian & Grant Schoenebeck, 2021. "Bayesian Persuasion in Sequential Trials," Papers 2110.09594, arXiv.org, revised Nov 2021.
    20. Negrelli, Sara, 2020. "Bubbles and persuasion with uncertainty over market sentiment," Games and Economic Behavior, Elsevier, vol. 120(C), pages 67-85.
