Multi-Agent Reinforcement Learning for Dynamic Pricing in Supply Chains: Benchmarking Strategic Agent Behaviours under Realistically Simulated Market Conditions

Author

Listed:

Thomas Hazenberg
Yao Ma
Seyed Sahand Mohammadi Ziabari
Marijn van Rijswijk

Abstract

This study investigates how Multi-Agent Reinforcement Learning (MARL) can improve dynamic pricing strategies in supply chains, particularly in contexts where traditional ERP systems rely on static, rule-based approaches that overlook strategic interactions among market actors. While recent research has applied reinforcement learning to pricing, most implementations remain single-agent and fail to model the interdependent nature of real-world supply chains. This study addresses that gap by evaluating three MARL algorithms (MADDPG, MADQN, and QMIX) against static rule-based baselines in a simulated environment informed by real e-commerce transaction data and a LightGBM demand prediction model. Results show that rule-based agents achieve near-perfect fairness (Jain's Index: 0.9896) and the highest price stability (volatility: 0.024), but lack competitive dynamics entirely. Among MARL agents, MADQN exhibits the most aggressive pricing behaviour, with the highest volatility and the lowest fairness (0.5844). MADDPG provides a more balanced approach, supporting market competition (share volatility: 9.5 pp) while maintaining relatively high fairness (0.8819) and stable pricing. These findings suggest that MARL introduces emergent strategic behaviour not captured by static pricing rules and may inform future developments in dynamic pricing.
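
The fairness figures in the abstract are reported as Jain's Index. As a minimal sketch of how such a figure can be computed, the Python snippet below implements the standard definition, J = (sum of x_i)^2 / (n * sum of x_i^2), which ranges from 1/n (one agent takes everything) to 1 (all agents receive the same outcome). The abstract does not say which per-agent quantity the index is applied to, so the use of revenues here is an assumption, and the function name `jain_index` and the sample values are hypothetical rather than taken from the paper.

```python
from typing import Sequence


def jain_index(values: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one agent outcome")
    if any(v < 0 for v in values):
        raise ValueError("outcomes must be non-negative")
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    if sum_sq == 0:
        raise ValueError("index is undefined when every outcome is zero")
    return (total * total) / (n * sum_sq)


# Hypothetical per-agent revenues (assumption), not data from the paper.
equal_split = [100.0, 100.0, 100.0, 100.0]
one_winner = [400.0, 0.0, 0.0, 0.0]

print(jain_index(equal_split))  # 1.0: perfectly even outcomes
print(jain_index(one_winner))   # 0.25: equals 1/n when one agent takes everything
```

Read against these bounds, a value near 0.99 indicates almost identical outcomes across agents, while a value near 0.58 indicates that outcomes are concentrated in a subset of agents.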

Suggested Citation

Thomas Hazenberg & Yao Ma & Seyed Sahand Mohammadi Ziabari & Marijn van Rijswijk, 2025. "Multi-Agent Reinforcement Learning for Dynamic Pricing in Supply Chains: Benchmarking Strategic Agent Behaviours under Realistically Simulated Market Conditions," Papers 2507.02698, arXiv.org.

Handle: RePEc:arx:papers:2507.02698

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Lavička, Hynek & Kracík, Jiří, 2020. "Fluctuation analysis of electric power loads in Europe: Correlation multifractality vs. Distribution function multifractality," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
Smyl, Slawek, 2020. "A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting," International Journal of Forecasting, Elsevier, vol. 36(1), pages 75-85.
Mina Baliamoune-Lutz, 2004. "On the Measurement of Human Well-being: Fuzzy Set Theory and Sen's Capability Approach," WIDER Working Paper Series RP2004-16, World Institute for Development Economic Research (UNU-WIDER).
Oded Stark & Wiktor Budzinski, 2021. "A social‐psychological reconstruction of Amartya Sen’s measures of inequality and social welfare," Kyklos, Wiley Blackwell, vol. 74(4), pages 552-566, November.
- Stark, Oded & Budzinski, Wiktor, "undated". "A social-psychological reconstruction of Amartya Sen’s measures of inequality and social welfare," Discussion Papers 313522, University of Bonn, Center for Development Research (ZEF).
- Stark, Oded & Budzinski, Wiktor, 2021. "A Social-Psychological Reconstruction of Amartya Sen's Measures of Inequality and Social Welfare," IZA Discussion Papers 14761, Institute of Labor Economics (IZA).
- Stark, Oded & Budzinski, Wiktor, 2021. "A social-psychological reconstruction of Amartya Sen's measures of inequality and social welfare," University of Tübingen Working Papers in Business and Economics 151, University of Tuebingen, Faculty of Economics and Social Sciences, School of Business and Economics.
Miguel Ángel Rodríguez López & Diego Rodríguez Rodríguez, 2024. "La aplicación de datos masivos en economía de la energía: una revisión," Working Papers 2024-08, FEDEA.
Billé, Anna Gloria & Gianfreda, Angelica & Del Grosso, Filippo & Ravazzolo, Francesco, 2023. "Forecasting electricity prices with expert, linear, and nonlinear models," International Journal of Forecasting, Elsevier, vol. 39(2), pages 570-586.
Uniejewski, Bartosz & Maciejowska, Katarzyna, 2023. "LASSO principal component averaging: A fully automated approach for point forecast pooling," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1839-1852.
- Bartosz Uniejewski & Katarzyna Maciejowska, 2022. "LASSO Principal Component Averaging -- a fully automated approach for point forecast pooling," Papers 2207.04794, arXiv.org.
Benedikt Finnah, 2022. "Optimal bidding functions for renewable energies in sequential electricity markets," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 44(1), pages 1-27, March.
Soysal, Emilie Rosenlund, 2025. "Market-based wind power investments under financial frictions," Applied Energy, Elsevier, vol. 391(C).
Kopalle, Praveen K. & Pauwels, Koen & Akella, Laxminarayana Yashaswy & Gangwar, Manish, 2023. "Dynamic pricing: Definition, implications for managers, and future research directions," Journal of Retailing, Elsevier, vol. 99(4), pages 580-593.
Raza, Syed Ali & Masood, Amna & Benkraiem, Ramzi & Urom, Christian, 2023. "Forecasting the volatility of precious metals prices with global economic policy uncertainty in pre and during the COVID-19 period: Novel evidence from the GARCH-MIDAS approach," Energy Economics, Elsevier, vol. 120(C).
- Syed Ali Raza & Amna Masood & Ramzi Benkraiem & Christian Urom, 2023. "Forecasting the volatility of precious metals prices with global economic policy uncertainty in pre and during the COVID-19 period: Novel evidence from the GARCH-MIDAS approach," Post-Print hal-04080872, HAL.
Yajing Gao & Xiaojie Zhou & Jiafeng Ren & Zheng Zhao & Fushen Xue, 2018. "Electricity Purchase Optimization Decision Based on Data Mining and Bayesian Game," Energies, MDPI, vol. 11(5), pages 1-19, April.
Afanasyev, Dmitriy O. & Fedorova, Elena A., 2019. "On the impact of outlier filtering on the electricity price forecasting accuracy," Applied Energy, Elsevier, vol. 236(C), pages 196-210.
Simon Pezzutto & Gianluca Grilli & Stefano Zambotti & Stefan Dunjic, 2018. "Forecasting Electricity Market Price for End Users in EU28 until 2020—Main Factors of Influence," Energies, MDPI, vol. 11(6), pages 1-18, June.
Hossein Hassani & Emmanuel Sirimal Silva & Rangan Gupta & Mawuli K. Segnon, 2015. "Forecasting the price of gold," Applied Economics, Taylor & Francis Journals, vol. 47(39), pages 4141-4152, August.
- Hossein Hassani & Emmanuel Sirimal Silva & Rangan Gupta & Mawuli K. Segnon, 2014. "Forecasting the Price of Gold," Working Papers 201428, University of Pretoria, Department of Economics.
Zhang, Hong & Nguyen, Hoang & Bui, Xuan-Nam & Pradhan, Biswajeet & Mai, Ngoc-Luan & Vu, Diep-Anh, 2021. "Proposing two novel hybrid intelligence models for forecasting copper price based on extreme learning machine and meta-heuristic algorithms," Resources Policy, Elsevier, vol. 73(C).
Goodarzi, Shadi & Perera, H. Niles & Bunn, Derek, 2019. "The impact of renewable energy forecast errors on imbalance volumes and electricity spot prices," Energy Policy, Elsevier, vol. 134(C).
Xiaofeng Lv & Gupeng Zhang & Guangyu Ren, 2017. "Gini index estimation for lifetime data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(2), pages 275-304, April.
Grzegorz Marcjasz & Tomasz Serafin & Rafał Weron, 2018. "Selection of Calibration Windows for Day-Ahead Electricity Price Forecasting," Energies, MDPI, vol. 11(9), pages 1-20, September.
- Grzegorz Marcjasz & Tomasz Serafin & Rafal Weron, 2018. "Selection of calibration windows for day-ahead electricity price forecasting," HSC Research Reports HSC/18/06, Hugo Steinhaus Center, Wroclaw University of Science and Technology.
Wilkinson, Sam & Maticka, Martin J. & Liu, Yue & John, Michele, 2021. "The duck curve in a drying pond: The impact of rooftop PV on the Western Australian electricity market transition," Utilities Policy, Elsevier, vol. 71(C).

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2025-08-25 (Computational Economics)
NEP-GTH-2025-08-25 (Game Theory)

