Printed from https://ideas.repec.org/p/arx/papers/2604.14059.html

A Comparative Study of Dynamic Programming and Reinforcement Learning in Finite Horizon Dynamic Pricing

Author

Listed:
  • Lev Razumovskiy
  • Nikolay Karenin

Abstract

This paper provides a systematic comparison between Fitted Dynamic Programming (DP), where demand is estimated from data, and Reinforcement Learning (RL) methods in finite-horizon dynamic pricing problems. We analyze their performance across environments of increasing structural complexity, ranging from a single-typology benchmark to multi-typology settings with heterogeneous demand and inter-temporal revenue constraints. Unlike simplified comparisons that restrict DP to low-dimensional settings, we apply dynamic programming in richer, multi-dimensional environments with multiple product types and constraints. We evaluate revenue performance, stability, constraint-satisfaction behavior, and computational scaling, highlighting the trade-offs between explicit expectation-based optimization and trajectory-based learning.
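
As a rough illustration of what "explicit expectation-based optimization" means in a finite-horizon pricing problem, the sketch below runs backward induction on a toy model. It is not the authors' implementation: the single product type, the one-customer-per-period Bernoulli demand, the price grid, and the names `solve_pricing_dp` and `logistic_buy_prob` (with its parameters `a` and `b`) are all assumptions made for this example. In a fitted-DP setting, `buy_prob` is where a demand model estimated from data would be plugged in.

```python
import math


def solve_pricing_dp(prices, buy_prob, horizon, inventory):
    """Backward induction for a toy finite-horizon pricing problem.

    V[t][x] is the expected revenue-to-go at period t with x units left.
    Assumption for illustration: in each period at most one customer
    arrives and buys one unit with probability buy_prob(p) at price p.
    """
    # Terminal condition: no revenue can be earned after the horizon.
    V = [[0.0] * (inventory + 1) for _ in range(horizon + 1)]
    policy = [[None] * (inventory + 1) for _ in range(horizon)]

    for t in range(horizon - 1, -1, -1):
        # With x = 0 nothing can be sold, so V[t][0] stays 0.0.
        for x in range(1, inventory + 1):
            best_value, best_price = -math.inf, None
            for p in prices:
                q = buy_prob(p)
                # Expectation over the two outcomes: sale or no sale.
                value = q * (p + V[t + 1][x - 1]) + (1.0 - q) * V[t + 1][x]
                if value > best_value:
                    best_value, best_price = value, p
            V[t][x] = best_value
            policy[t][x] = best_price
    return V, policy


def logistic_buy_prob(p, a=2.0, b=0.5):
    """Hypothetical demand model: purchase probability falls with price."""
    return 1.0 / (1.0 + math.exp(-(a - b * p)))


prices = [2.0, 4.0, 6.0, 8.0]
V, policy = solve_pricing_dp(prices, logistic_buy_prob, horizon=10, inventory=3)
```

Here `V[0][inventory]` is the expected revenue of the computed policy under the assumed demand model, and `policy[t][x]` is the price chosen at period `t` with `x` units left. A trajectory-based RL method would instead estimate these quantities from sampled episodes rather than from the explicit expectation inside the loop.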

Suggested Citation

  • Lev Razumovskiy & Nikolay Karenin, 2026. "A Comparative Study of Dynamic Programming and Reinforcement Learning in Finite Horizon Dynamic Pricing," Papers 2604.14059, arXiv.org.
  • Handle: RePEc:arx:papers:2604.14059

    Download full text from publisher

    File URL: https://arxiv.org/pdf/2604.14059
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Vivek F. Farias & Benjamin Van Roy, 2010. "Dynamic Pricing with a Prior on Market Response," Operations Research, INFORMS, vol. 58(1), pages 16-29, February.
    2. Guillermo Gallego & Garrett van Ryzin, 1994. "Optimal Dynamic Pricing of Inventories with Stochastic Demand over Finite Horizons," Management Science, INFORMS, vol. 40(8), pages 999-1020, August.
    3. Alexander Kastius & Rainer Schlosser, 2022. "Dynamic pricing under competition using reinforcement learning," Journal of Revenue and Pricing Management, Palgrave Macmillan, vol. 21(1), pages 50-63, February.
    4. Omar Besbes & Assaf Zeevi, 2009. "Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms," Operations Research, INFORMS, vol. 57(6), pages 1407-1420, December.
    5. Rupal Rana & Fernando S. Oliveira, 2014. "Real-time dynamic pricing in a non-stationary environment using model-free reinforcement learning," Omega, Elsevier, vol. 47(C), pages 116-126.
    6. Arnoud V. den Boer & Bert Zwart, 2014. "Simultaneously Learning and Optimizing Using Controlled Variance Pricing," Management Science, INFORMS, vol. 60(3), pages 770-783, March.
    7. Fabian Lange & Leonard Dreessen & Rainer Schlosser, 2025. "Reinforcement learning versus data-driven dynamic programming: a comparison for finite horizon dynamic pricing markets," Journal of Revenue and Pricing Management, Palgrave Macmillan, vol. 24(6), pages 584-600, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Athanassios N. Avramidis & Arnoud V. Boer, 2021. "Dynamic pricing with finite price sets: a non-parametric approach," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 94(1), pages 1-34, August.
    2. Athanassios N. Avramidis, 2020. "A pricing problem with unknown arrival rate and price sensitivity," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 92(1), pages 77-106, August.
    3. Arnoud V. den Boer, 2015. "Tracking the market: Dynamic pricing and learning in a changing environment," European Journal of Operational Research, Elsevier, vol. 247(3), pages 914-927.
    4. Arnoud V. den Boer & N. Bora Keskin, 2020. "Discontinuous Demand Functions: Estimation and Pricing," Management Science, INFORMS, vol. 66(10), pages 4516-4534, October.
    5. Zizhuo Wang & Shiming Deng & Yinyu Ye, 2014. "Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems," Operations Research, INFORMS, vol. 62(2), pages 318-331, April.
    6. Yiwei Chen & Vivek F. Farias, 2013. "Simple Policies for Dynamic Pricing with Imperfect Forecasts," Operations Research, INFORMS, vol. 61(3), pages 612-624, June.
    7. Boxiao Chen & Xiuli Chao & Cong Shi, 2021. "Nonparametric Learning Algorithms for Joint Pricing and Inventory Control with Lost Sales and Censored Demand," Mathematics of Operations Research, INFORMS, vol. 46(2), pages 726-756, May.
    8. Xiaocheng Li & Zeyu Zheng, 2024. "Dynamic Pricing with External Information and Inventory Constraint," Management Science, INFORMS, vol. 70(9), pages 5985-6001, September.
    9. Christiane Barz & Jochen Gönsch & Davina Rauhaus & Siqi He, 2025. "Dynamic pricing with (extra) seat reservations under the nested logit model," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 47(4), pages 1133-1179, December.
    10. Stefanus Jasin, 2014. "Reoptimization and Self-Adjusting Price Control for Network Revenue Management," Operations Research, INFORMS, vol. 62(5), pages 1168-1178, October.
    11. Yonatan Gur & Gregory Macnamara & Daniela Saban, 2020. "On the Disclosure of Promotion Value in Platforms with Learning Sellers," Research Papers 3865, Stanford University, Graduate School of Business.
    12. Chaolin Yang & Yi Xiong, 2020. "Nonparametric advertising budget allocation with inventory constraint," European Journal of Operational Research, Elsevier, vol. 285(2), pages 631-641.
    13. Ruben Geer & Arnoud V. Boer & Christopher Bayliss & Christine S. M. Currie & Andria Ellina & Malte Esders & Alwin Haensel & Xiao Lei & Kyle D. S. Maclean & Antonio Martinez-Sykora & Asbjørn Nilsen Ris, 2019. "Dynamic pricing and learning with competition: insights from the dynamic pricing challenge at the 2017 INFORMS RM & pricing conference," Journal of Revenue and Pricing Management, Palgrave Macmillan, vol. 18(3), pages 185-203, June.
    14. Michael Nawar Ibrahim & Amir F. Atiya, 2016. "Analytical solutions to the dynamic pricing problem for time-normalized revenue," European Journal of Operational Research, Elsevier, vol. 254(2), pages 632-643.
    15. Yossi Aviv & Mike Mingcheng Wei & Fuqiang Zhang, 2019. "Responsive Pricing of Fashion Products: The Effects of Demand Learning and Strategic Consumer Behavior," Management Science, INFORMS, vol. 65(7), pages 2982-3000, July.
    16. Ningyuan Chen & Guillermo Gallego, 2021. "Nonparametric Pricing Analytics with Customer Covariates," Operations Research, INFORMS, vol. 69(3), pages 974-984, May.
    17. Omar Besbes & Assaf Zeevi, 2012. "Blind Network Revenue Management," Operations Research, INFORMS, vol. 60(6), pages 1537-1550, December.
    18. Michael N. Katehakis & Yifeng Liu & Jian Yang, 2022. "A revisit to the markup practice of irreversible dynamic pricing," Annals of Operations Research, Springer, vol. 317(1), pages 77-105, October.
    19. Gah-Yi Ban & N. Bora Keskin, 2021. "Personalized Dynamic Pricing with Machine Learning: High-Dimensional Features and Heterogeneous Elasticity," Management Science, INFORMS, vol. 67(9), pages 5549-5568, September.
    20. Ruben van de Geer & Arnoud V. den Boer & Christopher Bayliss & Christine Currie & Andria Ellina & Malte Esders & Alwin Haensel & Xiao Lei & Kyle D. S. Maclean & Antonio Martinez-Sykora & Asbjørn Nil, 2018. "Dynamic Pricing and Learning with Competition: Insights from the Dynamic Pricing Challenge at the 2017 INFORMS RM & Pricing Conference," Papers 1804.03219, arXiv.org.
