
A tractable online learning algorithm for the multinomial logit contextual bandit

Author

Listed:
  • Agrawal, Priyank
  • Tulabandhula, Theja
  • Avadhanula, Vashist

Abstract

In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem in which, in every round, a decision-maker offers a subset (assortment) of products to a consumer and observes the response. Consumers purchase products to maximize their utility. We assume that a set of attributes describes the products and that the mean utility of a product is linear in the values of these attributes. We model consumer choice behavior using the widely used Multinomial Logit (MNL) model and consider the decision-maker's problem of dynamically learning the model parameters while optimizing cumulative revenue over the selling horizon T. Though this problem has recently attracted considerable attention, many existing methods involve solving an intractable non-convex optimization problem, and their theoretical performance guarantees depend on a problem-dependent parameter that could be prohibitively large. In particular, current algorithms for this problem have regret bounded by O(√(κdT)), where κ is a problem-dependent constant that may depend exponentially on the number of attributes, d. In this paper, we propose an optimistic algorithm and show that its regret is bounded by O(√(dT) + κ), significantly improving on existing methods. Further, we propose a convex relaxation of the optimization step, which allows for tractable decision-making while retaining the favorable regret guarantee. We also demonstrate through numerical experiments that our algorithm performs robustly across varying values of κ.
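
To make the choice model in the abstract concrete, here is a minimal Python sketch of MNL choice probabilities when mean utilities are linear in product attributes, together with the expected revenue of an assortment. It is an illustration only, not the authors' algorithm: it assumes the parameter vector is already known, normalizes a no-purchase option to utility 0 (a common MNL convention that the abstract does not spell out), and finds the best assortment by brute-force enumeration. The names `mnl_choice_probabilities`, `expected_revenue`, `best_assortment`, `theta`, and `max_size` are hypothetical.

```python
from itertools import combinations

import numpy as np


def mnl_choice_probabilities(features, theta):
    """Return MNL choice probabilities for one offered assortment.

    features: (k, d) array, one attribute row per offered product.
    theta:    (d,) parameter vector (assumed known in this sketch).
    Output:   (k + 1,) array; index 0 is the no-purchase option
              (utility normalized to 0, an assumption), 1..k the products.
    """
    utilities = features @ theta                  # mean utility is linear in attributes
    logits = np.concatenate(([0.0], utilities))   # prepend the outside option
    logits = logits - logits.max()                # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def expected_revenue(features, revenues, theta):
    """Expected one-round revenue of an assortment under the MNL model."""
    probs = mnl_choice_probabilities(features, theta)
    return float(probs[1:] @ revenues)            # no-purchase earns nothing


def best_assortment(all_features, all_revenues, theta, max_size):
    """Brute-force search over assortments of size 1..max_size.

    Exponential in the number of products; for illustration only.
    """
    n = all_features.shape[0]
    best_set, best_value = (), 0.0
    for size in range(1, max_size + 1):
        for subset in combinations(range(n), size):
            idx = list(subset)
            value = expected_revenue(all_features[idx], all_revenues[idx], theta)
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value
```

In the bandit setting described above, the parameter vector is unknown and must be learned from observed purchases, so a learning algorithm would replace the fixed `theta` here with an estimate (or an optimistic surrogate) that is updated every round.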

Suggested Citation

  • Agrawal, Priyank & Tulabandhula, Theja & Avadhanula, Vashist, 2023. "A tractable online learning algorithm for the multinomial logit contextual bandit," European Journal of Operational Research, Elsevier, vol. 310(2), pages 737-750.
  • Handle: RePEc:eee:ejores:v:310:y:2023:i:2:p:737-750
    DOI: 10.1016/j.ejor.2023.02.036

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221723001832
    Download Restriction: Full text for ScienceDirect subscribers only



    References listed on IDEAS

    1. Wang, Xiaolin & Zhao, Xiujie & Liu, Bin, 2020. "Design and pricing of extended warranty menus based on the multinomial logit choice model," European Journal of Operational Research, Elsevier, vol. 287(1), pages 237-250.
    2. Flores, Alvaro & Berbeglia, Gerardo & Van Hentenryck, Pascal, 2019. "Assortment optimization under the Sequential Multinomial Logit Model," European Journal of Operational Research, Elsevier, vol. 273(3), pages 1052-1064.
    3. Alfandari, Laurent & Hassanzadeh, Alborz & Ljubić, Ivana, 2021. "An exact method for assortment optimization under the nested logit model," European Journal of Operational Research, Elsevier, vol. 291(3), pages 830-845.
    4. Grant, James A. & Szechtman, Roberto, 2021. "Filtered Poisson process bandit on a continuum," European Journal of Operational Research, Elsevier, vol. 295(2), pages 575-586.
    5. Denis Sauré & Assaf Zeevi, 2013. "Optimal Dynamic Assortment Planning with Demand Learning," Manufacturing & Service Operations Management, INFORMS, vol. 15(3), pages 387-404, July.
    6. Xu, Jianyu & Chen, Lujie & Tang, Ou, 2021. "An online algorithm for the risk-aware restless bandit," European Journal of Operational Research, Elsevier, vol. 290(2), pages 622-639.
    7. Laurent Alfandari & Alborz Hassanzadeh & Ivana Ljubić, 2021. "An Exact Method for Assortment Optimization under the Nested Logit Model," Working Papers hal-02463159, HAL.
    8. A. Gürhan Kök & Marshall L. Fisher, 2007. "Demand Estimation and Assortment Optimization Under Substitution: Methodology and Application," Operations Research, INFORMS, vol. 55(6), pages 1001-1021, December.
    9. Paat Rusmevichientong & Zuo-Jun Max Shen & David B. Shmoys, 2010. "Dynamic Assortment Optimization with a Multinomial Logit Choice Model and Capacity Constraint," Operations Research, INFORMS, vol. 58(6), pages 1666-1680, December.
    10. Paat Rusmevichientong & John N. Tsitsiklis, 2010. "Linearly Parameterized Bandits," Mathematics of Operations Research, INFORMS, vol. 35(2), pages 395-411, May.
    11. Shipra Agrawal & Vashist Avadhanula & Vineet Goyal & Assaf Zeevi, 2019. "MNL-Bandit: A Dynamic Learning Approach to Assortment Selection," Operations Research, INFORMS, vol. 67(5), pages 1453-1485, September.
    12. Timonina-Farkas, Anna & Katsifou, Argyro & Seifert, Ralf W., 2020. "Product assortment and space allocation strategies to attract loyal and non-loyal customers," European Journal of Operational Research, Elsevier, vol. 285(3), pages 1058-1076.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mehrani, Saharnaz & Sefair, Jorge A., 2022. "Robust assortment optimization under sequential product unavailability," European Journal of Operational Research, Elsevier, vol. 303(3), pages 1027-1043.
    2. Shipra Agrawal & Vashist Avadhanula & Vineet Goyal & Assaf Zeevi, 2019. "MNL-Bandit: A Dynamic Learning Approach to Assortment Selection," Operations Research, INFORMS, vol. 67(5), pages 1453-1485, September.
    3. Fernando Bernstein & A. Gürhan Kök & Lei Xie, 2015. "Dynamic Assortment Customization with Limited Inventories," Manufacturing & Service Operations Management, INFORMS, vol. 17(4), pages 538-553, October.
    4. Yining Wang & Xi Chen & Xiangyu Chang & Dongdong Ge, 2021. "Uncertainty Quantification for Demand Prediction in Contextual Dynamic Pricing," Production and Operations Management, Production and Operations Management Society, vol. 30(6), pages 1703-1717, June.
    5. Page, Kenneth & Pérez, Juan & Telha, Claudio & García-Echalar, Andrés & López-Ospina, Héctor, 2021. "Optimal bundle composition in competition for continuous attributes," European Journal of Operational Research, Elsevier, vol. 293(3), pages 1168-1187.
    6. Qiu, Jiaqing & Li, Xiangyong & Duan, Yongrui & Chen, Mengxi & Tian, Peng, 2020. "Dynamic assortment in the presence of brand heterogeneity," Journal of Retailing and Consumer Services, Elsevier, vol. 56(C).
    7. Shaoning Han & Andrés Gómez & Oleg A. Prokopyev, 2022. "Fractional 0–1 programming and submodularity," Journal of Global Optimization, Springer, vol. 84(1), pages 77-93, September.
    8. Boxiao Chen & Xiuli Chao, 2020. "Dynamic Inventory Control with Stockout Substitution and Demand Learning," Management Science, INFORMS, vol. 66(11), pages 5108-5127, November.
    9. Çömez-Dolgan, Nagihan & Fescioglu-Unver, Nilgun & Cephe, Ecem & Şen, Alper, 2021. "Capacitated strategic assortment planning under explicit demand substitution," European Journal of Operational Research, Elsevier, vol. 294(3), pages 1120-1138.
    10. Çömez-Dolgan, Nagihan & Dağ, Hilal & Fescioglu-Unver, Nilgun & Şen, Alper, 2023. "Multi-plant manufacturing assortment planning in the presence of transshipments," European Journal of Operational Research, Elsevier, vol. 310(3), pages 1033-1050.
    11. Xi Chen & Chao Shi & Yining Wang & Yuan Zhou, 2021. "Dynamic Assortment Planning Under Nested Logit Models," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 85-102, January.
    12. Mou, Shandong & Robb, David J. & DeHoratius, Nicole, 2018. "Retail store operations: Literature review and research directions," European Journal of Operational Research, Elsevier, vol. 265(2), pages 399-422.
    13. Xi Chen & Yining Wang & Yuan Zhou, 2018. "Dynamic Assortment Optimization with Changing Contextual Information," Papers 1810.13069, arXiv.org, revised Jan 2019.
    14. Hense, Jonas & Hübner, Alexander, 2022. "Assortment optimization in omni-channel retailing," European Journal of Operational Research, Elsevier, vol. 301(1), pages 124-140.
    15. Daria Dzyabura & Srikanth Jagabathula, 2018. "Offline Assortment Optimization in the Presence of an Online Channel," Management Science, INFORMS, vol. 64(6), pages 2767-2786, June.
    16. David Simchi-Levi & Rui Sun & Huanan Zhang, 2022. "Online Learning and Optimization for Revenue Management Problems with Add-on Discounts," Management Science, INFORMS, vol. 68(10), pages 7402-7421, October.
    17. Jalali, Hamed & Carmen, Raïsa & Van Nieuwenhuyse, Inneke & Boute, Robert, 2019. "Quality and pricing decisions in production/inventory systems," European Journal of Operational Research, Elsevier, vol. 272(1), pages 195-206.
    18. Kameng Nip & Zhenbo Wang & Zizhuo Wang, 2021. "Assortment Optimization under a Single Transition Choice Model," Production and Operations Management, Production and Operations Management Society, vol. 30(7), pages 2122-2142, July.
    19. Pol Boada-Collado & Victor Martínez-de-Albéniz, 2020. "Estimating and Optimizing the Impact of Inventory on Consumer Choices in a Fashion Retail Setting," Manufacturing & Service Operations Management, INFORMS, vol. 22(3), pages 582-597, May.
    20. Mika Sumida & Guillermo Gallego & Paat Rusmevichientong & Huseyin Topaloglu & James Davis, 2021. "Revenue-Utility Tradeoff in Assortment Optimization Under the Multinomial Logit Model with Totally Unimodular Constraints," Management Science, INFORMS, vol. 67(5), pages 2845-2869, May.
