MNL-Bandit: A Dynamic Learning Approach to Assortment Selection

Author

Shipra Agrawal
(Department of Industrial Engineering and Operations Research, Fu Foundation School of Engineering and Applied Science, Columbia University, New York, New York 10027)
Vashist Avadhanula
(Decision, Risk, and Operations Division, Columbia Business School, Columbia University, New York, New York 10027)
Vineet Goyal
(Department of Industrial Engineering and Operations Research, Fu Foundation School of Engineering and Applied Science, Columbia University, New York, New York 10027)
Assaf Zeevi
(Decision, Risk, and Operations Division, Columbia Business School, Columbia University, New York, New York 10027)

Abstract

We consider a dynamic assortment selection problem where in every round the retailer offers a subset (assortment) of N substitutable products to a consumer, who selects one of these products according to a multinomial logit (MNL) choice model. The retailer observes this choice, and the objective is to dynamically learn the model parameters while optimizing cumulative revenues over a selling horizon of length T. We refer to this exploration–exploitation formulation as the MNL-Bandit problem. Existing methods for this problem follow an explore-then-exploit approach, which estimates parameters to a desired accuracy and then, treating these estimates as if they are the correct parameter values, offers the optimal assortment based on these estimates. These approaches require certain a priori knowledge of “separability,” determined by the true parameters of the underlying MNL model, and this in turn is critical in determining the length of the exploration period. (Separability refers to the distinguishability of the true optimal assortment from the other suboptimal alternatives.) In this paper, we give an efficient algorithm that simultaneously explores and exploits, without a priori knowledge of any problem parameters. Furthermore, the algorithm is adaptive in the sense that its performance is near optimal in the “well-separated” case as well as the general parameter setting where this separation need not hold.
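
To make the setting in the abstract concrete, the following is a minimal Python sketch of one round of the problem: the MNL choice probabilities for an offered assortment, the expected revenue of that assortment, and a simulated consumer choice. It is not the paper's algorithm. It assumes the standard MNL parameterization in which product i has a preference weight v[i] and the no-purchase option has weight normalized to 1; the revenues r[i], the function names, the example values, and the brute-force search over assortments are all hypothetical illustrations.

```python
import random
from itertools import combinations

NO_PURCHASE = None  # hypothetical marker for the consumer buying nothing


def choice_probabilities(assortment, v):
    """MNL probabilities for an offered assortment.

    Assumes product i has weight v[i] and no-purchase has weight 1,
    so P(i | S) = v[i] / (1 + sum of v[j] for j in S).
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs[NO_PURCHASE] = 1.0 / denom
    return probs


def expected_revenue(assortment, v, r):
    """Expected per-round revenue of an assortment under known weights."""
    probs = choice_probabilities(assortment, v)
    return sum(r[i] * probs[i] for i in assortment)


def best_assortment(v, r, max_size):
    """Brute-force search over assortments of size 1..max_size.

    Only practical for small N; used here as a benchmark with known weights.
    """
    products = sorted(v)
    candidates = (
        frozenset(c)
        for k in range(1, max_size + 1)
        for c in combinations(products, k)
    )
    return max(candidates, key=lambda s: expected_revenue(s, v, r))


def simulate_choice(assortment, v, rng):
    """Draw one consumer choice (a product or NO_PURCHASE) from the MNL model."""
    probs = choice_probabilities(assortment, v)
    options = list(probs)
    weights = [probs[o] for o in options]
    return rng.choices(options, weights=weights, k=1)[0]


# Hypothetical example: three products with assumed weights and revenues.
v = {"a": 1.0, "b": 0.5, "c": 0.5}
r = {"a": 1.0, "b": 2.0, "c": 3.0}

optimal = best_assortment(v, r, max_size=3)
choice = simulate_choice(optimal, v, random.Random(0))
```

In the learning problem the weights v are unknown to the retailer, who sees only the simulated choices. The best assortment under the true weights, computed here by brute force, is the benchmark against which cumulative revenue over the horizon T is compared.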

Suggested Citation

Shipra Agrawal & Vashist Avadhanula & Vineet Goyal & Assaf Zeevi, 2019. "MNL-Bandit: A Dynamic Learning Approach to Assortment Selection," Operations Research, INFORMS, vol. 67(5), pages 1453-1485, September.

Handle: RePEc:inm:oropre:v:67:y:2019:i:5:p:1453-1485
DOI: opre.2018.1832

Citations

Cited by:

Xi Chen & Chao Shi & Yining Wang & Yuan Zhou, 2021. "Dynamic Assortment Planning Under Nested Logit Models," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 85-102, January.
Debjit Roy & Eirini Spiliotopoulou & Jelle de Vries, 2022. "Restaurant analytics: Emerging practice and research opportunities," Production and Operations Management, Production and Operations Management Society, vol. 31(10), pages 3687-3709, October.
Dipankar Das, 2023. "A Model of Competitive Assortment Planning Algorithm," Papers 2307.09479, arXiv.org.
Ilai Bistritz & Amir Leshem, 2021. "Game of Thrones: Fully Distributed Learning for Multiplayer Bandits," Mathematics of Operations Research, INFORMS, vol. 46(1), pages 159-178, February.
Yining Wang & Xi Chen & Xiangyu Chang & Dongdong Ge, 2021. "Uncertainty Quantification for Demand Prediction in Contextual Dynamic Pricing," Production and Operations Management, Production and Operations Management Society, vol. 30(6), pages 1703-1717, June.
Agrawal, Priyank & Tulabandhula, Theja & Avadhanula, Vashist, 2023. "A tractable online learning algorithm for the multinomial logit contextual bandit," European Journal of Operational Research, Elsevier, vol. 310(2), pages 737-750.
Hamsa Bastani & Mohsen Bayati & Khashayar Khosravi, 2021. "Mostly Exploration-Free Algorithms for Contextual Bandits," Management Science, INFORMS, vol. 67(3), pages 1329-1349, March.
Nathan Kallus & Madeleine Udell, 2020. "Dynamic Assortment Personalization in High Dimensions," Operations Research, INFORMS, vol. 68(4), pages 1020-1037, July.
Kris Johnson Ferreira & Joel Goh, 2021. "Assortment Rotation and the Value of Concealment," Management Science, INFORMS, vol. 67(3), pages 1489-1507, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Kameng Nip & Zhenbo Wang & Zizhuo Wang, 2021. "Assortment Optimization under a Single Transition Choice Model," Production and Operations Management, Production and Operations Management Society, vol. 30(7), pages 2122-2142, July.
Antoine Désir & Vineet Goyal & Danny Segev & Chun Ye, 2020. "Constrained Assortment Optimization Under the Markov Chain–based Choice Model," Management Science, INFORMS, vol. 66(2), pages 698-721, February.
Xi Chen & Chao Shi & Yining Wang & Yuan Zhou, 2021. "Dynamic Assortment Planning Under Nested Logit Models," Production and Operations Management, Production and Operations Management Society, vol. 30(1), pages 85-102, January.
Ali Aouad & Danny Segev, 2021. "Display Optimization for Vertically Differentiated Locations Under Multinomial Logit Preferences," Management Science, INFORMS, vol. 67(6), pages 3519-3550, June.
Strauss, Arne K. & Klein, Robert & Steinhardt, Claudius, 2018. "A review of choice-based revenue management: Theory and methods," European Journal of Operational Research, Elsevier, vol. 271(2), pages 375-387.
Jacob Feldman & Alice Paul & Huseyin Topaloglu, 2019. "Technical Note—Assortment Optimization with Small Consideration Sets," Operations Research, INFORMS, vol. 67(5), pages 1283-1299, September.
Meng Qi & Ho‐Yin Mak & Zuo‐Jun Max Shen, 2020. "Data‐driven research in retail operations—A review," Naval Research Logistics (NRL), John Wiley & Sons, vol. 67(8), pages 595-616, December.
Jacob B. Feldman & Huseyin Topaloglu, 2017. "Revenue Management Under the Markov Chain Choice Model," Operations Research, INFORMS, vol. 65(5), pages 1322-1342, October.
Mehrani, Saharnaz & Sefair, Jorge A., 2022. "Robust assortment optimization under sequential product unavailability," European Journal of Operational Research, Elsevier, vol. 303(3), pages 1027-1043.
Flores, Alvaro & Berbeglia, Gerardo & Van Hentenryck, Pascal, 2019. "Assortment optimization under the Sequential Multinomial Logit Model," European Journal of Operational Research, Elsevier, vol. 273(3), pages 1052-1064.
Daria Dzyabura & Srikanth Jagabathula, 2018. "Offline Assortment Optimization in the Presence of an Online Channel," Management Science, INFORMS, vol. 64(6), pages 2767-2786, June.
Rui Chen & Hai Jiang, 2020. "Capacitated assortment and price optimization under the nested logit model," Journal of Global Optimization, Springer, vol. 77(4), pages 895-918, August.
Ali Aouad & Vivek Farias & Retsef Levi, 2021. "Assortment Optimization Under Consider-Then-Choose Choice Models," Management Science, INFORMS, vol. 67(6), pages 3368-3386, June.
Nathan Kallus & Madeleine Udell, 2020. "Dynamic Assortment Personalization in High Dimensions," Operations Research, INFORMS, vol. 68(4), pages 1020-1037, July.
Xiao-Yue Gong & Vineet Goyal & Garud N. Iyengar & David Simchi-Levi & Rajan Udwani & Shuangyu Wang, 2022. "Online Assortment Optimization with Reusable Resources," Management Science, INFORMS, vol. 68(7), pages 4772-4785, July.
Ali Aouad & Retsef Levi & Danny Segev, 2019. "Approximation Algorithms for Dynamic Assortment Optimization Models," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 487-511, May.
Çömez-Dolgan, Nagihan & Moussawi-Haidar, Lama & Jaber, Mohamad Y. & Cephe, Ecem, 2022. "Capacitated assortment planning of a multi-location system under transshipments," International Journal of Production Economics, Elsevier, vol. 251(C).
Çömez-Dolgan, Nagihan & Dağ, Hilal & Fescioglu-Unver, Nilgun & Şen, Alper, 2023. "Multi-plant manufacturing assortment planning in the presence of transshipments," European Journal of Operational Research, Elsevier, vol. 310(3), pages 1033-1050.
Alice Paul & Jacob Feldman & James Mario Davis, 2018. "Assortment Optimization and Pricing Under a Nonparametric Tree Choice Model," Manufacturing & Service Operations Management, INFORMS, vol. 20(3), pages 550-565, July.
Guillermo Gallego & Haengju Lee, 2020. "Callable products with dependent demands," Naval Research Logistics (NRL), John Wiley & Sons, vol. 67(3), pages 185-200, April.

More about this item

Keywords

exploration–exploitation; assortment optimization; upper confidence bound; multinomial logit
