
RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation

Authors
  • Yu Xia
  • Ali Arian
  • Sriram Narayanamoorthy
  • Joshua Mabry

Abstract

Significant research effort has been devoted in recent years to developing personalized pricing, promotions, and product recommendation algorithms that can leverage rich customer data to learn and earn. Systematic benchmarking and evaluation of these causal learning systems remains a critical challenge because suitable datasets and simulation environments are lacking. In this work, we propose a multi-stage model for simulating customer shopping behavior that captures important sources of heterogeneity, including price sensitivity and past experiences. We embed this model in a working simulation environment, RetailSynth, and carefully calibrate it on publicly available grocery data to create realistic synthetic shopping transactions. We implement multiple pricing policies within the simulator and analyze their impact on revenue, category penetration, and customer retention. Applied researchers can use RetailSynth to validate causal demand models for multi-category retail and to incorporate realistic price sensitivity into emerging benchmarking suites for personalized pricing, promotions, and product recommendations.
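
As a rough illustration of what a multi-stage shopping simulation with heterogeneous price sensitivity can look like, the following Python sketch walks each customer through a store-visit stage, a purchase stage, and a quantity stage. It is not the RetailSynth model: the function name `simulate`, the three-stage split, the lognormal price sensitivity, the logit purchase rule, and every numeric constant are assumptions chosen only to make the idea concrete.

```python
import math
import random


def simulate(price, n_customers=500, n_weeks=12, seed=0):
    """Toy three-stage shopper simulation; returns (revenue, units_sold).

    Hypothetical sketch only; all distributions and constants are assumed.
    """
    rng = random.Random(seed)
    revenue, units = 0.0, 0
    for _ in range(n_customers):
        # Heterogeneity: each customer draws their own price sensitivity.
        beta = rng.lognormvariate(0.0, 0.5)
        bought_last_week = False
        for _ in range(n_weeks):
            # Draw every uniform each week so that runs at different prices
            # share the same random numbers for a given seed.
            u_visit, u_buy, u_qty = rng.random(), rng.random(), rng.random()
            # Stage 1: store visit; a purchase last week (past experience)
            # raises the visit probability.
            p_visit = 0.7 if bought_last_week else 0.4
            # Stage 2: purchase decision, a logit in price.
            p_buy = 1.0 / (1.0 + math.exp(-(2.0 - beta * price)))
            bought = u_visit < p_visit and u_buy < p_buy
            if bought:
                # Stage 3: purchase quantity of one or two units.
                qty = 2 if u_qty < 0.3 else 1
                units += qty
                revenue += price * qty
            bought_last_week = bought
    return revenue, units


# Compare a few uniform pricing policies on the same simulated customers.
for price in (1.0, 2.0, 3.0):
    revenue, units = simulate(price)
    print(price, round(revenue, 2), units)
```

Because every run with the same seed reuses the same random draws, differences in revenue and units across prices in this sketch come from the price change alone rather than from sampling noise, which is one simple way to compare pricing policies inside a simulator.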

Suggested Citation

  • Yu Xia & Ali Arian & Sriram Narayanamoorthy & Joshua Mabry, 2023. "RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation," Papers 2312.14095, arXiv.org.
  • Handle: RePEc:arx:papers:2312.14095

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2312.14095
    File Function: Latest version
    Download Restriction: no


