
Comparing unsupervised probabilistic machine learning methods for market basket analysis

Author

Listed:
  • Harald Hruschka

    (University of Regensburg)

Abstract

We compare several unsupervised probabilistic machine learning methods for market basket analysis, namely binary factor analysis, two topic models (latent Dirichlet allocation and the correlated topic model), the restricted Boltzmann machine and the deep belief net. After an overview of previous applications of unsupervised probabilistic machine learning methods to market basket analysis, we briefly present the methods we investigate and outline their estimation. Performance is measured by tenfold cross-validated log likelihood values. Binary factor analysis vastly outperforms topic models. The restricted Boltzmann machine attains a similar performance advantage over binary factor analysis. Overall, a deep belief net with 45 variables in the first and 15 variables in the second hidden layer turns out to be the best model. We also compare the investigated machine learning methods with respect to ease of interpretation and runtimes. In addition, we show how to interpret the relationships between hidden variables and observed category purchases. To demonstrate managerial implications, we estimate the effect of promoting each category on both the purchase probability increases of other product categories and the relative increase of basket size. Finally, we indicate several possibilities to extend restricted Boltzmann machines and deep belief nets for market basket analysis.
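
The abstract's performance measure, a held-out log likelihood, can be illustrated for a small binary restricted Boltzmann machine. The sketch below is an assumption-laden toy, not the paper's estimation procedure, which this record does not describe. The names `a`, `b`, `W` (visible biases, hidden biases, weights) and all function names are hypothetical, and the normalizing constant is computed by brute-force enumeration, which is feasible only for a handful of product categories.

```python
import itertools
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def unnormalized_log_prob(v, a, b, W):
    """Log of the sum over hidden states of exp(-energy) for basket v."""
    score = sum(a_j * v_j for a_j, v_j in zip(a, v))
    for k, b_k in enumerate(b):
        score += softplus(b_k + sum(W[j][k] * v[j] for j in range(len(v))))
    return score


def log_partition(a, b, W):
    """Exact log normalizing constant, enumerating all 2**J baskets."""
    terms = [unnormalized_log_prob(v, a, b, W)
             for v in itertools.product((0, 1), repeat=len(a))]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_likelihood(baskets, a, b, W):
    """Sum of log p(v) over the given baskets."""
    log_z = log_partition(a, b, W)
    return sum(unnormalized_log_prob(v, a, b, W) - log_z for v in baskets)


def tenfold_splits(n, folds=10):
    """Yield (train, test) index lists; fold f holds out baskets f, f + folds, ..."""
    for f in range(folds):
        test = list(range(f, n, folds))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test
```

In a cross-validation loop, parameters would be estimated on each `train` subset (estimation is not shown here) and `log_likelihood` evaluated on the matching `test` subset, with the ten held-out values then summed or averaged.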

Suggested Citation

  • Harald Hruschka, 2021. "Comparing unsupervised probabilistic machine learning methods for market basket analysis," Review of Managerial Science, Springer, vol. 15(2), pages 497-527, February.
  • Handle: RePEc:spr:rvmgts:v:15:y:2021:i:2:d:10.1007_s11846-019-00349-0
    DOI: 10.1007/s11846-019-00349-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11846-019-00349-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Citations

    Cited by:

    1. Mariflor Vega Carrasco & Ioanna Manolopoulou & Jason O'Sullivan & Rosie Prior & Mirco Musolesi, 2022. "Posterior summaries of grocery retail topic models: Evaluation, interpretability and credibility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 562-588, June.
    2. Harald Hruschka, 2022. "Analyzing joint brand purchases by conditional restricted Boltzmann machines," Review of Managerial Science, Springer, vol. 16(4), pages 1117-1145, May.
    3. Andreas Falke & Harald Hruschka, 2022. "Analyzing browsing across websites by machine learning methods," Journal of Business Economics, Springer, vol. 92(5), pages 829-852, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Harald Hruschka, 2022. "Analyzing joint brand purchases by conditional restricted Boltzmann machines," Review of Managerial Science, Springer, vol. 16(4), pages 1117-1145, May.
    2. Hruschka, Harald, 2016. "Hidden Variable Models for Market Basket Data. Statistical Performance and Managerial Implications," University of Regensburg Working Papers in Business, Economics and Management Information Systems 489, University of Regensburg, Department of Economics.
    3. Harald Hruschka, 2017. "Multi-category purchase incidences with marketing cross effects," Review of Managerial Science, Springer, vol. 11(2), pages 443-469, March.
    4. Björn Andersson & Tao Xin, 2021. "Estimation of Latent Regression Item Response Theory Models Using a Second-Order Laplace Approximation," Journal of Educational and Behavioral Statistics, vol. 46(2), pages 244-265, April.
    5. Christopher J. Urban & Daniel J. Bauer, 2021. "A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 1-29, March.
    6. Katrin Dippold & Harald Hruschka, 2013. "Variable selection for market basket analysis," Computational Statistics, Springer, vol. 28(2), pages 519-539, April.
    7. Andreas Falke & Harald Hruschka, 2022. "Analyzing browsing across websites by machine learning methods," Journal of Business Economics, Springer, vol. 92(5), pages 829-852, July.
    8. Justyna Klejdysz & Robin L. Lumsdaine, 2023. "Shifts in ECB Communication: A Textual Analysis of the Press Conference," International Journal of Central Banking, International Journal of Central Banking, vol. 19(2), pages 473-542, June.
    9. Dippold Katrin & Hruschka Harald, 2013. "A Model of Heterogeneous Multicategory Choice for Market Basket Analysis," Review of Marketing Science, De Gruyter, vol. 11(1), pages 1-31, September.
    10. Zhehan Jiang & Jonathan Templin, 2019. "Gibbs Samplers for Logistic Item Response Models via the Pólya–Gamma Distribution: A Computationally Efficient Data-Augmentation Strategy," Psychometrika, Springer;The Psychometric Society, vol. 84(2), pages 358-374, June.
    11. Schröder, Nadine & Falke, Andreas & Hruschka, Harald & Reutterer, Thomas, 2019. "Analyzing the Browsing Basket: A Latent Interests-Based Segmentation Tool," Journal of Interactive Marketing, Elsevier, vol. 47(C), pages 181-197.
    12. Chun Wang, 2015. "On Latent Trait Estimation in Multidimensional Compensatory Item Response Models," Psychometrika, Springer;The Psychometric Society, vol. 80(2), pages 428-449, June.
    13. Yoav Bergner & Peter Halpin & Jill-Jênn Vie, 2022. "Multidimensional Item Response Theory in the Style of Collaborative Filtering," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 266-288, March.
    14. Yang Liu & Jan Hannig, 2017. "Generalized Fiducial Inference for Logistic Graded Response Models," Psychometrika, Springer;The Psychometric Society, vol. 82(4), pages 1097-1125, December.
    15. Dippold, Katrin & Hruschka, Harald, 2010. "Variable Selection for Market Basket Analysis," University of Regensburg Working Papers in Business, Economics and Management Information Systems 443, University of Regensburg, Department of Economics.
    16. Harald Hruschka, 2017. "Analyzing the dependences of multi-category purchases on interactions of marketing variables," Journal of Business Economics, Springer, vol. 87(3), pages 295-313, April.
    17. Yunxiao Chen & Xiaoou Li & Siliang Zhang, 2019. "Joint Maximum Likelihood Estimation for High-Dimensional Exploratory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 84(1), pages 124-146, March.
    18. Ting Wang & Benjamin Graves & Yves Rosseel & Edgar C. Merkle, 2022. "Computation and application of generalized linear mixed model derivatives using lme4," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 1173-1193, September.
    19. Martin Reisenbichler & Thomas Reutterer, 2019. "Topic modeling in marketing: recent advances and research opportunities," Journal of Business Economics, Springer, vol. 89(3), pages 327-356, April.
    20. Keh, Hean Tat & Chu, Singfat, 2003. "Retail productivity and scale economies at the firm level: a DEA approach," Omega, Elsevier, vol. 31(2), pages 75-82, April.

    More about this item

    Keywords

    Machine learning; Market basket analysis; Factor analysis; Topic models; Restricted Boltzmann machine; Deep learning;

    JEL classification:

    • M31 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Marketing and Advertising - - - Marketing
    • L81 - Industrial Organization - - Industry Studies: Services - - - Retail and Wholesale Trade; e-Commerce
    • D12 - Microeconomics - - Household Behavior - - - Consumer Economics: Empirical Analysis
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C89 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other
