Printed from https://ideas.repec.org/p/ces/ceswps/_10695.html

Demand Estimation with Text and Image Data

Authors

  • Giovanni Compiani
  • Ilya Morozov
  • Stephan Seiler

Abstract

We propose a demand estimation method that allows researchers to estimate substitution patterns from unstructured image and text data. We first employ a series of machine learning models to measure product similarity from products’ images and textual descriptions. We then estimate a nested logit model with product-pair specific nesting parameters that depend on the image and text similarities between products. Our framework does not require collecting product attributes for each category and can capture product similarity along dimensions that are hard to account for with observed attributes. We apply our method to a dataset describing the behavior of Amazon shoppers across several categories and show that incorporating texts and images in demand estimation helps us recover a flexible cross-price elasticity matrix.
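The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that similarity is the cosine similarity between embedding vectors, that a logistic function (with hypothetical coefficients gamma0 and gamma1) maps similarity into a pair-specific nesting parameter, and that choice probabilities follow the standard paired combinatorial logit form, in which every pair of products is a nest. The abstract does not state which exact specification the paper uses, and all names and numbers below are illustrative.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nesting_parameter(similarity, gamma0=0.0, gamma1=2.0):
    """Map a similarity score to a nesting parameter in (0, 1).

    Hypothetical functional form: with gamma1 > 0, more similar pairs
    get a smaller parameter, i.e. they are treated as closer substitutes.
    """
    return 1.0 / (1.0 + math.exp(gamma0 + gamma1 * similarity))


def pcl_probabilities(utilities, lam):
    """Paired combinatorial logit choice probabilities.

    utilities: list of mean utilities V_i.
    lam: dict mapping each pair (i, j) with i < j to its nesting parameter.
    """
    pairs = list(combinations(range(len(utilities)), 2))
    # Inner sum for each pair nest: exp(V_i / lam) + exp(V_j / lam).
    nest_sum = {
        (i, j): math.exp(utilities[i] / lam[(i, j)])
        + math.exp(utilities[j] / lam[(i, j)])
        for i, j in pairs
    }
    denom = sum(nest_sum[p] ** lam[p] for p in pairs)
    probs = []
    for i in range(len(utilities)):
        # Sum the contributions of every pair nest that contains product i.
        num = sum(
            math.exp(utilities[i] / lam[p]) * nest_sum[p] ** (lam[p] - 1.0)
            for p in pairs
            if i in p
        )
        probs.append(num / denom)
    return probs


# Made-up embeddings for three hypothetical products.
embeddings = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.3, 0.9]]
utilities = [1.0, 0.5, 0.2]
lam = {
    (i, j): nesting_parameter(cosine_similarity(embeddings[i], embeddings[j]))
    for i, j in combinations(range(len(embeddings)), 2)
}
print(pcl_probabilities(utilities, lam))
```

When every nesting parameter equals 1, these probabilities reduce to the plain multinomial logit, so the similarity-driven parameters are what allow cross-price elasticities to differ across product pairs.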

Suggested Citation

  • Giovanni Compiani & Ilya Morozov & Stephan Seiler, 2023. "Demand Estimation with Text and Image Data," CESifo Working Paper Series 10695, CESifo.
  • Handle: RePEc:ces:ceswps:_10695

    Download full text from publisher

    File URL: https://www.cesifo.org/DocDL/cesifo1_wp10695.pdf
    Download Restriction: no

    Citations

    Cited by:

    1. Laura Battaglia & Timothy Christensen & Stephen Hansen & Szymon Sacher, 2024. "Inference for Regression with Variables Generated from Unstructured Data," Papers 2402.15585, arXiv.org, revised Mar 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paleti, Rajesh, 2018. "Generalized multinomial probit Model: Accommodating constrained random parameters," Transportation Research Part B: Methodological, Elsevier, vol. 118(C), pages 248-262.
    2. Lu, Xiao-Yun & Gosling, Geoffrey D. & Shladover, Steven E. & Xiong, Jing & Ceder, Avi, 2006. "Development of a Modeling Framework for Analyzing Improvements in Intermodal Connectivity at California Airports," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt586755r9, Institute of Transportation Studies, UC Berkeley.
    3. Dam, Tien Thanh & Ta, Thuy Anh & Mai, Tien, 2022. "Submodularity and local search approaches for maximum capture problems under generalized extreme value models," European Journal of Operational Research, Elsevier, vol. 300(3), pages 953-965.
    4. Harris, Mark N. & Ramful, Preety & Zhao, Xueyan, 2006. "An ordered generalised extreme value model with application to alcohol consumption in Australia," Journal of Health Economics, Elsevier, vol. 25(4), pages 782-801, July.
    5. Marzano, Vittorio & Papola, Andrea, 2008. "On the covariance structure of the Cross-Nested Logit model," Transportation Research Part B: Methodological, Elsevier, vol. 42(2), pages 83-98, February.
    6. Kai-Lung Hui, 2004. "Product Variety Under Brand Influence: An Empirical Investigation of Personal Computer Demand," Management Science, INFORMS, vol. 50(5), pages 686-700, May.
    7. Peter Davis & Pasquale Schiraldi, 2014. "The flexible coefficient multinomial logit (FC-MNL) model of demand for differentiated products," RAND Journal of Economics, RAND Corporation, vol. 45(1), pages 32-63, March.
    8. Emerson Melo, 2021. "Learning in Random Utility Models Via Online Decision Problems," Papers 2112.10993, arXiv.org, revised Aug 2022.
    9. Pereira, Pedro & Ribeiro, Tiago & Vareda, João, 2013. "Delineating markets for bundles with consumer level data: The case of triple-play," International Journal of Industrial Organization, Elsevier, vol. 31(6), pages 760-773.
    10. Newman, Jeffrey P. & Lurkin, Virginie & Garrow, Laurie A., 2018. "Computational methods for estimating multinomial, nested, and cross-nested logit models that account for semi-aggregate data," Journal of Choice Modelling, Elsevier, vol. 26(C), pages 28-40.
    11. David Dale & Andrei Sirchenko, 2021. "Estimation of nested and zero-inflated ordered probit models," Stata Journal, StataCorp LP, vol. 21(1), pages 3-38, March.
    12. Ghader, Sepehr & Carrion, Carlos & Zhang, Lei, 2019. "Autoregressive continuous logit: Formulation and application to time-of-day choice modeling," Transportation Research Part B: Methodological, Elsevier, vol. 123(C), pages 240-257.
    13. Martin, Elliott William, 2009. "New Vehicle Choice, Fuel Economy and Vehicle Incentives: An Analysis of Hybrid Tax Credits and the Gasoline Tax," University of California Transportation Center, Working Papers qt5gd206wv, University of California Transportation Center.
    14. Laura Grigolon, 2021. "Blurred boundaries: A flexible approach for segmentation applied to the car market," Quantitative Economics, Econometric Society, vol. 12(4), pages 1273-1305, November.
    15. José-Benito Pérez-López & Margarita Novales & Francisco-Alberto Varela-García & Alfonso Orro, 2020. "Residential Location Econometric Choice Modeling with Irregular Zoning: Common Border Spatial Correlation Metric," Networks and Spatial Economics, Springer, vol. 20(3), pages 785-802, September.
    16. Andrew Daly & Stephane Hess & Geoff Hyman & John Polak & Charlene Rohr, 2005. "Modelling departure time and mode choice," ERSA conference papers ersa05p688, European Regional Science Association.
    17. Mogens Fosgerau & Julien Monardo & André de Palma, 2019. "The Inverse Product Differentiation Logit Model," Working Papers hal-02183411, HAL.
    18. Timothy F. Bresnahan & Scott Stern & Manuel Trajtenberg, 1995. "Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980s," Working Papers 95001, Stanford University, Department of Economics.
    19. Tinessa, Fiore & Marzano, Vittorio & Papola, Andrea, 2020. "Mixing distributions of tastes with a Combination of Nested Logit (CoNL) kernel: Formulation and performance analysis," Transportation Research Part B: Methodological, Elsevier, vol. 141(C), pages 1-23.
    20. Joao Macieira & Pedro Pereira & Joao Vareda, 2013. "Bundling Incentives in Markets with Product Complementarities: The Case of Triple-Play," Working Papers 13-15, NET Institute.

    More about this item

    Keywords

    demand estimation; unstructured data; computer vision; text models;

    JEL classification:

    • C10 - Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > General
    • C50 - Mathematical and Quantitative Methods > Econometric Modeling > General
    • C81 - Mathematical and Quantitative Methods > Data Collection and Data Estimation Methodology; Computer Programs > Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
