Choice Set Confounding in Discrete Choice

Authors

  • Kiran Tomlinson
  • Johan Ugander
  • Austin R. Benson

Abstract

Standard methods in preference learning involve estimating the parameters of discrete choice models from data of selections (choices) made by individuals from a discrete set of alternatives (the choice set). While there are many models for individual preferences, existing learning methods overlook how choice set assignment affects the data. Often, the choice set itself is influenced by an individual's preferences; for instance, a consumer choosing a product from an online retailer is often presented with options from a recommender system that depend on information about the consumer's preferences. Ignoring these assignment mechanisms can mislead choice models into making biased estimates of preferences, a phenomenon that we call choice set confounding; we demonstrate the presence of such confounding in widely-used choice datasets. To address this issue, we adapt methods from causal inference to the discrete choice setting. We use covariates of the chooser for inverse probability weighting and/or regression controls, accurately recovering individual preferences in the presence of choice set confounding under certain assumptions. When such covariates are unavailable or inadequate, we develop methods that take advantage of structured choice set assignment to improve prediction. We demonstrate the effectiveness of our methods on real-world choice data, showing, for example, that accounting for choice set confounding makes choices observed in hotel booking and commute transportation more consistent with rational utility-maximization.
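
The confounding mechanism and the inverse probability weighting correction described in the abstract can be illustrated with a small exact calculation. This is a minimal sketch under stated assumptions, not the authors' implementation: the items, the two chooser types, the utilities, and the assignment probabilities are all hypothetical, each type is assumed to follow a multinomial logit model, and the assignment probabilities P(C | x) are assumed known rather than estimated from covariates.

```python
import math

# Hypothetical setup (not taken from the paper): three items, two chooser types.
TYPE_SHARE = {0: 0.5, 1: 0.5}                     # P(x), population share of each type
UTILITY = {
    0: {"a": 2.0, "b": 0.0, "c": 0.0},            # type 0 strongly prefers a
    1: {"a": 0.0, "b": 2.0, "c": 1.0},            # type 1 prefers b, then c
}
SETS = (("a", "b"), ("a", "b", "c"))              # the two possible choice sets
# Confounded assignment P(C | x): type 0 mostly sees the small set,
# type 1 mostly sees the large set.
ASSIGN = {
    0: {SETS[0]: 0.9, SETS[1]: 0.1},
    1: {SETS[0]: 0.1, SETS[1]: 0.9},
}


def mnl(x, choice_set):
    """Multinomial logit choice probabilities of type x on choice_set."""
    z = sum(math.exp(UTILITY[x][j]) for j in choice_set)
    return {i: math.exp(UTILITY[x][i]) / z for i in choice_set}


def observed(choice_set, ipw):
    """Expected choice shares on choice_set in the pooled data.

    With ipw=True, each observation from a type-x chooser is weighted
    by 1 / P(C | x), which undoes the type-dependent assignment.
    """
    shares = {i: 0.0 for i in choice_set}
    total = 0.0
    for x, type_share in TYPE_SHARE.items():
        p_set = ASSIGN[x][choice_set]
        weight = 1.0 / p_set if ipw else 1.0
        mass = type_share * p_set * weight        # weighted mass of (x, C) observations
        total += mass
        for i, p in mnl(x, choice_set).items():
            shares[i] += mass * p
    return {i: v / total for i, v in shares.items()}


def ratio_a_to_b(shares):
    """Ratio of choice shares of a to b; constant across sets under IIA."""
    return shares["a"] / shares["b"]


for label, ipw in (("naive", False), ("IPW", True)):
    for choice_set in SETS:
        r = ratio_a_to_b(observed(choice_set, ipw))
        print(f"{label:5s} set={choice_set} a:b ratio = {r:.3f}")
```

In this toy setup the naive pooled data show a large swing in the a-to-b ratio when item c is added, which looks like a strong violation of the independence of irrelevant alternatives even though every individual follows a logit model. With the weights applied, the shares equal the population-average choice probabilities and the swing is much smaller; the remaining difference comes from genuine preference heterogeneity across types, not from choice set assignment.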

Suggested Citation

  • Kiran Tomlinson & Johan Ugander & Austin R. Benson, 2021. "Choice Set Confounding in Discrete Choice," Papers 2105.07959, arXiv.org, revised Aug 2021.
  • Handle: RePEc:arx:papers:2105.07959

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2105.07959
    File Function: Latest version
    Download Restriction: no

    Citations

    Cited by:

    1. Kiran Tomlinson & Austin R. Benson, 2022. "Graph-Based Methods for Discrete Choice," Papers 2205.11365, arXiv.org, revised Nov 2023.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Karlson Pfannschmidt & Pritha Gupta & Bjorn Haddenhorst & Eyke Hullermeier, 2019. "Learning Context-Dependent Choice Functions," Papers 1901.10860, arXiv.org, revised Oct 2021.
    2. Zhexiao Lin & Fang Han, 2022. "On regression-adjusted imputation estimators of the average treatment effect," Papers 2212.05424, arXiv.org, revised Jan 2023.
    3. Arjun Seshadri & Johan Ugander, 2020. "Fundamental Limits of Testing the Independence of Irrelevant Alternatives in Discrete Choice," Papers 2001.07042, arXiv.org.
    4. Yi-Chun Chen & Velibor V. Mišić, 2022. "Decision Forest: A Nonparametric Approach to Modeling Irrational Choice," Management Science, INFORMS, vol. 68(10), pages 7090-7111, October.
    5. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    6. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    7. Paleti, Rajesh, 2018. "Generalized multinomial probit Model: Accommodating constrained random parameters," Transportation Research Part B: Methodological, Elsevier, vol. 118(C), pages 248-262.
    8. Konrad Menzel, 2021. "Structural Sieves," Papers 2112.01377, arXiv.org, revised Apr 2022.
    9. Jianhua Wang & Jiaye Ge & Yuting Ma, 2018. "Urban Chinese Consumers’ Willingness to Pay for Pork with Certified Labels: A Discrete Choice Experiment," Sustainability, MDPI, vol. 10(3), pages 1-14, February.
    10. Don Fullerton & Li Gan & Miwa Hattori, 2015. "A model to evaluate vehicle emission incentive policies in Japan," Environmental Economics and Policy Studies, Springer;Society for Environmental Economics and Policy Studies - SEEPS, vol. 17(1), pages 79-108, January.
    11. Bhat, Chandra R., 2005. "A multiple discrete-continuous extreme value model: formulation and application to discretionary time-use decisions," Transportation Research Part B: Methodological, Elsevier, vol. 39(8), pages 679-707, September.
    12. Dam, Tien Thanh & Ta, Thuy Anh & Mai, Tien, 2022. "Submodularity and local search approaches for maximum capture problems under generalized extreme value models," European Journal of Operational Research, Elsevier, vol. 300(3), pages 953-965.
    13. Villas-Boas, Sofia B & Taylor, Rebecca & Krovetz, Hannah, 2016. "Willingness to Pay for Low Water Footprint Food Choices During Drought," Department of Agricultural & Resource Economics, UC Berkeley, Working Paper Series qt9vh3x180, Department of Agricultural & Resource Economics, UC Berkeley.
    14. Peter Davis & Pasquale Schiraldi, 2014. "The flexible coefficient multinomial logit (FC-MNL) model of demand for differentiated products," RAND Journal of Economics, RAND Corporation, vol. 45(1), pages 32-63, March.
    15. Michael P. Keane & Nada Wasi, 2013. "The Structure of Consumer Taste Heterogeneity in Revealed vs. Stated Preference Data," Economics Papers 2013-W10, Economics Group, Nuffield College, University of Oxford.
    16. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    17. Siying Guo & Jianxuan Liu & Qiu Wang, 2022. "Effective Learning During COVID-19: Multilevel Covariates Matching and Propensity Score Matching," Annals of Data Science, Springer, vol. 9(5), pages 967-982, October.
    18. Graham, Bryan S. & Pinto, Cristine Campos de Xavier, 2022. "Semiparametrically efficient estimation of the average linear regression function," Journal of Econometrics, Elsevier, vol. 226(1), pages 115-138.
    19. Laura Grigolon, 2021. "Blurred boundaries: A flexible approach for segmentation applied to the car market," Quantitative Economics, Econometric Society, vol. 12(4), pages 1273-1305, November.
    20. Daina, Nicolò & Sivakumar, Aruna & Polak, John W., 2017. "Modelling electric vehicles use: a survey on the methods," Renewable and Sustainable Energy Reviews, Elsevier, vol. 68(P1), pages 447-460.
