Data pruning in consumer choice models

Authors

  • Elaine Zanutto
  • Eric Bradlow

Abstract

A common, if not ubiquitous, practice in marketing when estimating models for scanner panel data is to (a) observe the data, (b) prune the data to a “manageable” number of brands or SKUs, and (c) fit models to the remaining data. We demonstrate that such pruning can lead to significantly different (and potentially biased) elasticities, and hence to different managerial and practical outcomes, especially when the model is misspecified. We first justify our claims theoretically by casting the general problem in a classic missing-data framework and showing that commonly used pruning mechanisms (gleaned from the current academic marketing literature) can lead to a nonignorable missing-data mechanism. Second, we summarize an extensive set of simulations run to understand the factors driving that bias. The results indicate much greater pruning bias when model fit is poor (small $R^{2}$), when random utility errors are correlated with the covariates, or when the model is misspecified (e.g., a homogeneous logit is specified when a mixed logit is true). Empirically, we also demonstrate our findings on the well-cited and widely used fabric softener data of Fader and Hardie (1996). Our empirical findings show that a number of estimates vary according to how the data are pruned, including the magnitudes of market mix and attribute elasticities and the purchase probabilities, but that the pruning effect is smaller for better-fitting models. Copyright Springer Science + Business Media, LLC 2006
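
The interaction between pruning and misspecification described in the abstract can be illustrated with a small sketch. This is not the authors' simulation design: the three brands, the price grid, the segment weights, and the price coefficients below are all hypothetical values chosen for illustration, and the helper names are invented here. The sketch fits a homogeneous logit with a single price coefficient, first to all brands and then after pruning purchases of the third (discount) brand, once when the homogeneous logit is true and once when the truth is a two-segment mixture. To avoid sampling noise it works with exact population shares rather than simulated purchases.

```python
import math
from itertools import product


def logit_probs(beta, prices):
    """Multinomial logit choice probabilities with utility beta * price."""
    weights = [math.exp(beta * p) for p in prices]
    total = sum(weights)
    return [w / total for w in weights]


def true_shares(segments, prices):
    """Population shares under a mixture of logits.

    segments is a list of (segment weight, price coefficient) pairs.
    """
    shares = [0.0] * len(prices)
    for weight, beta in segments:
        for j, p in enumerate(logit_probs(beta, prices)):
            shares[j] += weight * p
    return shares


def fit_homogeneous_logit(scenarios, segments, kept):
    """Population MLE of one price coefficient, using only purchases of the
    brands in `kept` (pruning = discarding purchases of every other brand)."""

    def score(beta):
        # Derivative of the expected log-likelihood with respect to beta.
        g = 0.0
        for prices in scenarios:
            shares = true_shares(segments, prices)
            kept_prices = [prices[j] for j in kept]
            kept_shares = [shares[j] for j in kept]
            model = logit_probs(beta, kept_prices)
            observed = sum(s * p for s, p in zip(kept_shares, kept_prices))
            expected = sum(kept_shares) * sum(
                m * p for m, p in zip(model, kept_prices)
            )
            g += observed - expected
        return g

    # The score is decreasing in beta, so bisection finds its root.
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical setup: brands 0 and 1 priced at 1.0 or 2.0, brand 2 is a
# discount brand always priced at 0.5.
scenarios = [(p1, p2, 0.5) for p1, p2 in product([1.0, 2.0], repeat=2)]
homogeneous = [(1.0, -2.0)]           # homogeneous logit is the truth
mixed = [(0.5, -1.0), (0.5, -5.0)]    # two segments: truth is a mixed logit
all_brands, pruned = [0, 1, 2], [0, 1]

full_ok = fit_homogeneous_logit(scenarios, homogeneous, all_brands)
pruned_ok = fit_homogeneous_logit(scenarios, homogeneous, pruned)
full_mix = fit_homogeneous_logit(scenarios, mixed, all_brands)
pruned_mix = fit_homogeneous_logit(scenarios, mixed, pruned)

print("correctly specified: full %.3f, pruned %.3f" % (full_ok, pruned_ok))
print("misspecified:        full %.3f, pruned %.3f" % (full_mix, pruned_mix))
```

When the homogeneous logit is true, the independence of irrelevant alternatives property means the pruned data still follow a logit with the same coefficient, so both fits recover the true value. Under the mixture, the price-sensitive segment buys the discount brand disproportionately, so pruning that brand removes mostly price-sensitive purchases and the pruned fit implies weaker price sensitivity than the full fit.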

Suggested Citation

  • Elaine Zanutto & Eric Bradlow, 2006. "Data pruning in consumer choice models," Quantitative Marketing and Economics (QME), Springer, vol. 4(3), pages 267-287, September.
  • Handle: RePEc:kap:qmktec:v:4:y:2006:i:3:p:267-287
    DOI: 10.1007/s11129-005-9000-y


    Citations

    Cited by:

    1. Sudhir Voleti & Pulak Ghosh, 2013. "A robust approach to measure latent, time-varying equity in hierarchical branding structures," Quantitative Marketing and Economics (QME), Springer, vol. 11(3), pages 289-319, September.
    2. Longxiu Tian & Fred M. Feinberg, 2020. "Optimizing Price Menus for Duration Discounts: A Subscription Selectivity Field Experiment," Marketing Science, INFORMS, vol. 39(6), pages 1181-1198, November.
    3. Song Yao & Carl F. Mela, 2011. "A Dynamic Model of Sponsored Search Advertising," Marketing Science, INFORMS, vol. 30(3), pages 447-468, 05-06.
    4. Christophe Van den Bulte & Raghuram Iyengar, 2011. "Tricked by Truncation: Spurious Duration Dependence and Social Contagion in Hazard Models," Marketing Science, INFORMS, vol. 30(2), pages 233-248, 03-04.
    5. Olga Novikova & Dmitriy B. Potapov, 2015. "Empirical Analysis of Consumer Purchase Behavior: Interaction between State Dependence and Sensitivity to Marketing-Mix Variables," HSE Working papers WP BRP 48/MAN/2015, National Research University Higher School of Economics.
    6. Oliver J. Rutz & Michael Trusov & Randolph E. Bucklin, 2011. "Modeling Indirect Effects of Paid Search Advertising: Which Keywords Lead to More Future Visits?," Marketing Science, INFORMS, vol. 30(4), pages 646-665, July.
    7. Bruno J.D. Jacobs & Bas Donkers & Dennis Fok, 2016. "Model-Based Purchase Predictions for Large Assortments," Marketing Science, INFORMS, vol. 35(3), pages 389-404, May.
    8. Harald J. van Heerde & Shuba Srinivasan & Marnik G. Dekimpe, 2010. "Estimating Cannibalization Rates for Pioneering Innovations," Marketing Science, INFORMS, vol. 29(6), pages 1024-1039, 11-12.
    9. Gijsenberg, Maarten J. & Nijs, Vincent R., 2019. "Advertising spending patterns and competitor impact," International Journal of Research in Marketing, Elsevier, vol. 36(2), pages 232-250.
    10. Nibbering, Didier & Hastie, Trevor J., 2022. "Multiclass-penalized logistic regression," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    11. Qing Liu & Thomas Otter & Greg M. Allenby, 2007. "Investigating Endogeneity Bias in Marketing," Marketing Science, INFORMS, vol. 26(5), pages 642-650, 09-10.
