
Data pruning in consumer choice models

Author

Listed:
  • Elaine Zanutto
  • Eric Bradlow

Abstract

Common, if not ubiquitous, marketing practice when estimating models for scanner panel data is to: (a) observe the data, (b) prune the data to a “manageable” number of brands or SKUs, and (c) fit models to the remaining data. We demonstrate that such pruning can lead to significantly different (and potentially biased) elasticities, and hence different managerial/practical outcomes, especially in the context of model misspecification. First, we justify our claims theoretically by writing the general problem in a classic missing-data framework and demonstrate that commonly used pruning mechanisms (gleaned from the current academic marketing literature) can lead to a nonignorable missing-data mechanism. Second, we summarize an extensive set of simulations that were run to understand the driving factors of that bias. The results indicate much greater pruning bias in those cases where model fit is poor (small $R^{2}$), random utility errors are correlated with the covariates, or the model is misspecified (e.g., a homogeneous logit is specified when a mixed logit is true). Empirically, we also demonstrate our findings on the well-cited and highly utilized fabric softener data of Fader and Hardie (1996). Our empirical findings suggest a number of estimates that vary according to the way in which the data is pruned, including the magnitude of market mix and attribute elasticities and purchase probabilities, but that the pruning effect is smaller for better-fitting models.

Copyright Springer Science + Business Media, LLC 2006
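The mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design; every name, parameter value, and the pruning rule (keep the three highest-share brands and drop purchase occasions of all other brands) is an assumption made for illustration. It simulates brand choices with a price coefficient that is either common to all purchase occasions (homogeneous logit) or normally distributed across them (mixed logit), fits a homogeneous conditional logit by Newton's method, and compares the price coefficient estimated on the full data with the one estimated on the pruned data. When the homogeneous logit is the true model, the independence-of-irrelevant-alternatives property implies that both fits target the same coefficient; when the true model is a mixed logit, that guarantee no longer holds and the two estimates may differ.

```python
import numpy as np

def simulate_choices(n_obs, intercepts, beta_mean, beta_sd, rng):
    """Random-utility choices; beta_sd > 0 gives a mixed logit (assumed design)."""
    n_brands = len(intercepts)
    price = rng.uniform(1.0, 3.0, size=(n_obs, n_brands))
    beta = rng.normal(beta_mean, beta_sd, size=(n_obs, 1))  # per-occasion price coefficient
    utility = intercepts + beta * price + rng.gumbel(size=(n_obs, n_brands))
    return price, utility.argmax(axis=1)

def _loglik_and_probs(theta, X, choice):
    v = X @ theta
    v = v - v.max(axis=1, keepdims=True)  # stabilise the softmax
    logp = v - np.log(np.exp(v).sum(axis=1, keepdims=True))
    return logp[np.arange(len(choice)), choice].sum(), np.exp(logp)

def fit_logit(price, choice, n_iter=50):
    """Homogeneous conditional logit: brand intercepts (first brand = 0) plus price."""
    n_obs, n_brands = price.shape
    dummies = np.broadcast_to(np.eye(n_brands)[:, 1:], (n_obs, n_brands, n_brands - 1))
    X = np.concatenate([dummies, price[:, :, None]], axis=2)
    theta = np.zeros(X.shape[2])
    ll, p = _loglik_and_probs(theta, X, choice)
    chosen = X[np.arange(n_obs), choice]
    for _ in range(n_iter):
        xbar = np.einsum("nj,njp->np", p, X)
        grad = (chosen - xbar).sum(axis=0)
        # Negative Hessian of the log-likelihood
        hess = np.einsum("nj,njp,njq->pq", p, X, X) - xbar.T @ xbar
        step = np.linalg.solve(hess, grad)
        scale = 1.0
        while True:  # step-halving safeguard
            new_ll, new_p = _loglik_and_probs(theta + scale * step, X, choice)
            if new_ll >= ll - 1e-9 or scale < 1e-4:
                break
            scale /= 2
        theta, ll, p = theta + scale * step, new_ll, new_p
        if np.max(np.abs(scale * step)) < 1e-8:
            break
    return theta

def prune(price, choice, keep):
    """Keep only the brands in `keep`; drop occasions where another brand was bought."""
    keep = np.sort(keep)
    mask = np.isin(choice, keep)
    return price[mask][:, keep], np.searchsorted(keep, choice[mask])

def compare(beta_sd, seed=0, n_obs=20000, n_keep=3):
    """Return (full-data, pruned-data) price coefficients from a homogeneous logit."""
    rng = np.random.default_rng(seed)
    intercepts = np.linspace(1.5, -1.5, 8)
    price, choice = simulate_choices(n_obs, intercepts, -2.0, beta_sd, rng)
    shares = np.bincount(choice, minlength=len(intercepts))
    keep = np.argsort(shares)[-n_keep:]  # highest-share brands (assumed pruning rule)
    full = fit_logit(price, choice)[-1]
    pruned = fit_logit(*prune(price, choice, keep))[-1]
    return full, pruned

if __name__ == "__main__":
    for label, sd in [("homogeneous logit is true", 0.0), ("mixed logit is true", 1.5)]:
        full, pruned = compare(beta_sd=sd)
        print(f"{label}: full-data price coef = {full:.3f}, pruned = {pruned:.3f}")
```

Comparing the two printed rows gives a rough sense of how much the pruned-data estimate moves once the fitted model is misspecified. The sketch varies only one of the factors named in the abstract; poor model fit and correlation between utility errors and covariates are not simulated here.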

Suggested Citation

  • Elaine Zanutto & Eric Bradlow, 2006. "Data pruning in consumer choice models," Quantitative Marketing and Economics (QME), Springer, vol. 4(3), pages 267-287, September.
  • Handle: RePEc:kap:qmktec:v:4:y:2006:i:3:p:267-287
    DOI: 10.1007/s11129-005-9000-y

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s11129-005-9000-y
    Download Restriction: Access to full text is restricted to subscribers.



    Citations


    Cited by:

    1. Christophe Van den Bulte & Raghuram Iyengar, 2011. "Tricked by Truncation: Spurious Duration Dependence and Social Contagion in Hazard Models," Marketing Science, INFORMS, vol. 30(2), pages 233-248, March-April.
    2. Qing Liu & Thomas Otter & Greg M. Allenby, 2007. "Investigating Endogeneity Bias in Marketing," Marketing Science, INFORMS, vol. 26(5), pages 642-650, September-October.
    3. Sudhir Voleti & Pulak Ghosh, 2013. "A robust approach to measure latent, time-varying equity in hierarchical branding structures," Quantitative Marketing and Economics (QME), Springer, vol. 11(3), pages 289-319, September.
    4. Longxiu Tian & Fred M. Feinberg, 2020. "Optimizing Price Menus for Duration Discounts: A Subscription Selectivity Field Experiment," Marketing Science, INFORMS, vol. 39(6), pages 1181-1198, November.
    5. Song Yao & Carl F. Mela, 2011. "A Dynamic Model of Sponsored Search Advertising," Marketing Science, INFORMS, vol. 30(3), pages 447-468, May-June.
    6. Olga Novikova & Dmitriy B. Potapov, 2015. "Empirical Analysis of Consumer Purchase Behavior: Interaction between State Dependence and Sensitivity to Marketing-Mix Variables," HSE Working papers WP BRP 48/MAN/2015, National Research University Higher School of Economics.
    7. Oliver J. Rutz & Michael Trusov & Randolph E. Bucklin, 2011. "Modeling Indirect Effects of Paid Search Advertising: Which Keywords Lead to More Future Visits?," Marketing Science, INFORMS, vol. 30(4), pages 646-665, July.
    8. Bruno J.D. Jacobs & Bas Donkers & Dennis Fok, 2016. "Model-Based Purchase Predictions for Large Assortments," Marketing Science, INFORMS, vol. 35(3), pages 389-404, May.
    9. Harald J. van Heerde & Shuba Srinivasan & Marnik G. Dekimpe, 2010. "Estimating Cannibalization Rates for Pioneering Innovations," Marketing Science, INFORMS, vol. 29(6), pages 1024-1039, November-December.
    10. Gijsenberg, Maarten J. & Nijs, Vincent R., 2019. "Advertising spending patterns and competitor impact," International Journal of Research in Marketing, Elsevier, vol. 36(2), pages 232-250.
    11. Nibbering, Didier & Hastie, Trevor J., 2022. "Multiclass-penalized logistic regression," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. James J. Heckman, 2008. "Econometric Causality," International Statistical Review, International Statistical Institute, vol. 76(1), pages 1-27, April.
    2. Bernhard Baumgartner & Daniel Guhl & Thomas Kneib & Winfried J. Steiner, 2018. "Flexible estimation of time-varying effects for frequently purchased retail goods: a modeling approach based on household panel data," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 40(4), pages 837-873, October.
    3. Martin Huber, 2012. "Identification of Average Treatment Effects in Social Experiments Under Alternative Forms of Attrition," Journal of Educational and Behavioral Statistics, vol. 37(3), pages 443-474, June.
    4. Yasser Razak Hussain & Pranab Mukhopadhyay, 2023. "How Much do Education, Experience, and Social Networks Impact Earnings in India? A Panel Data Analysis Disaggregated by Class, Gender, Caste and Religion," SAGE Open, vol. 13(4), pages 21582440231, December.
    5. E. Michael Foster & Grace Y. Fang, 2004. "Alternative Methods for Handling Attrition," Evaluation Review, vol. 28(5), pages 434-464, October.
    6. Katrin Hussinger, 2012. "Absorptive capacity and post-acquisition inventor productivity," The Journal of Technology Transfer, Springer, vol. 37(4), pages 490-507, August.
    7. William Goetzmann & Liang Peng, 2003. "Estimating Indices in the Presence of Seller Reservation Prices," Yale School of Management Working Papers ysm352, Yale School of Management, revised 01 May 2003.
    8. Arthur Lewbel, 2005. "Simple Endogenous Binary Choice and Selection Panel Model Estimators," Boston College Working Papers in Economics 613, Boston College Department of Economics, revised 04 Sep 2006.
    9. Gayle, George-Levi & Viauroux, Christelle, 2007. "Root-N consistent semiparametric estimators of a dynamic panel-sample-selection model," Journal of Econometrics, Elsevier, vol. 141(1), pages 179-212, November.
    10. Peter Hull & Michal Kolesár & Christopher Walters, 2022. "Labor by design: contributions of David Card, Joshua Angrist, and Guido Imbens," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 603-645, July.
    11. Martin Huber & Giovanni Mellace, 2014. "Testing exclusion restrictions and additive separability in sample selection models," Empirical Economics, Springer, vol. 47(1), pages 75-92, August.
    12. Angrist, Joshua D., 1997. "Conditional independence in sample selection models," Economics Letters, Elsevier, vol. 54(2), pages 103-112, February.
    13. Joseph Clougherty & Tomaso Duso, 2015. "Correcting for Self-Selection Based Endogeneity in Management Research: A Review and Empirical Demonstration," Discussion Papers of DIW Berlin 1465, DIW Berlin, German Institute for Economic Research.
    14. Lewbel, Arthur, 2007. "Endogenous selection or treatment model estimation," Journal of Econometrics, Elsevier, vol. 141(2), pages 777-806, December.
    15. Gabriel Montes-Rojas, 2011. "Robust Misspecification Tests for the Heckman's Two-Step Estimator," Econometric Reviews, Taylor & Francis Journals, vol. 30(2), pages 154-172.
    16. Ham, John C. & Kagel, John H. & Lehrer, Steven F., 2005. "Randomization, endogeneity and laboratory experiments: the role of cash balances in private value auctions," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 175-205.
    17. John K. Dagsvik & Zhiyang Jia & Tom Kornstad & Thor O. Thoresen, 2014. "Theoretical And Practical Arguments For Modeling Labor Supply As A Choice Among Latent Jobs," Journal of Economic Surveys, Wiley Blackwell, vol. 28(1), pages 134-151, February.
    18. Olson, Kent D. & Elisabeth, Pascal, 2003. "An Economic Assessment Of The Whole-Farm Impact Of Precision Agriculture," 2003 Annual meeting, July 27-30, Montreal, Canada 22119, American Agricultural Economics Association (New Name 2008: Agricultural and Applied Economics Association).
    19. Golder, Stefan M., 2000. "Endowment or Discrimination? An Analysis of Immigrant-Native Earnings Differentials in Switzerland," Kiel Working Papers 967, Kiel Institute for the World Economy (IfW Kiel).
    20. Andreas Tsakiridis & Michael Wallace & James Breen & Cathal O'Donoghue & Kevin Hanrahan, 2021. "Beef quality assurance schemes: Can they improve farm economic performance?," Agribusiness, John Wiley & Sons, Ltd., vol. 37(3), pages 451-471, July.
