Massively Categorical Variables: Revealing the Information in Zip Codes

Author

Listed:

Thomas J. Steenburgh
(Yale University, New Haven, Connecticut 06520)
Andrew Ainslie
(University of California, Los Angeles, Los Angeles, California 90095)
Peder Hans Engebretson
(ClearInfo, Denver, Colorado)

Abstract

We introduce the idea of a massively categorical variable, a variable such as zip code that takes on too many values to treat in the standard manner. We show how to use a massively categorical variable directly as an explanatory variable. As an application of this concept, we explore several of the issues that analysts confront when trying to develop a direct marketing campaign. We begin by pointing out that the data contained in many of the common sources are masked through aggregation in order to protect consumer privacy. This creates some difficulty when trying to construct models of individual-level behavior. We show how to take full advantage of such data through a hierarchical Bayesian variance components (HBVC) model. The flexibility of our approach allows us to combine several sources of information, some of which may not be aggregated, in a coherent manner. We show that the conventional modeling practice understates the uncertainty with regard to its parameter values. We explore an array of financial considerations, including ones in which the marginal benefit is non-linear, to make robust model comparisons. To implement the decision rules that determine the optimal number of prospects to contact, we develop an algorithm based on the Markov chain Monte Carlo output from parameter estimation. We conclude the analysis by demonstrating how to determine an organization's willingness to pay for additional data.
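
As a rough illustration of why a variable like zip code benefits from hierarchical treatment, the sketch below shrinks each zip code's raw response rate toward the pooled rate and then applies a simple contact rule to simulated posterior draws. It is not the HBVC model from the paper: it substitutes a plain Beta-Binomial shrinkage estimator and direct Monte Carlo draws (no Markov chain). The function names, the `prior_strength` value, the margin and cost figures, and the toy counts are all assumptions made up for the example.

```python
import random

def pooled_rate(counts):
    """Overall response rate across all zip codes.

    counts maps zip code -> (number mailed, number responded).
    """
    total_mailed = sum(n for n, _ in counts.values())
    total_responded = sum(s for _, s in counts.values())
    return total_responded / total_mailed

def shrunken_rates(counts, prior_strength=20.0):
    """Posterior-mean response rate per zip code.

    Uses a Beta prior centred on the pooled rate. prior_strength acts like a
    number of pseudo-observations; it is assumed here, not estimated.
    """
    p0 = pooled_rate(counts)
    return {
        z: (s + prior_strength * p0) / (n + prior_strength)
        for z, (n, s) in counts.items()
    }

def profit_draws(n, s, p0, prior_strength, margin, cost, draws=2000, seed=0):
    """Simulated profit per contact, one value per posterior draw of the rate."""
    rng = random.Random(seed)
    a = s + prior_strength * p0
    b = (n - s) + prior_strength * (1.0 - p0)
    return [rng.betavariate(a, b) * margin - cost for _ in range(draws)]

def should_contact(draws):
    """Contact the zip code only if the average simulated profit is positive."""
    return sum(draws) / len(draws) > 0.0

# Toy data (invented): zip code -> (mailed, responded).
counts = {"06520": (400, 20), "90095": (5, 2), "80202": (50, 1)}
p0 = pooled_rate(counts)
rates = shrunken_rates(counts)

for z, (n, s) in counts.items():
    d = profit_draws(n, s, p0, 20.0, margin=30.0, cost=1.0)
    print(z, round(s / n, 3), round(rates[z], 3), should_contact(d))
```

The zip code with only five mailings moves furthest from its raw rate, which is the behaviour one wants when many categories have very little data. Working with the full set of draws rather than a single point estimate keeps parameter uncertainty in the decision, in the spirit of the abstract's point that conventional practice understates it.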

Suggested Citation

Thomas J. Steenburgh & Andrew Ainslie & Peder Hans Engebretson, 2003. "Massively Categorical Variables: Revealing the Information in Zip Codes," Marketing Science, INFORMS, vol. 22(1), pages 40-57, August.

Handle: RePEc:inm:ormksc:v:22:y:2003:i:1:p:40-57
DOI: 10.1287/mksc.22.1.40.12847

References listed on IDEAS

Peter E. Rossi & Robert E. McCulloch & Greg M. Allenby, 1996. "The Value of Purchase History Data in Target Marketing," Marketing Science, INFORMS, vol. 15(4), pages 321-340.
A. Gelman & Y. Goegebeur & F. Tuerlinckx & I. Van Mechelen, 2000. "Diagnostic checks for discrete data regression models using posterior predictive simulations," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 49(2), pages 247-268.
Arthur Hsu & Ronald T. Wilcox, 2000. "Stochastic Prediction in Multinomial Logit Models," Management Science, INFORMS, vol. 46(8), pages 1137-1144, August.
Jan Roelf Bult & Tom Wansbeek, 1995. "Optimal Selection for Direct Mail," Marketing Science, INFORMS, vol. 14(4), pages 378-394.

Citations

Cited by:

Grinstein, Amir & Wathieu, Luc, 2012. "Happily (mal)adjusted: Cosmopolitan identity and expatriate adjustment," International Journal of Research in Marketing, Elsevier, vol. 29(4), pages 337-345.
Wieringa, Jaap & Kannan, P.K. & Ma, Xiao & Reutterer, Thomas & Risselada, Hans & Skiera, Bernd, 2021. "Data analytics in a privacy-concerned world," Journal of Business Research, Elsevier, vol. 122(C), pages 915-925.
André Bonfrer & Xavier Drèze, 2009. "Real-Time Evaluation of E-mail Campaign Performance," Marketing Science, INFORMS, vol. 28(2), pages 251-263, March-April.
Matthew Nagler, 2006. "An exploratory analysis of the determinants of cooperative advertising participation rates," Marketing Letters, Springer, vol. 17(2), pages 91-102, April.
M. Ballings & D. Van Den Poel & E. Verhagen, 2013. "Evaluating the Added Value of Pictorial Data for Customer Churn Prediction," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 13/869, Ghent University, Faculty of Economics and Business Administration.
Baecke, Philippe & De Baets, Shari & Vanderheyden, Karlien, 2017. "Investigating the added value of integrating human judgement into statistical demand forecasting systems," International Journal of Production Economics, Elsevier, vol. 191(C), pages 85-96.
Steven M. Shugan, 2004. "The Impact of Advancing Technology on Marketing and Academic Research," Marketing Science, INFORMS, vol. 23(4), pages 469-475.
van Dijk, Bram & Paap, Richard, 2008. "Explaining individual response using aggregated data," Journal of Econometrics, Elsevier, vol. 146(1), pages 1-9, September.
- Paap, R. & van Dijk, A., 2006. "Explaining individual response using aggregated data," Econometric Institute Research Papers EI 2006-05, Erasmus University Rotterdam, Erasmus School of Economics (ESE), Econometric Institute.
P. Baecke & D. Van Den Poel, 2012. "Including Spatial Interdependence in Customer Acquisition Models: a Cross-Category Comparison," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/788, Ghent University, Faculty of Economics and Business Administration.
Philippe Baecke & Dirk Van Den Poel, 2010. "Improving Purchasing Behavior Predictions By Data Augmentation With Situational Variables," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 9(06), pages 853-872.
- P. Baecke & D. Van Den Poel, 2010. "Improving purchasing behavior predictions by data augmentation with situational variables," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 10/658, Ghent University, Faculty of Economics and Business Administration.
Matthew J. Schneider & Sharan Jagpal & Sachin Gupta & Shaobo Li & Yan Yu, 2018. "A Flexible Method for Protecting Marketing Data: An Application to Point-of-Sale Data," Marketing Science, INFORMS, vol. 37(1), pages 153-171, January.
Kelvyn Jones & Dewi Owen & Ron Johnston & James Forrest & David Manley, 2015. "Modelling the occupational assimilation of immigrants by ancestry, age group and generational differences in Australia: a random effects approach to a large table of counts," Quality & Quantity: International Journal of Methodology, Springer, vol. 49(6), pages 2595-2615, November.
Schneider, Matthew J. & Jagpal, Sharan & Gupta, Sachin & Li, Shaobo & Yu, Yan, 2017. "Protecting customer privacy when marketing with second-party data," International Journal of Research in Marketing, Elsevier, vol. 34(3), pages 593-603.
Andrew Ainslie & Xavier Drèze & Fred Zufryden, 2005. "Modeling Movie Life Cycles and Market Share," Marketing Science, INFORMS, vol. 24(3), pages 508-517, November.
Ron Borzekowski & Raphael Thomadsen & Charles Taragin, 2009. "Competition and price discrimination in the market for mailing lists," Quantitative Marketing and Economics (QME), Springer, vol. 7(2), pages 147-179, June.
- Ron Borzekowski & Charles Taragin & Raphael Thomadsen, 2005. "Competition and price discrimination in the market for mailing lists," Finance and Economics Discussion Series 2005-56, Board of Governors of the Federal Reserve System (U.S.).
Jeonghye Choi & David R. Bell & Leonard M. Lodish, 2012. "Traditional and IS-Enabled Customer Acquisition on the Internet," Management Science, INFORMS, vol. 58(4), pages 754-769, April.
Sinha, Shameek & Malik, Sumit & Mahajan, Vijay & ter Hofstede, Frenkel, 2025. "Retain, reactivate or acquire: Can nonprofits reliably use community profiles as an alternative to past donation data?," Journal of Business Research, Elsevier, vol. 186(C).
M. Ballings & D. Van Den Poel, 2012. "The Relevant Length of Customer Event History for Churn Prediction: How long is long enough?," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/804, Ghent University, Faculty of Economics and Business Administration.
Steven M. Shugan, 2003. "Editorial: Compartmentalized Reviews and Other Initiatives: Should Marketing Scientists Review Manuscripts in Consumer Behavior?," Marketing Science, INFORMS, vol. 22(2), pages 151-160.
Piyush Anand & Clarence Lee, 2023. "Using Deep Learning to Overcome Privacy and Scalability Issues in Customer Data Transfer," Marketing Science, INFORMS, vol. 42(1), pages 189-207, January.
P. Baecke & D. Van Den Poel, 2012. "Improving Customer Acquisition Models by Incorporating Spatial Autocorrelation at Different Levels of Granularity," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/819, Ghent University, Faculty of Economics and Business Administration.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Tat Chan & Naser Hamdi & Xiang Hui & Zhenling Jiang, 2022. "The Value of Verified Employment Data for Consumer Lending: Evidence from Equifax," Marketing Science, INFORMS, vol. 41(4), pages 795-814, July.
Goic, Marcel & Rojas, Andrea & Saavedra, Ignacio, 2021. "The Effectiveness of Triggered Email Marketing in Addressing Browse Abandonments," Journal of Interactive Marketing, Elsevier, vol. 55(C), pages 118-145.
Roland T. Rust & Tuck Siong Chung, 2006. "Marketing Models of Service and Relationships," Marketing Science, INFORMS, vol. 25(6), pages 560-580, November-December.
Yuxin Chen & Chakravarthi Narasimhan & Z. John Zhang, 2001. "Individual Marketing with Imperfect Targetability," Marketing Science, INFORMS, vol. 20(1), pages 23-41, November.
Bose, Indranil & Chen, Xi, 2009. "Quantitative models for direct marketing: A review from systems perspective," European Journal of Operational Research, Elsevier, vol. 195(1), pages 1-16, May.
Verhoef, Peter C. & Venkatesan, Rajkumar & McAlister, Leigh & Malthouse, Edward C. & Krafft, Manfred & Ganesan, Shankar, 2010. "CRM in Data-Rich Multichannel Retailing Environments: A Review and Future Research Directions," Journal of Interactive Marketing, Elsevier, vol. 24(2), pages 121-137.
Romana Khan & Michael Lewis & Vishal Singh, 2009. "Dynamic Customer Management and the Value of One-to-One Marketing," Marketing Science, INFORMS, vol. 28(6), pages 1063-1079, November-December.
Dimitris Bertsimas & Adam J. Mersereau, 2007. "A Learning Approach for Interactive Marketing to a Customer Segment," Operations Research, INFORMS, vol. 55(6), pages 1120-1135, December.
Dost, Florian & Wilken, Robert & Eisenbeiss, Maik & Skiera, Bernd, 2014. "On the Edge of Buying: A Targeting Approach for Indecisive Buyers Based on Willingness-to-Pay Ranges," Journal of Retailing, Elsevier, vol. 90(3), pages 393-407.
Alan L. Montgomery, 2001. "Applying Quantitative Marketing Techniques to the Internet," Interfaces, INFORMS, vol. 31(2), pages 90-108, April.
A. Prinzie & D. Van Den Poel, 2005. "Constrained optimization of data-mining problems to improve model performance: A direct-marketing application," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 05/298, Ghent University, Faculty of Economics and Business Administration.
Esther Gal-Or & Mordechai Gal-Or, 2005. "Customized Advertising via a Common Media Distributor," Marketing Science, INFORMS, vol. 24(2), pages 241-253, July.
Roland T. Rust & Peter C. Verhoef, 2005. "Optimizing the Marketing Interventions Mix in Intermediate-Term CRM," Marketing Science, INFORMS, vol. 24(3), pages 477-489, December.
David R. Bell & Jeongwen Chiang & V. Padmanabhan, 1999. "The Decomposition of Promotional Response: An Empirical Generalization," Marketing Science, INFORMS, vol. 18(4), pages 504-526.
Bond, Craig A. & Thilmany, Dawn D. & Bond, Jennifer Keeling, 2008. "What to Choose? The Value of Label Claims to Fresh Produce Consumers," Journal of Agricultural and Resource Economics, Western Agricultural Economics Association, vol. 33(3), pages 1-26.
Leenheer, J. & Bijmolt, T.H.A. & van Heerde, H.J. & Smidts, A., 2002. "Do Loyalty Programs Enhance Behavioral Loyalty : An Empirical Analysis Accounting for Program Design and Competitive Effects," Discussion Paper 2002-65, Tilburg University, Center for Economic Research.
Martinovici, A., 2019. "Revealing attention - how eye movements predict brand choice and moment of choice," Other publications TiSEM 7dca38a5-9f78-4aee-bd81-c, Tilburg University, School of Economics and Management.
Nevo, Aviv, 2001. "Measuring Market Power in the Ready-to-Eat Cereal Industry," Econometrica, Econometric Society, vol. 69(2), pages 307-342, March.
- Nevo, Aviv, 1998. "Measuring Market Power in the Ready-To-Eat Cereal Industry," Research Reports 25164, University of Connecticut, Food Marketing Policy Center.
- Nevo, Aviv, 1999. "Measuring Market Power in the Ready-to-Eat Cereal Industry," Competition Policy Center, Working Paper Series qt7cm5p858, Competition Policy Center, Institute for Business and Economic Research, UC Berkeley.
- Aviv Nevo, 2003. "Measuring Market Power in the Ready-to-Eat Cereal Industry," Microeconomics 0303006, University Library of Munich, Germany.
- Aviv Nevo, 1998. "Measuring Market Power in the Ready-to-Eat Cereal Industry," NBER Working Papers 6387, National Bureau of Economic Research, Inc.
- Nevo, Aviv, 1998. "Measuring Market Power in the Ready-To-Eat Cereal Industry," Food Marketing Policy Center Research Reports 037, University of Connecticut, Department of Agricultural and Resource Economics, Charles J. Zwick Center for Food and Resource Policy.
Kopalle, Praveen K. & Pauwels, Koen & Akella, Laxminarayana Yashaswy & Gangwar, Manish, 2023. "Dynamic pricing: Definition, implications for managers, and future research directions," Journal of Retailing, Elsevier, vol. 99(4), pages 580-593.
Dan Horsky & Sanjog Misra & Paul Nelson, 2006. "Observed and Unobserved Preference Heterogeneity in Brand-Choice Models," Marketing Science, INFORMS, vol. 25(4), pages 322-335, July-August.
