Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech

Authors

  • Matthew Gentzkow
  • Jesse M. Shapiro
  • Matt Taddy

Abstract

We study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson’s party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century.
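
The finite-sample bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator. It assumes one possible formalization of the abstract's verbal definition: partisanship is the expected posterior probability that an observer with a 50/50 prior assigns to the speaker's true party after hearing one phrase. The function names (`plugin_partisanship`, `simulate`) and all parameter values are hypothetical. Both simulated parties draw phrases from the same distribution, so the true value is 0.5, yet the naive plug-in estimate drifts upward as the number of phrases grows relative to the amount of speech.

```python
import numpy as np


def plugin_partisanship(counts_r, counts_d):
    """Naive plug-in estimate from phrase counts of two parties.

    Assumed formalization (not taken from the paper's text): the expected
    posterior a neutral observer places on the true party after one phrase.
    """
    q_r = counts_r / counts_r.sum()  # empirical phrase frequencies, party R
    q_d = counts_d / counts_d.sum()  # empirical phrase frequencies, party D
    total = q_r + q_d
    # rho[j] = P(party R | phrase j) under a 50/50 prior. Phrases that were
    # never spoken get 0.5; they carry zero weight in the sums below.
    rho = np.divide(q_r, total, out=np.full_like(q_r, 0.5), where=total > 0)
    return 0.5 * (q_r @ rho) + 0.5 * (q_d @ (1.0 - rho))


def simulate(n_phrases, n_draws, n_reps=200, seed=0):
    """Average plug-in estimate when both parties speak identically."""
    rng = np.random.default_rng(seed)
    q = np.full(n_phrases, 1.0 / n_phrases)  # same distribution for both
    estimates = [
        plugin_partisanship(
            rng.multinomial(n_draws, q).astype(float),
            rng.multinomial(n_draws, q).astype(float),
        )
        for _ in range(n_reps)
    ]
    return float(np.mean(estimates))


if __name__ == "__main__":
    # True partisanship is 0.5 in both cases by construction.
    print("10 phrases:  ", simulate(n_phrases=10, n_draws=2000))
    print("5000 phrases:", simulate(n_phrases=5000, n_draws=2000))
```

With few phrases the average estimate stays close to 0.5. With many more phrases than any one speaker's sample can cover, it rises well above 0.5 even though the parties are identical by construction. That upward drift is the kind of bias a high-dimensional choice set can induce.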

Suggested Citation

  • Matthew Gentzkow & Jesse M. Shapiro & Matt Taddy, 2016. "Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech," NBER Working Papers 22423, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:22423
    Note: POL

    Download full text from publisher

    File URL: http://www.nber.org/papers/w22423.pdf
    Download Restriction: no
    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
    2. Xavier D'Haultfœuille & Roland Rathelot, 2017. "Measuring segregation on small units: A partial identification analysis," Quantitative Economics, Econometric Society, vol. 8(1), pages 39-73, March.
    3. Gordon Anderson & Oliver Linton & Jasmin Thomas, 2017. "Similarity, dissimilarity and exceptionality: generalizing Gini’s transvariation to measure “differentness” in many distributions," METRON, Springer;Sapienza Università di Roma, vol. 75(2), pages 161-180, August.
    4. Caroline Le Pennec, 2020. "Strategic Campaign Communication: Evidence from 30,000 Candidate Manifestos," SoDa Laboratories Working Paper Series 2020-05, Monash University, SoDa Laboratories.
    5. Gordon Anderson, 2018. "Measuring Aspects of Mobility, Polarization and Convergence in the Absence of Cardinality: Indices Based Upon Transitional Typology," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 139(3), pages 887-907, October.
    6. Glitz, Albrecht, 2014. "Ethnic segregation in Germany," Labour Economics, Elsevier, vol. 29(C), pages 28-40.
    7. Michalopoulos, Stelios & ,, 2019. "Folklore," CEPR Discussion Papers 13425, C.E.P.R. Discussion Papers.
    8. Shane Greenstein & Grace Gu & Feng Zhu, 2021. "Ideology and Composition Among an Online Crowd: Evidence from Wikipedians," Management Science, INFORMS, vol. 67(5), pages 3067-3086, May.
    9. Draca, Mirko & Schwarz, Carlo, 2019. "How Polarized are Citizens? Measuring Ideology from the Ground-Up," CAGE Online Working Paper Series 432, Competitive Advantage in the Global Economy (CAGE).
    10. Kutscher, Macarena & Nath, Shanjukta & Urzúa, Sergio, 2023. "Centralized admission systems and school segregation: Evidence from a national reform," Journal of Public Economics, Elsevier, vol. 221(C).
    11. Gavin Abercrombie & Riza Batista-Navarro, 2020. "Sentiment and position-taking analysis of parliamentary debates: a systematic literature review," Journal of Computational Social Science, Springer, vol. 3(1), pages 245-270, April.
    12. Kosnik, Lea-Rachel, 2015. "What have economists been doing for the last 50 years? A text analysis of published academic research from 1960-2010," Economics - The Open-Access, Open-Assessment E-Journal (2007-2020), Kiel Institute for the World Economy (IfW Kiel), vol. 9, pages 1-38.
    13. Kaiser, Ulrich & Kuhn, Johan M., 2020. "The value of publicly available, textual and non-textual information for startup performance prediction," Journal of Business Venturing Insights, Elsevier, vol. 14(C).
    14. Bruce Sacerdote & Ranjan Sehgal & Molly Cook, 2020. "Why Is All COVID-19 News Bad News?," NBER Working Papers 28110, National Bureau of Economic Research, Inc.
    15. Renan Xavier Cortes & Sergio Rey & Elijah Knaap & Levi John Wolf, 2020. "An open-source framework for non-spatial and spatial segregation measures: the PySAL segregation module," Journal of Computational Social Science, Springer, vol. 3(1), pages 135-166, April.
    16. Shane Greenstein & Yuan Gu & Feng Zhu, 2016. "Ideological Segregation among Online Collaborators: Evidence from Wikipedians," Harvard Business School Working Papers 17-028, Harvard Business School, revised Mar 2017.
    17. Lévêque, Christophe & Saleh, Mohamed, 2018. "Does industrialization affect segregation? Evidence from nineteenth-century Cairo," Explorations in Economic History, Elsevier, vol. 67(C), pages 40-61.
    18. Roberto Casarin & Flaminio Squazzoni, 2012. "Financial press and stock markets in times of crisis," Working Papers 2012_04, Department of Economics, University of Venice "Ca' Foscari".
    19. Francesco Andreoli & Claudio Zoli, 2015. "Measuring the interaction dimension of segregation: the Gini-Exposure index," Working Papers 30/2015, University of Verona, Department of Economics.
    20. Coral Río & Olga Alonso-Villar, 2022. "On Measuring Segregation in a Multigroup Context: Standardized Versus Unstandardized Indices," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 163(2), pages 633-659, September.

    More about this item

    JEL classification:

    • D72 - Microeconomics - Analysis of Collective Decision-Making - Political Processes: Rent-seeking, Lobbying, Elections, Legislatures, and Voting Behavior
