
Content vs. Form: What Drives the Writing Score Gap Across Socioeconomic Backgrounds? A Generated Panel Approach

Authors

  • Nadav Kunievsky
  • Pedro Pertusi

Abstract

Students from different socioeconomic backgrounds exhibit persistent gaps in test scores, gaps that can translate into unequal educational and labor-market outcomes later in life. In many assessments, performance reflects not only what students know, but also how effectively they can communicate that knowledge. This distinction is especially salient in writing assessments, where scores jointly reward the substance of students' ideas and the way those ideas are expressed. As a result, observed score gaps may conflate differences in underlying content with differences in expressive skill. A central question, therefore, is how much of the socioeconomic-status (SES) gap in scores is driven by differences in what students say versus how they say it. We study this question using a large corpus of persuasive essays written by U.S. middle- and high-school students. We introduce a new measurement strategy that separates content from style by leveraging large language models to generate multiple stylistic variants of each essay. These rewrites preserve the underlying arguments while systematically altering surface expression, creating a "generated panel" that introduces controlled within-essay variation in style. This approach allows us to decompose SES gaps in writing scores into contributions from content and style. We find an SES gap of 0.67 points on a 1-6 scale. Approximately 69% of the gap is attributable to differences in essay content quality, style differences account for 26%, and differences in evaluation standards across SES groups account for the remaining 5%. These patterns seem stable across demographic subgroups and writing tasks. More broadly, our approach shows how large language models can be used to generate controlled variation in observational data, enabling researchers to isolate and quantify the contributions of otherwise entangled factors.
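
As a rough illustration of the magnitudes, the shares reported in the abstract can be converted into score points. The sketch below is plain arithmetic on the rounded figures quoted above (a 0.67-point gap split 69% / 26% / 5%). It is not the paper's estimation procedure, and the names `TOTAL_GAP`, `SHARES`, and `gap_in_points` are hypothetical, introduced only for this example.

```python
# Figures quoted in the abstract (rounded as reported there).
TOTAL_GAP = 0.67  # SES gap in points on the 1-6 scoring scale
SHARES = {
    "content": 0.69,               # differences in essay content quality
    "style": 0.26,                 # differences in surface expression
    "evaluation_standards": 0.05,  # differences in evaluation standards
}


def gap_in_points(total_gap, shares):
    """Convert each share of the gap into score points (hypothetical helper)."""
    return {name: total_gap * share for name, share in shares.items()}


points = gap_in_points(TOTAL_GAP, SHARES)
for name, value in points.items():
    print(f"{name}: {value:.2f} points")
```

Under these rounded inputs, content corresponds to roughly 0.46 points, style to roughly 0.17 points, and evaluation standards to roughly 0.03 points of the 0.67-point gap.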

Suggested Citation

  • Nadav Kunievsky & Pedro Pertusi, 2026. "Content vs. Form: What Drives the Writing Score Gap Across Socioeconomic Backgrounds? A Generated Panel Approach," Papers 2601.03469, arXiv.org.
  • Handle: RePEc:arx:papers:2601.03469

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2601.03469
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gagnon, Nickolas & Nosenzo, Daniele, 2025. "Discrimination Preferences," EconStor Preprints 323979, ZBW - Leibniz Information Centre for Economics.
    2. Thomas Y. Mathä & Alessandro Porpiglia & Michael Ziegelmeyer, 2014. "Wealth differences across borders and the effect of real estate price dynamics: Evidence from two household surveys," BCL working papers 90, Central Bank of Luxembourg.
    3. Valentine Fays & Benoît Mahy & François Rycx, 2023. "Wage differences according to workers' origin: The role of working more upstream in GVCs," LABOUR, CEIS, vol. 37(2), pages 319-342, June.
    4. Töpfer, Marina, 2017. "Detailed RIF decomposition with selection: The gender pay gap in Italy," Hohenheim Discussion Papers in Business, Economics and Social Sciences 26-2017, University of Hohenheim, Faculty of Business, Economics and Social Sciences.
    5. Katie Meara & Francesco Pastore & Allan Webster, 2020. "The gender pay gap in the USA: a matching study," Journal of Population Economics, Springer;European Society for Population Economics, vol. 33(1), pages 271-305, January.
    6. Sergio Longobardi & Margherita Maria Pagliuca & Andrea Regoli, 2018. "Can problem-solving attitudes explain the gender gap in financial literacy? Evidence from Italian students’ data," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(4), pages 1677-1705, July.
    7. Kilic, Talip & Palacios-López, Amparo & Goldstein, Markus, 2015. "Caught in a Productivity Trap: A Distributional Perspective on Gender Differences in Malawian Agriculture," World Development, Elsevier, vol. 70(C), pages 416-463.
    8. Grégory Verdugo & Henry Fraisse & Guillaume Horny, 2012. "Changes In Wage Inequality In France: The Impact Of Composition Effects (in French)," Working papers 370, Banque de France.
    9. James Cloyne & Òscar Jordà & Alan M. Taylor, 2020. "Decomposing the Fiscal Multiplier," Working Paper Series 2020-12, Federal Reserve Bank of San Francisco.
    10. John Ariza & Gabriel Montes-Rojas, 2019. "Decomposition methods for analyzing inequality changes in Latin America 2002–2014," Empirical Economics, Springer, vol. 57(6), pages 2043-2078, December.
    11. Azam Mehtabul & Han Luyi, 2020. "Accounting for Differences in Female Labor Force Participation between China and India," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 20(2), pages 1-17, April.
    12. Jean-Marc Fournier & Isabell Koske, 2012. "The determinants of earnings inequality: evidence from quantile regressions," OECD Journal: Economic Studies, OECD Publishing, vol. 2012(1), pages 7-36.
    13. Wazah Pello-Esso & Ulf Gerdtham & Sara Larsson Lönn & Jan Sundquist & Kristina Sundquist, 2025. "Immigrant-Native Wage Gap in Sweden: Do Personality Traits Matter?," Journal of International Migration and Integration, Springer, vol. 26(1), pages 467-489, March.
    14. Michaela Fuchs & Anja Rossen & Antje Weyh & Gabriele Wydra‐Somaggio, 2021. "Where do women earn more than men? Explaining regional differences in the gender pay gap," Journal of Regional Science, Wiley Blackwell, vol. 61(5), pages 1065-1086, November.
    15. Tymon Słoczyński, 2022. "Interpreting OLS Estimands When Treatment Effects Are Heterogeneous: Smaller Groups Get Larger Weights," The Review of Economics and Statistics, MIT Press, vol. 104(3), pages 501-509, May.
    16. Ramos, Raul & Sanromá, Esteban & Simón, Hipólito, 2022. "Collective bargaining levels, employment and wage inequality in Spain," Journal of Policy Modeling, Elsevier, vol. 44(2), pages 375-395.
    17. Mariusz Kaszubowski & Joanna Wolszczak-Derlacz, 2014. "Salary and reservation wage gender gaps in Polish academia," GUT FME Working Paper Series A 19, Faculty of Management and Economics, Gdansk University of Technology.
    18. Gabriel Montes-Rojas & Lucas Siga & Ram Mainali, 2017. "Mean and quantile regression Oaxaca-Blinder decompositions with an application to caste discrimination," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 15(3), pages 245-255, September.
    19. Luis Ayala & Javier Martín‐Román & Juan Vicente, 2024. "What contributes to rising inequality in large cities?," Journal of Regional Science, Wiley Blackwell, vol. 64(5), pages 1760-1810, November.
    20. Karmann, Alexander & Sugawara, Shinya, 2022. "Comparing the German and Japanese nursing home sectors: Implications of demographic and policy differences," CEPIE Working Papers 02/22, Technische Universität Dresden, Center of Public and International Economics (CEPIE).
