Recovering Overlooked Information in Categorical Variables with LLMs: An Application to Labor Market Mismatch


Author

Listed:

Yi Chen (ShanghaiTech University)
Hanming Fang (University of Pennsylvania)
Yi Zhao (Tsinghua University)
Zibo Zhao (ShanghaiTech University)

Abstract

Categorical variables have no intrinsic ordering, and researchers often adopt a fixed-effect (FE) approach in empirical analysis. However, this approach has two significant limitations: it overlooks textual information associated with the categorical variables, and it produces unstable results when a category contains only a few observations. In this paper, we propose a novel method that uses recent advances in large language models (LLMs) to recover overlooked information in categorical variables. We apply this method to investigate labor market mismatch. Specifically, we task LLMs with simulating the role of a human resources specialist to assess the suitability of an applicant with specific characteristics for a given job. Our main findings can be summarized in three parts. First, using comprehensive administrative data from an online job posting platform, we show that our new match quality measure is positively correlated with several traditional measures in the literature, and we highlight the LLM’s capability to provide additional information beyond that contained in the traditional measures. Second, we demonstrate the broad applicability of the new method using survey data that contain significantly less information than the administrative data, which makes it impossible to compute most of the traditional match quality measures. Our LLM measure successfully replicates most of the salient patterns observed in a hard-to-access administrative dataset using easily accessible survey data. Third, we investigate the gender gap in match quality and explore whether gender stereotypes exist in the hiring process. We simulate an audit study, examining whether revealing gender information to LLMs influences their assessment. We show that when gender information is disclosed to the LLMs, they deem females better suited for traditionally female-dominated roles.
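
The abstract describes prompting an LLM to act as a human resources specialist and, in the simulated audit study, comparing assessments with and without gender information. The following is a minimal sketch of how such a pair of prompts could be assembled. It is not the authors' implementation: the `Applicant` fields, the `build_prompt` function, the prompt wording, and the 1 to 10 rating scale are all hypothetical, and no LLM is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Applicant:
    # Hypothetical applicant characteristics; the paper's actual fields may differ.
    education: str
    major: str
    gender: Optional[str] = None


def build_prompt(applicant: Applicant, job_title: str, reveal_gender: bool = False) -> str:
    """Assemble an HR-specialist prompt; gender is included only when requested."""
    lines = [
        "You are a human resources specialist.",
        f"Job: {job_title}",
        f"Applicant education: {applicant.education}",
        f"Applicant major: {applicant.major}",
    ]
    # Audit-style variation: the two prompts differ only in this one line.
    if reveal_gender and applicant.gender is not None:
        lines.append(f"Applicant gender: {applicant.gender}")
    # Hypothetical rating scale, chosen for illustration only.
    lines.append(
        "Rate the applicant's suitability for this job from 1 (poor) to 10 "
        "(excellent). Reply with the number only."
    )
    return "\n".join(lines)


applicant = Applicant(education="Bachelor", major="Nursing", gender="female")
blind_prompt = build_prompt(applicant, "Registered nurse")
audit_prompt = build_prompt(applicant, "Registered nurse", reveal_gender=True)
print(audit_prompt)
```

Because the two prompts are identical apart from the gender line, any systematic difference in the scores a model returns for them could be attributed to the disclosed gender, which is the logic of an audit design.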

Suggested Citation

Yi Chen & Hanming Fang & Yi Zhao & Zibo Zhao, 2024. "Recovering Overlooked Information in Categorical Variables with LLMs: An Application to Labor Market Mismatch," PIER Working Paper Archive 24-017, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.

Handle: RePEc:pen:papers:24-017

References listed on IDEAS

Tyna Eloundou & Sam Manning & Pamela Mishkin & Daniel Rock, 2023. "GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models," Papers 2303.10130, arXiv.org, revised Aug 2023.

Citations

Cited by:

Hanming Fang & Ming Li & Guangli Lu, 2025. "Decoding China’s Industrial Policies," PIER Working Paper Archive 25-012, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
Herbert Dawid & Philipp Harting & Hankui Wang & Zhongli Wang & Jiachen Yi, 2025. "Agentic Workflows for Economic Research: Design and Implementation," Papers 2504.09736, arXiv.org.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Christoph Riedl & Eric Bogert, 2024. "Effects of AI Feedback on Learning, the Skill Gap, and Intellectual Diversity," Papers 2409.18660, arXiv.org.
Carvajal, Daniel & Franco, Catalina & Isaksson, Siri, 2024. "Will Artificial Intelligence Get in the Way of Achieving Gender Equality?," Discussion Paper Series in Economics 3/2024, Norwegian School of Economics, Department of Economics, revised 28 Apr 2025.
Naomi Hausman & Oren Rigbi & Sarit Weisburd, 2025. "Generative AI’s Impact on Student Achievement and Implications for Worker Productivity," CESifo Working Paper Series 11843, CESifo.
Evangelos Katsamakas & Oleg V. Pavlov & Ryan Saklad, 2024. "Artificial intelligence and the transformation of higher education institutions," Papers 2402.08143, arXiv.org.
Yang Shen, 2024. "Future jobs: analyzing the impact of artificial intelligence on employment and its mechanisms," Economic Change and Restructuring, Springer, vol. 57(2), pages 1-33, April.
Jai Vipra & Anton Korinek, 2023. "Market Concentration Implications of Foundation Models," Papers 2311.01550, arXiv.org.
Kristina McElheran & J. Frank Li & Erik Brynjolfsson & Zachary Kroff & Emin Dinlersoz & Lucia Foster & Nikolas Zolas, 2024. "AI adoption in America: Who, what, and where," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 33(2), pages 375-415, March.
- Kristina McElheran & J. Frank Li & Erik Brynjolfsson & Zachary Kroff & Emin Dinlersoz & Lucia Foster & Nikolas Zolas, 2023. "AI Adoption in America: Who, What, and Where," Working Papers 23-48, Center for Economic Studies, U.S. Census Bureau.
- Kristina McElheran & J. Frank Li & Erik Brynjolfsson & Zachary Kroff & Emin Dinlersoz & Lucia S. Foster & Nikolas Zolas, 2023. "AI Adoption in America: Who, What, and Where," NBER Working Papers 31788, National Bureau of Economic Research, Inc.
Leonardo Banh & Gero Strobel, 2023. "Generative artificial intelligence," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-17, December.
Draca, Mirko & Nathan, Max & Nguyen-Tien, Viet & Oliveira-Cunha, Juliana & Rosso, Anna & Valero, Anna, 2024. "The New Wave? The Role of Human Capital and STEM Skills in Technology Adoption in the UK," The Warwick Economics Research Paper Series (TWERPS) 1521, University of Warwick, Department of Economics.
- Mirko Draca & Max Nathan & Viet Nguyen-Tien & Juliana Oliveira-Cunha & Anna Rosso & Anna Valero, 2024. "The New Wave? The Role of Human Capital and STEM Skills in Technology Adoption in the UK," Development Working Papers 495, Centro Studi Luca d'Agliano, University of Milano.
- Draca, Mirko & Nathan, Max & Nguyen-Tien, Viet & Oliveira-Cunha, Juliana & Rosso, Anna & Valero, Anna, 2024. "The New Wave? The Role of Human Capital and STEM Skills in Technology Adoption in the UK," CAGE Online Working Paper Series 726, Competitive Advantage in the Global Economy (CAGE).
- Mirko Draca & Max Nathan & Viet Nguyen-Tien & Juliana Oliveira-Cunha & Anna Rosso & Anna Valero, 2024. "The new wave? The role of human capital and STEM skills in technology adoption in the UK," CEP Discussion Papers dp2040, Centre for Economic Performance, LSE.
- Draca, Mirko & Nathan, Max & Nguyen-Tien, Viet & Oliveira Cunha, Juliana & Rosso, Anna & Sivropoulos-Valero, Anna, 2024. "The new wave? The role of human capital and STEM skills in technology adoption in the UK," LSE Research Online Documents on Economics 126769, London School of Economics and Political Science, LSE Library.
- Draca, Mirko & Nathan, Max & Nguyen-Tien, Viet & Oliveira Cunha, Juliana & Rosso, Anna & Valero, Anna, 2024. "The new wave? The role of human capital and STEM skills in technology adoption in the UK," LSE Research Online Documents on Economics 127313, London School of Economics and Political Science, LSE Library.
- Draca, Mirko & Nathan, Max & Nguyen-Tien, Viet & Oliveira-Cunha, Juliana & Rosso, Anna & Valero, Anna, 2024. "The New Wave? The Role of Human Capital and STEM Skills in Technology Adoption in the UK," IZA Discussion Papers 17329, Institute of Labor Economics (IZA).
Berlinski, Elise & Morales, Jérémy & Sponem, Samuel, 2024. "Artificial imaginaries: Generative AIs as an advanced form of capitalism," Critical Perspectives on Accounting, Elsevier, vol. 99(C).
Lan Chen & Yufei Ji & Xichen Yao & Hengshu Zhu, 2024. "Occupation Life Cycle," Papers 2406.15373, arXiv.org.
Acar, Oguz A., 2024. "Commentary: Reimagining marketing education in the age of generative AI," International Journal of Research in Marketing, Elsevier, vol. 41(3), pages 489-495.
Frank M. Fossen & Trevor McLemore & Alina Sorgner, 2024. "Artificial Intelligence and Entrepreneurship," Foundations and Trends® in Entrepreneurship, now publishers, vol. 20(8), pages 781-904, December.
- Fossen, Frank M. & McLemore, Trevor & Sorgner, Alina, 2024. "Artificial Intelligence and Entrepreneurship," IZA Discussion Papers 17055, Institute of Labor Economics (IZA).
Caleb Peppiatt, 2024. "The Future of Work: Inequality, Artificial Intelligence, and What Can Be Done About It. A Literature Review," Papers 2408.13300, arXiv.org.
D'Alessandro, Francesco & Santarelli, Enrico & Vivarelli, Marco, 2024. "The KSTE+I approach and the advent of AI technologies: evidence from the European regions," GLO Discussion Paper Series 1473, Global Labor Organization (GLO).
Amali Matharaarachchi & Wishmitha Mendis & Kanishka Randunu & Daswin De Silva & Gihan Gamage & Harsha Moraliyage & Nishan Mills & Andrew Jennings, 2024. "Optimizing Generative AI Chatbots for Net-Zero Emissions Energy Internet-of-Things Infrastructure," Energies, MDPI, vol. 17(8), pages 1-19, April.
Anna Davies & Betsy Donald & Mia Gray, 2023. "The power of platforms—precarity and place," Cambridge Journal of Regions, Economy and Society, Cambridge Political Economy Society, vol. 16(2), pages 245-256.
Carlo Drago & Alberto Costantiello & Marco Savorgnan & Angelo Leogrande, 2025. "Driving AI Adoption in the EU: A Quantitative Analysis of Macroeconomic Influences," Working Papers hal-05102974, HAL.
- Drago, Carlo & Costantiello, Alberto & Savorgnan, Marco & Leogrande, Angelo, 2025. "Driving AI Adoption in the EU: A Quantitative Analysis of Macroeconomic Influences," MPRA Paper 124973, University Library of Munich, Germany.
D'Alessandro, Francesco & Santarelli, Enrico & Vivarelli, Marco, 2024. "The KSTE+I approach and the AI technologies," MERIT Working Papers 2024-016, United Nations University - Maastricht Economic and Social Research Institute on Innovation and Technology (MERIT).
- Francesco D'Alessandro & Enrico Santarelli & Marco Vivarelli, 2024. "The KSTE+I approach and the AI technologies," DISCE - Working Papers del Dipartimento di Politica Economica dipe0039, Università Cattolica del Sacro Cuore, Dipartimenti e Istituti di Scienze Economiche (DISCE).
Ylenia Curci & Nathalie Greenan & Silvia Napolitano, 2024. "Innovating for the good or for the bad. An EU-wide analysis of the impact of technological transformation on job polarisation and unemployment," TEPP Working Paper 2024-02, TEPP.

More about this item

JEL classification:

C55 - Mathematical and Quantitative Methods / Econometric Modeling / Large Data Sets: Modeling and Analysis
J16 - Labor and Demographic Economics / Demographic Economics / Economics of Gender; Non-labor Discrimination
J24 - Labor and Demographic Economics / Demand and Supply of Labor / Human Capital; Skills; Occupational Choice; Labor Productivity
J31 - Labor and Demographic Economics / Wages, Compensation, and Labor Costs / Wage Level and Structure; Wage Differentials

NEP fields

This paper has been announced in the following NEP Reports:

NEP-AIN-2024-08-19 (Artificial Intelligence)
NEP-BIG-2024-08-19 (Big Data)
NEP-LMA-2024-08-19 (Labor Markets - Supply, Demand, and Wages)
