Printed from https://ideas.repec.org/p/arx/papers/2505.11599.html

Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables?

Author

Listed:
  • Verónica Backer-Peral
  • Vitaly Meursault
  • Christopher Severen

Abstract

Multimodal LLMs offer a watershed change for the digitization of historical tables, enabling low-cost processing centered on domain expertise rather than technical skills. We rigorously validate an LLM-based pipeline on a new panel of historical county-level vehicle registrations. This pipeline is 100 times less expensive than outsourcing, reduces critical parsing errors from 40% to 0.3%, and matches human-validated gold standard data with an $R^2$ of 98.6%. Analyses of growth and persistence in vehicle adoption are statistically indistinguishable whether using LLM or gold standard data. LLM-based digitization unlocks complex historical tables, enabling new economic analyses and broader researcher participation.
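The $R^2$ comparison against gold standard data mentioned in the abstract can be illustrated with a minimal sketch. It assumes $R^2$ is the squared Pearson correlation between LLM-digitized values and human-validated values for the same cells, which is what a simple linear regression of one on the other would report; the abstract does not state the exact definition. The function name `r_squared` and the sample numbers are hypothetical and are not taken from the paper.

```python
def r_squared(gold, llm):
    """Squared Pearson correlation between two equal-length numeric sequences.

    Assumption: this stands in for the R^2 of a simple linear regression of
    LLM-digitized values on gold standard values. It is an illustration only,
    not the authors' validation code.
    """
    if len(gold) != len(llm) or len(gold) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_l = sum(llm) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((g - mean_g) * (l - mean_l) for g, l in zip(gold, llm))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_l = sum((l - mean_l) ** 2 for l in llm)
    if var_g == 0 or var_l == 0:
        raise ValueError("R^2 is undefined when a sequence is constant")
    return (cov * cov) / (var_g * var_l)


if __name__ == "__main__":
    # Made-up registration counts for five hypothetical county-year cells.
    gold_counts = [120, 450, 980, 1500, 2300]
    llm_counts = [120, 455, 980, 1490, 2300]
    print(f"R^2 = {r_squared(gold_counts, llm_counts):.4f}")
```

A value near 1 indicates that the digitized series tracks the gold standard closely, but it does not by itself reveal rare large errors, which is presumably why the abstract reports the critical parsing error rate separately.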

Suggested Citation

  • Verónica Backer-Peral & Vitaly Meursault & Christopher Severen, 2025. "Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables?," Papers 2505.11599, arXiv.org.
  • Handle: RePEc:arx:papers:2505.11599

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2505.11599
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Richard Hornbeck, 2012. "The Enduring Impact of the American Dust Bowl: Short- and Long-Run Adjustments to Environmental Catastrophe," American Economic Review, American Economic Association, vol. 102(4), pages 1477-1507, June.
    2. Olmstead, Alan L. & Rhode, Paul W., 2001. "Reshaping The Landscape: The Impact And Diffusion Of The Tractor In American Agriculture, 1910–1960," The Journal of Economic History, Cambridge University Press, vol. 61(3), pages 663-698, September.
    3. Rodolfo E. Manuelli & Ananth Seshadri, 2014. "Frictionless Technology Diffusion: The Case of Tractors," American Economic Review, American Economic Association, vol. 104(4), pages 1368-1391, April.
    4. Antonio M. Bento & Maureen L. Cropper & Ahmed Mushfiq Mobarak & Katja Vinha, 2005. "The Effects of Urban Spatial Structure on Travel Demand in the United States," The Review of Economics and Statistics, MIT Press, vol. 87(3), pages 466-478, August.
    5. Christopher Severen & Arthur A. van Benthem, 2022. "Formative Experiences and the Price of Gasoline," American Economic Journal: Applied Economics, American Economic Association, vol. 14(2), pages 256-284, April.
    6. Gilles Duranton & Diego Puga, 2020. "The Economics of Urban Density," Journal of Economic Perspectives, American Economic Association, vol. 34(3), pages 3-26, Summer.
    7. Emily Silcock & Abhishek Arora & Luca D'Amico-Wong & Melissa Dell, 2024. "Newswire: A Large-Scale Structured Database of a Century of Historical News," Papers 2406.09490, arXiv.org.
    8. Hoyt Bleakley & Jeffrey Lin, 2012. "Portage and Path Dependence," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 127(2), pages 587-644.
    9. Laura Battaglia & Timothy M. Christensen & Stephen Hansen & Szymon Sacher, 2024. "Inference for regression with variables generated from unstructured data," CeMMAP working papers 10/24, Institute for Fiscal Studies.
    10. Millimet, Daniel L. & Bellemare, Marc, 2023. "Fixed Effects and Causal Inference," IZA Discussion Papers 16202, Institute of Labor Economics (IZA).
    11. Jiwon Choi & Ilyana Kuziemko & Ebonya Washington & Gavin Wright, 2024. "Local Economic and Political Effects of Trade Deals: Evidence from NAFTA," American Economic Review, American Economic Association, vol. 114(6), pages 1540-1575, June.
    12. Leah Brooks & Byron Lutz, 2019. "Vestiges of Transit: Urban Persistence at a Microscale," The Review of Economics and Statistics, MIT Press, vol. 101(3), pages 385-399, July.
    13. Duranton, Gilles & Turner, Matthew A., 2018. "Urban form and driving: Evidence from US cities," Journal of Urban Economics, Elsevier, vol. 108(C), pages 170-191.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahlfeldt, Gabriel M. & Pietrostefani, Elisabetta, 2019. "The economic effects of density: A synthesis," Journal of Urban Economics, Elsevier, vol. 111(C), pages 93-107.
    2. Lewis, Joshua & Severnini, Edson, 2020. "Short- and long-run impacts of rural electrification: Evidence from the historical rollout of the U.S. power grid," Journal of Development Economics, Elsevier, vol. 143(C).
    3. Duranton, Gilles & Puga, Diego, 2015. "Urban Land Use," Handbook of Regional and Urban Economics, in: Gilles Duranton & J. V. Henderson & William C. Strange (ed.), Handbook of Regional and Urban Economics, edition 1, volume 5, chapter 0, pages 467-560, Elsevier.
    4. Chen, Shuo & Lan, Xiaohuan, 2020. "Tractor vs. animal: Rural reforms and technology adoption in China," Journal of Development Economics, Elsevier, vol. 147(C).
    5. Venables, Anthony & Duranton, Gilles, 2018. "Place-Based Policies for Development," CEPR Discussion Papers 12889, C.E.P.R. Discussion Papers.
    6. Felipe Carozzi & Sandro Provenzano & Sefi Roth, 2024. "Urban density and COVID-19: understanding the US experience," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 72(1), pages 163-194, January.
    7. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    8. Kyle C. Meng, 2016. "Estimating Path Dependence in Energy Transitions," NBER Working Papers 22536, National Bureau of Economic Research, Inc.
    9. Morris A Davis & Andra C Ghent & Jesse Gregory, 2024. "The Work-From-Home Technology Boon and its Consequences," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(6), pages 3362-3401.
    10. Huang, Robert & Kahn, Matthew E., 2024. "An economic analysis of United States public transit carbon emissions dynamics," Regional Science and Urban Economics, Elsevier, vol. 107(C).
    11. Brad R. Humphreys & Geoffrey Propheter, "undated". "NFL Franchise Departures and Nearby Home Prices," Working Papers 24-05, Department of Economics, West Virginia University.
    12. Agrawal, David R. & Trandel, Gregory A., 2019. "Dynamics of policy adoption with state dependence," Regional Science and Urban Economics, Elsevier, vol. 79(C).
    13. Danny McGowan & Chrysovalantis Vasilakis, 2015. "Reap What You Sow: Agricultural Productivity, Structural Change and Urbanization," LIDAM Discussion Papers IRES 2015019, Université catholique de Louvain, Institut de Recherches Economiques et Sociales (IRES).
    14. Donn L Feir & Rob Gillezeau & Maggie E C Jones, 2024. "The Slaughter of the Bison and Reversal of Fortunes on the Great Plains," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(3), pages 1634-1670.
    15. James Fenske & Namrata Kala, 2012. "Climate, ecosystem resilience and the slave trade," CSAE Working Paper Series 2012-23, Centre for the Study of African Economies, University of Oxford.
    16. Severnini, Edson, 2014. "The Power of Hydroelectric Dams: Agglomeration Spillovers," IZA Discussion Papers 8082, Institute of Labor Economics (IZA).
    17. Accetturo, Antonio & Cascarano, Michele & de Blasio, Guido, 2024. "Pirate attacks and the shape of the Italian urban system," Regional Science and Urban Economics, Elsevier, vol. 108(C).
    18. Ostermeijer, Francis & Koster, Hans RA. & van Ommeren, Jos, 2019. "Residential parking costs and car ownership: Implications for parking policy and automated vehicles," Regional Science and Urban Economics, Elsevier, vol. 77(C), pages 276-288.
    19. Neeraj G Baruah & J Vernon Henderson & Cong Peng, 2021. "Colonial legacies: Shaping African cities," Journal of Economic Geography, Oxford University Press, vol. 21(1), pages 29-65.
    20. Blaudin de Thé, Camille & Carantino, Benjamin & Lafourcade, Miren, 2021. "The carbon ‘carprint’ of urbanization: New evidence from French cities," Regional Science and Urban Economics, Elsevier, vol. 89(C).

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.