Printed from https://ideas.repec.org/p/fip/fedpwp/101850.html

Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables

Abstract

Multimodal LLMs offer a watershed change for the digitization of historical tables, enabling low-cost processing centered on domain expertise rather than technical skills. We rigorously validate an LLM-based pipeline on a new panel of historical county-level vehicle registrations. This pipeline is estimated to be 100 times less expensive than outsourcing options, reduces critical parsing errors from 40% to 0.3%, and matches human-validated gold standard data with an R² of 98.6%. Analyses of growth and persistence in vehicle adoption are statistically indistinguishable whether using LLM or gold standard data. LLM-based digitization unlocks complex historical tables, enabling new economic analyses and broader researcher participation.

Suggested Citation

  • Verónica Bäcker-Peral & Vitaly Meursault & Christopher Severen, 2025. "Can LLMs Credibly Transform the Creation of Panel Data from Diverse Historical Tables," Working Papers 25-28, Federal Reserve Bank of Philadelphia.
  • Handle: RePEc:fip:fedpwp:101850
    DOI: 10.21799/frbp.wp.2025.28

    Download full text from publisher

    File URL: https://www.philadelphiafed.org/-/media/FRBP/Assets/working-papers/2025/wp25-28.pdf
    Download Restriction: no

    Citations

    Cited by:

    1. Niclas Griesshaber & Jochen Streb, 2025. "Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918)," Papers 2512.19675, arXiv.org.
    2. Griesshaber, Niclas & Streb, Jochen, 2026. "Multimodal LLMs for historical dataset construction from archival image scans: German patents (1877-1918)," SAFE Working Paper Series 466, Leibniz Institute for Financial Research SAFE.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ostermeijer, Francis & Koster, Hans RA. & van Ommeren, Jos, 2019. "Residential parking costs and car ownership: Implications for parking policy and automated vehicles," Regional Science and Urban Economics, Elsevier, vol. 77(C), pages 276-288.
    2. Yu Sang Chang & Sung Jun Jo & Yoo-Taek Lee & Yoonji Lee, 2021. "Population Density or Populations Size. Which Factor Determines Urban Traffic Congestion?," Sustainability, MDPI, vol. 13(8), pages 1-21, April.
    3. Blaudin de Thé, Camille & Carantino, Benjamin & Lafourcade, Miren, 2021. "The carbon ‘carprint’ of urbanization: New evidence from French cities," Regional Science and Urban Economics, Elsevier, vol. 89(C).
    4. Lewis, Joshua & Severnini, Edson, 2020. "Short- and long-run impacts of rural electrification: Evidence from the historical rollout of the U.S. power grid," Journal of Development Economics, Elsevier, vol. 143(C).
    5. Ahlfeldt, Gabriel M. & Pietrostefani, Elisabetta, 2019. "The economic effects of density: A synthesis," Journal of Urban Economics, Elsevier, vol. 111(C), pages 93-107.
    6. Duranton, Gilles & Puga, Diego, 2015. "Urban Land Use," Handbook of Regional and Urban Economics, in: Gilles Duranton & J. V. Henderson & William C. Strange (ed.), Handbook of Regional and Urban Economics, edition 1, volume 5, chapter 0, pages 467-560, Elsevier.
    7. Chen, Shuo & Lan, Xiaohuan, 2020. "Tractor vs. animal: Rural reforms and technology adoption in China," Journal of Development Economics, Elsevier, vol. 147(C).
    8. Felipe Carozzi & Sandro Provenzano & Sefi Roth, 2024. "Urban density and COVID-19: understanding the US experience," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 72(1), pages 163-194, January.
    9. Steven Spears & Marlon G Boarnet & Douglas Houston, 2017. "Driving reduction after the introduction of light rail transit: Evidence from an experimental-control group evaluation of the Los Angeles Expo Line," Urban Studies, Urban Studies Journal Limited, vol. 54(12), pages 2780-2799, September.
    10. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    11. Kyle C. Meng, 2016. "Estimating Path Dependence in Energy Transitions," NBER Working Papers 22536, National Bureau of Economic Research, Inc.
    12. Miotti, Marco & Needell, Zachary A. & Jain, Rishee K., 2023. "The impact of urban form on daily mobility demand and energy use: Evidence from the United States," Applied Energy, Elsevier, vol. 339(C).
    13. Qing Su, 2017. "Travel Demand Management Policy Instruments, Urban Spatial Characteristics, and Household Greenhouse Gas Emissions from Travel in the US Urban Areas," International Journal of Energy Economics and Policy, Econjournals, vol. 7(3), pages 157-166.
    14. Huang, Robert & Kahn, Matthew E., 2024. "An economic analysis of United States public transit carbon emissions dynamics," Regional Science and Urban Economics, Elsevier, vol. 107(C).
    15. Brad R. Humphreys & Geoffrey Propheter, "undated". "NFL Franchise Departures and Nearby Home Prices," Working Papers 24-05, Department of Economics, West Virginia University.
    16. Agrawal, David R. & Trandel, Gregory A., 2019. "Dynamics of policy adoption with state dependence," Regional Science and Urban Economics, Elsevier, vol. 79(C).
    17. Lee, Sungwon & Lee, Bumsoo, 2014. "The influence of urban form on GHG emissions in the U.S. household sector," Energy Policy, Elsevier, vol. 68(C), pages 534-549.
    18. Danny McGowan & Chrysovalantis Vasilakis, 2015. "Reap What You Sow: Agricultural Productivity, Structural Change and Urbanization," LIDAM Discussion Papers IRES 2015019, Université catholique de Louvain, Institut de Recherches Economiques et Sociales (IRES).
    19. Zhou, You & Zhang, Lingzhu & JF Chiaradia, Alain, 2022. "Estimating wider economic impacts of transport infrastructure Investment: Evidence from accessibility disparity in Hong Kong," Transportation Research Part A: Policy and Practice, Elsevier, vol. 162(C), pages 220-235.
    20. Donn L Feir & Rob Gillezeau & Maggie E C Jones, 2024. "The Slaughter of the Bison and Reversal of Fortunes on the Great Plains," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(3), pages 1634-1670.

    More about this item

    JEL classification:

    • C80 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - General
    • N72 - Economic History - - Economic History: Transport, International and Domestic Trade, Energy, and Other Services - - - U.S.; Canada: 1913-
    • N32 - Economic History - - Labor and Consumers, Demography, Education, Health, Welfare, Income, Wealth, Religion, and Philanthropy - - - U.S.; Canada: 1913-
    • R40 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - Transportation Economics - - - General

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.