
Perks and pitfalls of city directories as a micro-geographic data source

Author

Listed:
  • Albers, Thilo N.H.
  • Kappner, Kalle

Abstract

Historical city directories are rich sources of micro-geographic data. They provide information on the location of households and firms and their occupations and industries, respectively. We develop a generic algorithmic work flow that converts scans of them into geo- and status-referenced household-level data sets. Applying the work flow to our case study, the Berlin 1880 directory, adds idiosyncratic challenges that should make automation less attractive. Yet, employing an administrative benchmark data set on household counts, incomes, and income distributions across more than 200 census tracts, we show that semi-automatic referencing yields results very similar to those from labour-intensive manual referencing. Finally, we discuss how to scale the work flow to other years and cities as well as potential applications in economic history and beyond.

Suggested Citation

  • Albers, Thilo N.H. & Kappner, Kalle, 2023. "Perks and pitfalls of city directories as a micro-geographic data source," Explorations in Economic History, Elsevier, vol. 87(C).
  • Handle: RePEc:eee:exehis:v:87:y:2023:i:c:s0014498322000547
    DOI: 10.1016/j.eeh.2022.101476

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0014498322000547
    Download Restriction: Full text for ScienceDirect subscribers only


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Albers, Thilo N. H. & Kappner, Kalle, 2022. "Perks and Pitfalls of City Directories as a Micro-Geographic Data Source," Rationality and Competition Discussion Paper Series 315, CRC TRR 190 Rationality and Competition.
    2. Hanlon, W.Walker & Heblich, Stephan, 2022. "History and urban economics," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    3. Ahlfeldt, Gabriel M. & Barr, Jason, 2022. "The economics of skyscrapers: A synthesis," Journal of Urban Economics, Elsevier, vol. 129(C).
    4. Lin, Jeffrey & Rauch, Ferdinand, 2022. "What future for history dependence in spatial economics?," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    5. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    6. Ferdinand Rauch & Guy Michaels, 2013. "Resetting the Urban Network: 117-2012," Economics Series Working Papers 684, University of Oxford, Department of Economics.
    7. Matthew Jaremski, 2020. "Today’s economic history and tomorrow’s scholars," Cliometrica, Springer;Cliometric Society (Association Francaise de Cliométrie), vol. 14(1), pages 169-180, January.
    8. Julián Costas-Fernández & José-Alberto Guerra & Myra Mohnen, 2020. "Train to Opportunity: the Effect of Infrastructure on Intergenerational Mobility," Documentos CEDE 18591, Universidad de los Andes, Facultad de Economía, CEDE.
    9. Stef Proost & Jacques-François Thisse, 2019. "What Can Be Learned from Spatial Economics?," Journal of Economic Literature, American Economic Association, vol. 57(3), pages 575-643, September.
    10. Jedwab, Remi & Johnson, Noel D. & Koyama, Mark, 2022. "Medieval cities through the lens of urban economics," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    11. Dahl, Christian M. & Johansen, Torben S.D. & Sørensen, Emil N. & Wittrock, Simon, 2023. "HANA: A handwritten name database for offline handwritten text recognition," Explorations in Economic History, Elsevier, vol. 87(C).
    12. Gerard H Dericks & Hans R A Koster, 2021. "The billion pound drop: the Blitz and agglomeration economies in London," Journal of Economic Geography, Oxford University Press, vol. 21(6), pages 869-897.
    13. Youssouf Merouani & Faustine Perrin, 2022. "Gender and the long-run development process. A survey of the literature," European Review of Economic History, European Historical Economics Society, vol. 26(4), pages 612-641.
    14. Nagy, Dávid Krisztián, 2022. "Quantitative economic geography meets history: Questions, answers and challenges," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    15. Ahlfeldt, Gabriel M. & Barr, Jason, 2022. "Viewing urban spatial history from tall buildings," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    16. Duranton, Gilles & Puga, Diego, 2015. "Urban Land Use," Handbook of Regional and Urban Economics, in: Gilles Duranton & J. V. Henderson & William C. Strange (ed.), Handbook of Regional and Urban Economics, edition 1, volume 5, chapter 0, pages 467-560, Elsevier.
    17. Fabian Wahl, 2017. "Does European development have Roman roots? Evidence from the German Limes," Journal of Economic Growth, Springer, vol. 22(3), pages 313-349, September.
    18. Siodla, James, 2021. "Firms, fires, and firebreaks: The impact of the 1906 San Francisco disaster on business agglomeration," Regional Science and Urban Economics, Elsevier, vol. 88(C).
    19. Kappner, Kalle, 2018. "Persistent shocks to urban density: Evidence from the Berlin air raids," Economics Letters, Elsevier, vol. 168(C), pages 37-41.
    20. Salazar Miranda, Arianna, 2022. "The micro persistence of layouts and design: Quasi-experimental evidence from the United States Housing Corporation," Regional Science and Urban Economics, Elsevier, vol. 95(C).

    More about this item

    Keywords

    city directories; data extraction; granular spatial data

    JEL classification:

    • C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
    • R1 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General Regional Economics
    • N9 - Economic History - - Regional and Urban History
