Printed from https://ideas.repec.org/a/eee/exehis/v98y2025ics0014498325000646.html

Breakthroughs in historical record linking using genealogy data: The Census Tree project

Author

Listed:
  • Buckles, Kasey
  • Haws, Adrian
  • Price, Joseph
  • Wilbert, Haley E.B.

Abstract

The Census Tree is the largest-ever database of record links among the historical U.S. censuses, with over 700 million links for people living in the United States between 1850 and 1940. To create the Census Tree, we begin with a collection of high-quality links contributed by the users of a free online genealogy platform, many of which would be difficult or impossible to find using currently available linking technologies. We then use these links as training data for a machine learning algorithm to make new matches, and incorporate other recent efforts to link the historical U.S. censuses. Finally, we introduce a procedure for filtering the links and adjudicating disagreements. Our complete Census Tree achieves match rates across adjacent censuses that are between 69 and 86 % for men and between 58 and 79 % for women—a major breakthrough compared to previous linking efforts. The size of the Census Tree allows researchers in the social sciences and other disciplines to construct longitudinal datasets that are highly representative of the population. We validate the accuracy of these links and provide researchers with a simple tool for choosing their preferred tradeoff between sample size and accuracy. To demonstrate the advantages of the Census Tree, we extend the work of Abramitzky, Boustan, Jácome, and Pérez (2021) to include intergenerational mobility estimates for additional immigrant nationalities and for women.
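The filtering and adjudication step described in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the majority-vote rule, the `min_votes` threshold, the source labels, and the record identifiers below are all assumptions made for illustration. The sketch keeps a proposed link only when one candidate is the unique top choice across sources, and shows how a stricter threshold trades sample size for agreement.

```python
from collections import Counter, defaultdict


def adjudicate(links, min_votes=1):
    """Resolve disagreements among proposed links (hypothetical rule).

    links: iterable of (source, id_a, id_b) triples, each proposing that
    record id_a in one census is the same person as record id_b in another.
    Returns {id_a: id_b}, keeping for each id_a the candidate proposed by
    the most sources, but only if it is a unique winner with at least
    min_votes supporting sources. Ties are dropped.
    Note: this sketch does not enforce that each id_b is used only once.
    """
    votes = defaultdict(Counter)
    for _source, id_a, id_b in set(links):  # set() drops duplicate proposals
        votes[id_a][id_b] += 1

    kept = {}
    for id_a, counter in votes.items():
        ranked = counter.most_common(2)
        best, n_best = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == n_best
        if not tied and n_best >= min_votes:
            kept[id_a] = best
    return kept


def match_rate(kept, n_records):
    """Share of the n_records starting records that received a link."""
    return len(kept) / n_records


# Made-up proposals from three hypothetical sources.
links = [
    ("users", "a1", "b1"),
    ("ml", "a1", "b1"),
    ("other", "a1", "b9"),   # outvoted by the two sources above
    ("ml", "a2", "b2"),
    ("users", "a3", "b3"),
    ("ml", "a3", "b4"),      # one vote each for b3 and b4: tie, dropped
]

lenient = adjudicate(links)               # any unique winner is kept
strict = adjudicate(links, min_votes=2)   # require two agreeing sources

lenient_rate = match_rate(lenient, n_records=4)
strict_rate = match_rate(strict, n_records=4)
```

With these made-up inputs the lenient setting links two of four records and the strict setting links one, which is the kind of sample-size versus accuracy choice the abstract refers to.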

Suggested Citation

  • Buckles, Kasey & Haws, Adrian & Price, Joseph & Wilbert, Haley E.B., 2025. "Breakthroughs in historical record linking using genealogy data: The Census Tree project," Explorations in Economic History, Elsevier, vol. 98(C).
  • Handle: RePEc:eee:exehis:v:98:y:2025:i:c:s0014498325000646
    DOI: 10.1016/j.eeh.2025.101717

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0014498325000646
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Kosack, Edward & Ward, Zachary, 2020. "El Sueño Americano? The Generational Progress of Mexican Americans Prior to World War II," The Journal of Economic History, Cambridge University Press, vol. 80(4), pages 961-995, December.
    2. repec:ehl:lserod:114608 is not listed on IDEAS
    3. Clark, Gregory & Cummins, Neil, 2022. "Assortative mating and the Industrial Revolution: England, 1754-2021," Economic History Working Papers 114608, London School of Economics and Political Science, Department of Economic History.
    4. repec:ehl:lserod:115008 is not listed on IDEAS
    5. Bhashkar Mazumder, 2005. "Fortunate Sons: New Estimates of Intergenerational Mobility in the United States Using Social Security Earnings Data," The Review of Economics and Statistics, MIT Press, vol. 87(2), pages 235-255, May.
    6. Abhay Aneja & Silvia Farina & Guo Xu, 2024. "Beyond the War: Public Service and the Transmission of Gender Norms," NBER Working Papers 32639, National Bureau of Economic Research, Inc.
    7. Samuel Preston & Irma Elo & Andrew Foster & Haishan Fu, 1998. "Reconstructing the size of the African American population by age and sex, 1930–1990," Demography, Springer;Population Association of America (PAA), vol. 35(1), pages 1-21, February.
    8. William J. Collins & Marianne H. Wanamaker, 2022. "African American Intergenerational Economic Mobility since 1880," American Economic Journal: Applied Economics, American Economic Association, vol. 14(3), pages 84-117, July.
    9. Price, Joseph & Buckles, Kasey & Van Leeuwen, Jacob & Riley, Isaac, 2021. "Combining family history and machine learning to link historical records: The Census Tree data set," Explorations in Economic History, Elsevier, vol. 80(C).
    10. Raj Chetty & Nathaniel Hendren & Lawrence F. Katz, 2016. "The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment," American Economic Review, American Economic Association, vol. 106(4), pages 855-902, April.
    11. Claudia Olivetti & M. Daniele Paserman, 2015. "In the Name of the Son (and the Daughter): Intergenerational Mobility in the United States, 1850-1940," American Economic Review, American Economic Association, vol. 105(8), pages 2695-2724, August.
    12. Bazzi, Samuel & Brodeur, Abel & Fiszbein, Martin & Haddad, Joanne, 2023. "Frontier History and Gender Norms in the United States," CEPR Discussion Papers 18069, Centre for Economic Policy Research.
    13. Jonas Helgertz & Joseph Price & Jacob Wellington & Kelly J Thompson & Steven Ruggles & Catherine A. Fitch, 2022. "A new strategy for linking U.S. historical censuses: A case study for the IPUMS multigenerational longitudinal panel," Historical Methods: A Journal of Quantitative and Interdisciplinary History, Taylor & Francis Journals, vol. 55(1), pages 12-29, January.
    14. Ira Rosenwaike, 1979. "A new evaluation of United States census data on the extreme aged," Demography, Springer;Population Association of America (PAA), vol. 16(2), pages 279-288, May.
    15. Ager, Philipp & Malein, Viktor, 2024. "The Long-term Effects of Charity Nurseries: Evidence from Early 20th Century New York," CEPR Discussion Papers 19317, Centre for Economic Policy Research.
    16. Ran Abramitzky & Leah Boustan & Katherine Eriksson & James Feigenbaum & Santiago Pérez, 2021. "Automated Linking of Historical Data," Journal of Economic Literature, American Economic Association, vol. 59(3), pages 865-918, September.
    17. Berkes, Enrico & Karger, Ezra & Nencka, Peter, 2023. "The census place project: A method for geolocating unstructured place names," Explorations in Economic History, Elsevier, vol. 87(C).
    18. Raj Chetty & Nathaniel Hendren & Patrick Kline & Emmanuel Saez, 2014. "Where is the land of Opportunity? The Geography of Intergenerational Mobility in the United States," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 129(4), pages 1553-1623.
    19. Martha Bailey & Connor Cole & Catherine Massey, 2020. "Simple strategies for improving inference with linked data: a case study of the 1850–1930 IPUMS linked representative historical samples," Historical Methods: A Journal of Quantitative and Interdisciplinary History, Taylor & Francis Journals, vol. 53(2), pages 80-93, April.
    20. James J. Feigenbaum, 2018. "Multiple Measures of Historical Intergenerational Mobility: Iowa 1915 to 1940," Economic Journal, Royal Economic Society, vol. 128(612), pages 446-481, July.
    21. Henrik Kleven & Camille Landais & Jakob Egholt Søgaard, 2019. "Children and Gender Inequality: Evidence from Denmark," American Economic Journal: Applied Economics, American Economic Association, vol. 11(4), pages 181-209, October.
    22. Zachary Ward & Kasey Buckles & Joseph Price, 2025. "Like Great-Grandparent, Like Great-Grandchild? Multigenerational Mobility in American History," NBER Working Papers 33923, National Bureau of Economic Research, Inc.
    23. Zachary Ward, 2023. "Intergenerational Mobility in American History: Accounting for Race and Measurement Error," American Economic Review, American Economic Association, vol. 113(12), pages 3213-3248, December.
    24. Ran Abramitzky & Leah Boustan & Elisa Jacome & Santiago Perez, 2021. "Intergenerational Mobility of Immigrants in the United States over Two Centuries," American Economic Review, American Economic Association, vol. 111(2), pages 580-608, February.
    25. Martha J. Bailey & Connor Cole & Morgan Henderson & Catherine Massey, 2020. "How Well Do Automated Linking Methods Perform? Lessons from US Historical Data," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 997-1044, December.

    Citations

    Cited by:

    1. is not listed on IDEAS
    2. Jennifer R. Withrow & Kendall A. Houghton & Eva Lyubich & Mary Munro & Suvy Qin & John L. Voorheis, 2024. "The Census Historical Environmental Impacts Frame," Working Papers 24-66, Center for Economic Studies, U.S. Census Bureau.
    3. Berger, Thor & Karadja, Mounir & Prawitz, Erik, 2024. "Cities and the Rise of Working Women," CEPR Discussion Papers 18927, Centre for Economic Policy Research.
    4. Ariadna Jou & Tommy Morgan, 2025. "Do Relief Programs Compensate For Longevity Losses From Recessions? Evidence From The Great Depression And The New Deal," Working Papers wp562, University of Chile, Department of Economics.
    5. repec:osf:socarx:nt6kg_v1 is not listed on IDEAS
    6. Anna Aizer & Gabrielle Grafton & Santiago Pérez, 2025. "Daughters as Safety Net? Family Responses to Parental Employment Shocks: Evidence from Alcohol Prohibition," NBER Working Papers 33346, National Bureau of Economic Research, Inc.
    7. repec:ces:ceswps:_11729 is not listed on IDEAS
    8. Andrea Del Pizzo & Martin Nybom & Jan Stuhler, 2026. "Indirect Estimators of Intergenerational Mobility," RFBerlin Discussion Paper Series 26137, ROCKWOOL Foundation Berlin (RFBerlin).
    9. Vidart, Daniela, 2026. "Revisiting the link between electrification and fertility: Evidence from the early 20th-century United States," Explorations in Economic History, Elsevier, vol. 100(C).
    10. Ager, Philipp & Malein, Viktor, 2024. "The Long-term Effects of Charity Nurseries: Evidence from Early 20th Century New York," CEPR Discussion Papers 19317, Centre for Economic Policy Research.
    11. Hutchings, Jacob & Lleras-Muney, Adriana & Nicholls, Joshua & Price, Joseph & Wilson, Sven E, 2025. "Long-run patterns in the spousal correlation of lifespan," The Journal of the Economics of Ageing, Elsevier, vol. 32(C).
    12. Eric S. M. Protzer & Sultan Orazbayev & Andres Gomez-Lievano & Matte Hartog & Frank Neffke, 2024. "A New Algorithm to Efficiently Match U.S. Census Records and Balance Representativity with Match Quality," Growth Lab Working Papers 238, Harvard's Growth Lab.
    13. Philipp Ager & Casper W. Hansen & Peter Z. Lin, 2026. "Medical Technology and Life Expectancy: Evidence from the Antitoxin Treatment of Diphtheria," American Economic Journal: Economic Policy, American Economic Association, vol. 18(2), pages 441-476, May.
    14. Samuel Bazzi & Abel Brodeur & Martin Fiszbein & Joanne Haddad, 2023. "Frontier History and Gender Norms in the United States," NBER Working Papers 31079, National Bureau of Economic Research, Inc.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andrea Del Pizzo & Martin Nybom & Jan Stuhler, 2026. "Indirect Estimators of Intergenerational Mobility," RFBerlin Discussion Paper Series 26137, ROCKWOOL Foundation Berlin (RFBerlin).
    2. Hwang, Sam Il Myoung & Squires, Munir, 2024. "Linked samples and measurement error in historical US census data," Explorations in Economic History, Elsevier, vol. 93(C).
    3. Martha J. Bailey & Peter Z. Lin, 2024. "Marital Matching and Women’s Intergenerational Mobility in the Late 19th and Early 20th Century US," NBER Chapters, in: The Economic History of American Inequality: New Evidence and Perspectives, pages 165-196, National Bureau of Economic Research, Inc.
    4. Ran Abramitzky & Leah Platt Boustan & Elisa Jácome & Santiago Pérez, 2019. "Intergenerational Mobility of Immigrants over Two Centuries," Working Papers 2019-6, Princeton University. Economics Department.
    5. Zachary Ward, 2023. "Intergenerational Mobility in American History: Accounting for Race and Measurement Error," American Economic Review, American Economic Association, vol. 113(12), pages 3213-3248, December.
    6. Feigenbaum, James J & Helgertz, Jonas & Price, Joseph, 2025. "Examining the role of training data for supervised methods of automated record linkage: Lessons for best practice in economic history," Explorations in Economic History, Elsevier, vol. 96(C).
    7. Martha Bailey & Paul Mohnen & A.R. Shariq Mohammed, 2026. "The Evolution of U.S. Educational Mobility over the 20th Century and the Role of Public Education," FRB Atlanta Working Paper 2026-1, Federal Reserve Bank of Atlanta.
    8. Diego Battiston & Stephan Maurer & Andrei Potlogea & Jose V. Rodriguez Mora, 2025. "The Short and Long Run Dynamics of the Great Gatsby Curve," Edinburgh School of Economics Discussion Paper Series 324, Edinburgh School of Economics, University of Edinburgh.
    9. Elisa Jácome & Ilyana Kuziemko & Suresh Naidu, 2021. "Mobility for All: Representative Intergenerational Mobility Estimates over the 20th Century," Working Papers 302, Princeton University, Department of Economics, Center for Economic Policy Studies.
    10. Juliana Jaramillo-Echeverri, 2024. "Movilidad social en la educación: el caso de la Universidad de los Andes en Colombia entre 1949 y 2018," Cuadernos de Historia Económica 61, Banco de la Republica de Colombia.
    11. Ran Abramitzky & Leah Platt Boustan & Elisa Jácome & Santiago Pérez, 2019. "Intergenerational Mobility of Immigrants in the US over Two Centuries," NBER Working Papers 26408, National Bureau of Economic Research, Inc.
    12. Eric S. M. Protzer & Sultan Orazbayev & Andres Gomez-Lievano & Matte Hartog & Frank Neffke, 2024. "A New Algorithm to Efficiently Match U.S. Census Records and Balance Representativity with Match Quality," Growth Lab Working Papers 238, Harvard's Growth Lab.
    13. Wolfgang Keller & Carol H. Shiue, 2023. "Intergenerational Mobility of Daughters and Marital Sorting: New Evidence from Imperial China," NBER Working Papers 31695, National Bureau of Economic Research, Inc.
    14. Escamilla-Guerrero, David & Kosack, Edward & Ward, Zachary, 2021. "Life after crossing the border: Assimilation during the first Mexican mass migration," Explorations in Economic History, Elsevier, vol. 82(C).
    15. Dylan Shane Connor & Michael Storper, 2020. "The changing geography of social mobility in the United States," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 117(48), pages 30309-30317, December.
    16. Valerie Michelman & Joseph Price & Seth D Zimmerman, 2022. "Old Boys’ Clubs and Upward Mobility Among the Educational Elite [Do Immigrants Assimilate More Slowly Today Than in the Past?]," The Quarterly Journal of Economics, Oxford University Press, vol. 137(2), pages 845-909.
    17. Julián Costas-Fernández & José-Alberto Guerra & Myra Mohnen, 2020. "Train to Opportunity: the Effect of Infrastructure on Intergenerational Mobility," Documentos CEDE 18591, Universidad de los Andes, Facultad de Economía, CEDE.
    18. Feng, Qundi & He, Qinying, 2022. "Does parental migration increase upward intergenerational mobility? Evidence from rural China," Economic Modelling, Elsevier, vol. 115(C).
    19. Ariadna Jou & Tommy Morgan, 2025. "Do Relief Programs Compensate For Longevity Losses From Recessions? Evidence From The Great Depression And The New Deal," Working Papers wp562, University of Chile, Department of Economics.
    20. Pablo Celhay & Sebastian Gallegos, 2025. "Schooling mobility across three generations in six Latin American countries," Journal of Population Economics, Springer;European Society for Population Economics, vol. 38(1), pages 1-35, March.

    More about this item

    JEL classification:

    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • J10 - Labor and Demographic Economics - - Demographic Economics - - - General
    • N01 - Economic History - - General - - - Development of the Discipline: Historiographical; Sources and Methods
