
Intergenerational Linkages between Historical IRS 1040 Data and the Numident: 1964-1979 Cohorts

Author

Listed:
  • Martha Stinson
  • Laura Weiwu

Abstract

Measures of intergenerational mobility summarize the persistence of income across generations and are important indicators of the long-run dynamics of income inequality. However, studying intergenerational mobility can be challenging because it requires data on both parental income during childhood and children’s income in adulthood. The first step in this calculation is a linkage between parents and children. In this report, we document the process we undertook to construct a novel linkage between children born between 1964 and 1979 and tax filers in IRS Form 1040 data who are candidate parents for these children. These linkages are used in Weiwu (2023) and Stinson, Wang, and Weiwu (2023) to create some of the first large-scale measures of intergenerational mobility for children born in the mid-20th century. We construct our linkages using the universe of parent tax filers in the 1974 and 1979 IRS Form 1040 data and the universe of children from the Census Numident in the 1964-1979 birth cohorts. This linkage is necessary because historical 1040 data files before 1994 report the presence of dependents but do not provide a list of children’s identifiers. To create these links, we first restrict the sample to children who were age 10 or under in the tax year. Next, we create blocks of parents and children who are potential matches based on agreement between the birth state of the child and the tax filing state of the parents. Finally, we use name-matching techniques that incorporate supervised learning methods to compare parent names attached to children’s Numident records to Numident names of adult 1040 tax filers within state blocks. These methods allow us to flexibly compare names with slightly different spellings and to choose among potential parent-to-parent matches based on match probability. To make the matching feasible for a large set of comparisons, we employ parallel cloud computing on Amazon Web Services through a Census Bureau pilot program.
This report documents the algorithm that is used for assessing the likelihood of a match and the iterative process that identifies a final best match for as many children as possible. We provide match rates for different demographic groups and investigate the representativeness of the matched sample of children.
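
The three steps described in the abstract (age restriction, state blocking, and name comparison with a best-match choice) can be illustrated with a minimal sketch. This is not the authors' algorithm: the report uses supervised learning to estimate match probabilities, whereas the sketch below substitutes a plain string-similarity ratio from Python's standard library as a stand-in score. The record layouts, the names `Child`, `Filer`, `name_score`, and `link_children`, the 0.85 threshold, and the sample records are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Child:
    child_id: str
    birth_year: int
    birth_state: str
    parent_name: str  # parent name attached to the child's record (hypothetical field)


@dataclass(frozen=True)
class Filer:
    filer_id: str
    filing_state: str
    name: str


def eligible(child, tax_year, max_age=10):
    """Keep children who were already born and age 10 or under in the tax year."""
    age = tax_year - child.birth_year
    return 0 <= age <= max_age


def name_score(a, b):
    """Stand-in similarity in [0, 1]; the report uses a supervised model instead."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_children(children, filers, tax_year, threshold=0.85):
    """Return {child_id: filer_id} for the best-scoring filer within each state block."""
    # Blocking: only compare a child with filers whose filing state
    # equals the child's birth state.
    blocks = defaultdict(list)
    for filer in filers:
        blocks[filer.filing_state].append(filer)

    links = {}
    for child in children:
        if not eligible(child, tax_year):
            continue
        candidates = blocks.get(child.birth_state, [])
        if not candidates:
            continue
        # Choose the candidate with the highest score, and accept it
        # only if the score clears the (assumed) threshold.
        best = max(candidates, key=lambda f: name_score(child.parent_name, f.name))
        if name_score(child.parent_name, best.name) >= threshold:
            links[child.child_id] = best.filer_id
    return links


# Made-up records for illustration only.
children = [
    Child("c1", 1970, "OH", "Katherine Smith"),  # age 9 in 1979, spelling differs
    Child("c2", 1960, "OH", "John Doe"),         # age 19 in 1979, excluded
    Child("c3", 1975, "TX", "Maria Lopez"),      # no filer in the TX block
]
filers = [
    Filer("f1", "OH", "Catherine Smith"),
    Filer("f2", "OH", "John Doe"),
    Filer("f3", "CA", "Maria Lopez"),
]

links = link_children(children, filers, tax_year=1979)
print(links)  # {'c1': 'f1'}
```

Blocking by state keeps the number of pairwise name comparisons manageable, which is the same motivation the abstract gives for restricting comparisons to state blocks before scoring names.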

Suggested Citation

  • Martha Stinson & Laura Weiwu, 2023. "Intergenerational Linkages between Historical IRS 1040 Data and the Numident: 1964-1979 Cohorts," CES Technical Notes Series 23-19, Center for Economic Studies, U.S. Census Bureau.
  • Handle: RePEc:cen:tnotes:23-19

    Download full text from publisher

    File URL: https://www2.census.gov/ces/tn/CES-TN-2023-19.pdf
    File Function: Abstract
    Download Restriction: CES Technical Notes may contain confidential data and, thereby, disclosure is prohibited. Researchers on approved projects (to apply for access, please see https://www.census.gov/ces/rdcresearch/howtoapply.html) with the correct permissions can request full text notes from CES.Technical.Notes.List@census.gov.

    File URL: https://www.census.gov/about/adrm/ced/apply-for-access.html?CES-TN-2023-19
    File Function: Confidential main document
    Download Restriction: Researchers need to have obtained appropriate permissions.


    More about this item

    Keywords

    Numident; IRS-1040;



    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Danielle H. Sandler. General contact details of provider: https://edirc.repec.org/data/cesgvus.html
