
Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data

Author

Listed:
  • John M. Abowd
  • Joelle Abramowitz
  • Margaret C. Levenstein
  • Kristin McCue
  • Dhiren Patki
  • Trivellore Raghunathan
  • Ann M. Rodgers
  • Matthew D. Shapiro
  • Nada Wasi

Abstract

This paper illustrates an application of record linkage between a household-level survey and an establishment-level frame in the absence of unique identifiers. Linkage between frames in this setting is challenging because the distribution of employment across firms is highly asymmetric. To address these difficulties, this paper uses a supervised machine learning model to probabilistically link survey respondents in the Health and Retirement Study (HRS) with employers and establishments in the Census Business Register (BR) to create a new data source which we call the CenHRS. Multiple imputation is used to propagate uncertainty from the linkage step into subsequent analyses of the linked data. The linked data reveal new evidence that survey respondents’ misreporting and selective nonresponse about employer characteristics are systematically correlated with wages.
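The multiple-imputation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a model has already produced a match probability for each candidate employer, draws one link per respondent in each imputation, and pools the resulting estimates with Rubin's combining rules. The function names `draw_links` and `rubin_combine`, the toy match probabilities, and the toy employer sizes are all hypothetical.

```python
import random
import statistics


def draw_links(match_probs, rng):
    """Draw one candidate employer per respondent, with probability
    proportional to its (hypothetical) model-predicted match probability."""
    links = {}
    for respondent, candidates in match_probs.items():
        employers = list(candidates)
        weights = [candidates[e] for e in employers]
        links[respondent] = rng.choices(employers, weights=weights, k=1)[0]
    return links


def rubin_combine(estimates, variances):
    """Pool M completed-data estimates with Rubin's rules.

    Returns the pooled point estimate and its total variance,
    T = W + (1 + 1/M) * B, where W is the mean within-imputation
    variance and B is the between-imputation variance."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)
    w_bar = statistics.fmean(variances)
    b = statistics.variance(estimates)
    return q_bar, w_bar + (1 + 1 / m) * b


# Hypothetical inputs: match probabilities per respondent and
# log employment of each candidate employer.
match_probs = {
    "r1": {"A": 0.9, "B": 0.1},
    "r2": {"B": 0.6, "C": 0.4},
    "r3": {"A": 0.2, "C": 0.8},
}
log_size = {"A": 2.0, "B": 5.0, "C": 8.0}

rng = random.Random(0)  # seeded so the sketch is reproducible
estimates, variances = [], []
for _ in range(20):  # 20 imputed link sets
    links = draw_links(match_probs, rng)
    values = [log_size[links[r]] for r in match_probs]
    estimates.append(statistics.fmean(values))
    variances.append(statistics.variance(values) / len(values))

pooled, total_var = rubin_combine(estimates, variances)
print(f"pooled mean log size: {pooled:.3f}, total variance: {total_var:.3f}")
```

The between-imputation term is what carries the linkage uncertainty: if every imputation drew the same links, it would be zero and the total variance would reduce to the usual within-sample variance.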

Suggested Citation

  • John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi, 2019. "Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data," Working Papers 19-08, Center for Economic Studies, U.S. Census Bureau.
  • Handle: RePEc:cen:wpaper:19-08

    Download full text from publisher

    File URL: https://www2.census.gov/ces/wp/2019/CES-WP-19-08.pdf
    File Function: First version, 2019
    Download Restriction: no

    Citations

    Cited by:

    1. Nada Wasi & Sasiwimon Warunsiri Paweenawat & Chinnawat Devahastin Na Ayudhya & Pucktada Treeratpituk & Chommanart Nittayo, 2019. "Labor Income Inequality in Thailand: the Roles of Education, Occupation and Employment History," PIER Discussion Papers 117, Puey Ungphakorn Institute for Economic Research.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi & Dawn Zinsser, 2021. "Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning," Working Papers 21-35, Center for Economic Studies, U.S. Census Bureau.
    2. Nicholas Bloom & Scott Ohlmacher & Cristina Tello-Trillo & Melanie Wallskog, 2021. "Pay, Productivity and Management," Working Papers 21-31, Center for Economic Studies, U.S. Census Bureau.
    3. Tania Babina & Wenting Ma & Christian Moser & Paige Ouimet & Rebecca Zarutskie, 2019. "Pay, Employment, and Dynamics of Young Firms," Working Papers 19-23, Center for Economic Studies, U.S. Census Bureau.
    4. Ouimet, Paige & Zarutskie, Rebecca, 2014. "Who works for startups? The relation between firm age, employee age, and growth," Journal of Financial Economics, Elsevier, vol. 112(3), pages 386-407.
    5. Hartmut Egger & Elke Jahn & Stefan Kornitzky, 2021. "How Does the Position in Business Group Hierarchies Affect Workers’ Wages?," Working Papers 213, Bavarian Graduate Program in Economics (BGPE).
    6. Henry Hyatt & Erika McEntarfer & John Haltiwanger, 2014. "Cyclical Reallocation of Workers Across Large and Small Employers," 2014 Meeting Papers 735, Society for Economic Dynamics.
    7. Emin Dinlersoz & Henry Hyatt & Hubert Janicki, 2019. "Who Works for Whom? Worker Sorting in a Model of Entrepreneurship with Heterogeneous Labor Markets," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 34, pages 244-266, October.
    8. Jahangir Alam M. & Dostie Benoit & Drechsler Jörg & Vilhuber Lars, 2020. "Applying data synthesis for longitudinal business data across three countries," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 212-236, August.
    9. C. J. Krizan & Adela Luque & Alice Zawacki, 2014. "The Effect Of Employer Health Insurance Offering On The Growth And Survival Of Small Business Prior To The Affordable Care Act," Working Papers 14-22, Center for Economic Studies, U.S. Census Bureau.
    10. Brianna Cardiff-Hicks & Francine Lafontaine & Kathryn Shaw, 2015. "Do Large Modern Retailers Pay Premium Wages?," ILR Review, Cornell University, ILR School, vol. 68(3), pages 633-665, May.
    11. Melanie Jones & Ezgi Kaya, 2023. "The UK gender pay gap: Does firm size matter?," Economica, London School of Economics and Political Science, vol. 90(359), pages 937-952, July.
    12. Jahn, Elke & Egger, Hartmut & Kornitzky, Stefan, 2021. "Does the Position in Business Group Hierarchies Affect Workers' Wages?," VfS Annual Conference 2021 (Virtual Conference): Climate Economics 242374, Verein für Socialpolitik / German Economic Association.
    13. Egger, Hartmut & Jahn, Elke & Kornitzky, Stefan, 2022. "How does the position in business group hierarchies affect workers’ wages?," Journal of Economic Behavior & Organization, Elsevier, vol. 194(C), pages 244-263.
    14. Jaime Arellano-Bover, 2024. "Career Consequences of Firm Heterogeneity for Young Workers: First Job and Firm Size," Journal of Labor Economics, University of Chicago Press, vol. 42(2), pages 549-589.
    15. Fariha Kamal & Asha Sundaram & Cristina J. Tello-Trillo, 2020. "Family-Leave Mandates and Female Labor at U.S. Firms: Evidence from a Trade Shock," Working Papers 20-25, Center for Economic Studies, U.S. Census Bureau.
    16. Paige Ouimet & Rebecca Zarutskie, 2011. "Acquiring Labor," Working Papers 11-32, Center for Economic Studies, U.S. Census Bureau.
    17. Paige Ouimet & Rebecca Zarutskie, 2011. "Who Works for Startups? The Relation between Firm Age, Employee Age, and Growth," Working Papers 11-31, Center for Economic Studies, U.S. Census Bureau.
    18. John M. Abowd & Ian M. Schmutte & Lars Vilhuber, 2018. "Disclosure Limitation and Confidentiality Protection in Linked Data," Working Papers 18-07, Center for Economic Studies, U.S. Census Bureau.
    19. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    20. Oxana Babecka Kucharcukova & Jan Bruha, 2016. "Nowcasting the Czech Trade Balance," Working Papers 2016/11, Czech National Bank.

    More about this item

    Keywords

    Probabilistic record linkage; survey data; administrative data; multiple imputation; measurement error; nonresponse