Printed from https://ideas.repec.org/p/nbr/nberwo/31844.html

Do Two Wrongs Make a Right? Measuring the Effect of Publications on Science Careers

Author

Listed:
  • Donna K. Ginther
  • Carlos Zambrana
  • Patricia Oslund
  • Wan-Ying Chang

Abstract

This paper examines whether publication data matched to the Survey of Doctorate Recipients can be used for research purposes. We use Gold Standard data created to validate the publication match quality and compare these measures to publications assigned by a machine-learning algorithm developed by Thomson Reuters (now Clarivate). Our econometric model demonstrates that publications likely suffer from non-classical measurement error. Using horse race and instrumental variable models, we confirm that the Gold Standard data are relatively free from measurement error but show that the Clarivate data suffer from non-classical measurement error. We employ a variety of methods to adjust the Clarivate data for false negatives and false positives and demonstrate that with these adjustments the data produce estimates very similar to the Gold Standard. However, these adjustments are not as useful when publications are used as a dependent variable. We recommend using subsamples of the data that have better match quality when using the Clarivate data as a dependent variable.
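The distinction between classical and non-classical measurement error can be illustrated with a small simulation. The sketch below is not the authors' model or data. It assumes a hypothetical binary "has a publication" indicator, hypothetical false-negative and false-positive rates, and a linear outcome with a true effect of 1.0. All function names and parameter values are illustrative assumptions. It shows two things: the error in a misclassified binary regressor is negatively correlated with the true value (so it cannot be classical), and the regression slope on the mismeasured indicator is pulled toward zero.

```python
import random
from statistics import mean


def simulate(n=20000, beta=1.0, p_true=0.4, false_neg=0.3, false_pos=0.05, seed=0):
    """Draw a true binary regressor, a misclassified copy, and an outcome.

    All rates are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    x_true, x_obs, y = [], [], []
    for _ in range(n):
        x = 1 if rng.random() < p_true else 0
        if x == 1:
            # False negative: a real publication that the matcher misses.
            m = 0 if rng.random() < false_neg else 1
        else:
            # False positive: a publication wrongly assigned to the person.
            m = 1 if rng.random() < false_pos else 0
        x_true.append(x)
        x_obs.append(m)
        y.append(beta * x + rng.gauss(0.0, 1.0))
    return x_true, x_obs, y


def binary_slope(x, y):
    """OLS slope of y on a binary x (with intercept): difference in group means."""
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    return mean(y1) - mean(y0)


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return mean([(ai - ma) * (bi - mb) for ai, bi in zip(a, b)])


x_true, x_obs, y = simulate()
error = [m - x for m, x in zip(x_obs, x_true)]
print("slope on true indicator:    ", round(binary_slope(x_true, y), 3))
print("slope on observed indicator:", round(binary_slope(x_obs, y), 3))
print("cov(error, true indicator): ", round(covariance(error, x_true), 3))
```

Under these assumed rates, the slope on the observed indicator should come out clearly below the slope on the true indicator, and the covariance between the error and the true indicator should be negative. A true value of 1 can only be misrecorded downward and a true value of 0 only upward, which is why the error cannot be independent of the truth.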

Suggested Citation

  • Donna K. Ginther & Carlos Zambrana & Patricia Oslund & Wan-Ying Chang, 2023. "Do Two Wrongs Make a Right? Measuring the Effect of Publications on Science Careers," NBER Working Papers 31844, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:31844
    Note: LS

    Download full text from publisher

    File URL: http://www.nber.org/papers/w31844.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nguimkeu, Pierre & Denteh, Augustine & Tchernis, Rusty, 2019. "On the estimation of treatment effects with endogenous misreporting," Journal of Econometrics, Elsevier, vol. 208(2), pages 487-506.
    2. Adele Bergin, 2015. "Employer Changes and Wage Changes: Estimation with Measurement Error in a Binary Variable," LABOUR, CEIS, vol. 29(2), pages 194-223, June.
    3. Brent Kreider & Steven C. Hill, 2009. "Partially Identifying Treatment Effects with an Application to Covering the Uninsured," Journal of Human Resources, University of Wisconsin Press, vol. 44(2).
    4. Santiago Acerenza & Kyunghoon Ban & Désiré Kédagni, 2021. "Local Average and Marginal Treatment Effects with a Misclassified Treatment," Papers 2105.00358, arXiv.org, revised Sep 2024.
    5. DiTraglia, Francis J. & García-Jimeno, Camilo, 2019. "Identifying the effect of a mis-classified, binary, endogenous regressor," Journal of Econometrics, Elsevier, vol. 209(2), pages 376-390.
    6. Kyung Min Kang & Robert A. Moffitt, 2019. "The Effect of SNAP and School Food Programs on Food Security, Diet Quality, and Food Spending: Sensitivity to Program Reporting Error," Southern Economic Journal, John Wiley & Sons, vol. 86(1), pages 156-201, July.
    7. Christopher R. Bollinger, 2001. "Response Error and the Union Wage Differential," Southern Economic Journal, John Wiley & Sons, vol. 68(1), pages 60-76, July.
    8. Craig Gundersen & Brent Kreider, 2008. "Food Stamps and Food Insecurity: What Can Be Learned in the Presence of Nonclassical Measurement Error?," Journal of Human Resources, University of Wisconsin Press, vol. 43(2), pages 352-382.
    9. Kreider, Brent & Pepper, John V., 2007. "Disability and Employment: Reevaluating the Evidence in Light of Reporting Errors," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 432-441, June.
    10. Zhang, Han, 2021. "How Using Machine Learning Classification as a Variable in Regression Leads to Attenuation Bias and What to Do About It," SocArXiv 453jk, Center for Open Science.
    11. Violeta Misheva & Dinand Webbink & Nicholas G. Martin, 2017. "The effect of child maltreatment on illegal and problematic behaviour: new evidence on the ‘cycle of violence’ using twins data," Journal of Population Economics, Springer;European Society for Population Economics, vol. 30(4), pages 1035-1067, October.
    12. Frazis, Harley & Loewenstein, Mark A., 2003. "Estimating linear regressions with mismeasured, possibly endogenous, binary explanatory variables," Journal of Econometrics, Elsevier, vol. 117(1), pages 151-178, November.
    13. Wossen, Tesfamicheal & Abay, Kibrom A. & Abdoulaye, Tahirou, 2022. "Misperceiving and misreporting input quality: Implications for input use and productivity," Journal of Development Economics, Elsevier, vol. 157(C).
    14. Erich Battistin & Barbara Sianesi, 2006. "Misreported schooling and returns to education: evidence from the UK," CeMMAP working papers CWP07/06, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    15. Brent Kreider & Richard J. Manski & John Moeller & John Pepper, 2015. "The Effect of Dental Insurance on the Use of Dental Care for Older Adults: A Partial Identification Analysis," Health Economics, John Wiley & Sons, Ltd., vol. 24(7), pages 840-858, July.
    16. Hu, Yingyao, 2008. "Identification and estimation of nonlinear models with misclassification error using instrumental variables: A general solution," Journal of Econometrics, Elsevier, vol. 144(1), pages 27-61, May.
    17. Francis DiTraglia & Camilo Garcia-Jimeno, 2015. "On Mis-measured Binary Regressors: New Results And Some Comments on the Literature," PIER Working Paper Archive 15-037, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania, revised 02 Nov 2015.
    18. Maury Gittleman & Morris M. Kleiner, 2016. "Wage Effects of Unionization and Occupational Licensing Coverage in the United States," ILR Review, Cornell University, ILR School, vol. 69(1), pages 142-172, January.
    19. Winter, Joachim, 0000. "Bracketing effects in categorized survey questions and the measurement of economic quantities," Sonderforschungsbereich 504 Publications 02-35, Sonderforschungsbereich 504, Universität Mannheim;Sonderforschungsbereich 504, University of Mannheim.
    20. Arthur Lewbel, 2007. "Estimation of Average Treatment Effects with Misclassification," Econometrica, Econometric Society, vol. 75(2), pages 537-551, March.

    More about this item

    JEL classification:

    • C26 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Instrumental Variables (IV) Estimation
    • J40 - Labor and Demographic Economics - - Particular Labor Markets - - - General
    • O30 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - General

