Printed from https://ideas.repec.org/a/bla/scjsta/v42y2015i1p155-179.html

Enriching Surveys with Supplementary Data and its Application to Studying Wage Regression

Author

Listed:
  • Denis Heng Yan Leung
  • Ken Yamada
  • Biao Zhang

Abstract

We consider the problem of supplementing survey data with additional information from a population. The framework we use is very general; examples are missing data problems, measurement error models and combining data from multiple surveys. We do not require the survey data to be a simple random sample of the population of interest. The key assumption we make is that there exists a set of common variables between the survey and the supplementary data. Thus, the supplementary data serve the dual role of providing adjustments to the survey data for model consistencies and also enriching the survey data for improved efficiency. We propose a semi-parametric approach using empirical likelihood to combine data from the two sources. The method possesses favourable large and moderate sample properties. We use the method to investigate wage regression using data from the National Longitudinal Survey of Youth Study.

Suggested Citation

  • Denis Heng Yan Leung & Ken Yamada & Biao Zhang, 2015. "Enriching Surveys with Supplementary Data and its Application to Studying Wage Regression," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(1), pages 155-179, March.
  • Handle: RePEc:bla:scjsta:v:42:y:2015:i:1:p:155-179

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1111/sjos.12100
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. Takis Merkouris, 2004. "Combining Independent Regression Estimators From Multiple Surveys," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 1131-1139, December.
    2. Imbens, Guido W, 1992. "An Efficient Method of Moments Estimator for Discrete Choice Models with Choice-Based Sampling," Econometrica, Econometric Society, vol. 60(5), pages 1187-1214, September.
    3. Hansen, Lars Peter, 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, Econometric Society, vol. 50(4), pages 1029-1054, July.
    4. Tarozzi, Alessandro, 2007. "Calculating Comparable Statistics From Incomparable Surveys, With an Application to Poverty in India," Journal of Business & Economic Statistics, American Statistical Association, vol. 25, pages 314-336, July.
    5. J. Chen, 2002. "Using empirical likelihood methods to obtain range restricted weights in regression estimators for surveys," Biometrika, Biometrika Trust, vol. 89(1), pages 230-237, March.
    6. Guido W. Imbens & Tony Lancaster, 1994. "Combining Micro and Macro Data in Microeconometric Models," Review of Economic Studies, Oxford University Press, vol. 61(4), pages 655-680.
    7. Judith K. Hellerstein & Guido W. Imbens, 1999. "Imposing Moment Restrictions From Auxiliary Data By Weighting," The Review of Economics and Statistics, MIT Press, vol. 81(1), pages 1-14, February.
    8. Nevo, Aviv, 2003. "Using Weights to Adjust for Sample Selection When Auxiliary Information Is Available," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 43-52, January.
    9. Wooldridge, Jeffrey M., 2007. "Inverse probability weighted estimation for general missing data problems," Journal of Econometrics, Elsevier, vol. 141(2), pages 1281-1301, December.
    10. Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2003. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometrica, Econometric Society, vol. 71(4), pages 1161-1189, July.
    11. Qihua Wang & J. N. K. Rao, 2002. "Empirical Likelihood‐based Inference in Linear Models with Missing Data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 29(3), pages 563-576, September.
    12. Chris Skinner & Nigel Stuttard & Gabriele Beissel-Durrant & James Jenkins, 2002. "The Measurement of Low Pay in the UK Labour Force Survey," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 64(s1), pages 653-676, August.
    13. Bryan S. Graham & Cristine Campos De Xavier Pinto & Daniel Egel, 2012. "Inverse Probability Tilting for Moment Condition Models with Missing Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 79(3), pages 1053-1079.
    14. Wang, C. Y. & Wang, Suojin & Carroll, R. J., 1997. "Estimation in choice-based sampling with measurement error and bootstrap analysis," Journal of Econometrics, Elsevier, vol. 77(1), pages 65-86, March.
    15. Yi‐Hau Chen & Hung Chen, 2000. "A unified approach to regression analysis under double‐sampling designs," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(3), pages 449-460.
    16. Manuel Arellano & Costas Meghir, 1992. "Female Labour Supply and On-the-Job Search: An Empirical Model Estimated Using Complementary Data Sets," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 59(3), pages 537-559.
    17. Qihua Wang, 2002. "Empirical likelihood-based inference in linear errors-in-covariables models with validation data," Biometrika, Biometrika Trust, vol. 89(2), pages 345-358, June.
    18. Newey, Whitney K, 1990. "Semiparametric Efficiency Bounds," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 5(2), pages 99-135, April-Jun.
    19. Abowd J.M. & Crepon B. & Kramarz F., 2001. "Moment Estimation With Attrition: An Application to Economic Models," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1223-1231, December.
    20. Chen S.X. & Leung D.H.Y. & Qin J., 2003. "Information Recovery in a Study With Surrogate Endpoints," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 1052-1062, January.
    21. Song Xi Chen & Hengjian Cui, 2006. "On Bartlett correction of empirical likelihood in the presence of nuisance parameters," Biometrika, Biometrika Trust, vol. 93(1), pages 215-220, March.
    22. Lusardi, Annamaria, 1996. "Permanent Income, Current Income, and Consumption: Evidence from Two Panel Data Sets," Journal of Business & Economic Statistics, American Statistical Association, vol. 14(1), pages 81-90, January.
    23. Xiaohong Chen & Han Hong & Elie Tamer, 2005. "Measurement Error Models with Auxiliary Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 72(2), pages 343-366.
    24. Song Xi Chen & Denis H. Y. Leung & Jing Qin, 2008. "Improving semiparametric estimation by using surrogate data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(4), pages 803-823, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bryan S. Graham & Cristine Campos De Xavier Pinto & Daniel Egel, 2012. "Inverse Probability Tilting for Moment Condition Models with Missing Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 79(3), pages 1053-1079.
    2. Prokhorov, Artem & Schmidt, Peter, 2009. "GMM redundancy results for general missing data problems," Journal of Econometrics, Elsevier, vol. 151(1), pages 47-55, July.
    3. Buchinsky, Moshe & Li, Fanghua & Liao, Zhipeng, 2022. "Estimation and inference of semiparametric models using data from several sources," Journal of Econometrics, Elsevier, vol. 226(1), pages 80-103.
    4. Esmeralda A. Ramalho & Richard J. Smith, 2013. "Discrete Choice Non-Response," Review of Economic Studies, Oxford University Press, vol. 80(1), pages 343-364.
    5. Devereux, Paul J. & Tripathi, Gautam, 2009. "Optimally combining censored and uncensored datasets," Journal of Econometrics, Elsevier, vol. 151(1), pages 17-32, July.
    6. Biao Zhang, 2016. "Empirical Likelihood in Causal Inference," Econometric Reviews, Taylor & Francis Journals, vol. 35(2), pages 201-231, February.
    7. Lancaster, Tony & Imbens, Guido, 1996. "Case-control studies with contaminated controls," Journal of Econometrics, Elsevier, vol. 71(1-2), pages 145-160.
    8. Firpo, Sergio Pinheiro & Pinto, Rafael de Carvalho Cayres, 2012. "Combining Strategies for the Estimation of Treatment Effects," Brazilian Review of Econometrics, Sociedade Brasileira de Econometria - SBE, vol. 32(1), March.
    9. Guell, Maia & Hu, Luojia, 2006. "Estimating the probability of leaving unemployment using uncompleted spells from repeated cross-section data," Journal of Econometrics, Elsevier, vol. 133(1), pages 307-341, July.
    10. Graham, Bryan S. & Pinto, Cristine Campos de Xavier, 2022. "Semiparametrically efficient estimation of the average linear regression function," Journal of Econometrics, Elsevier, vol. 226(1), pages 115-138.
    11. d'Haultfoeuille, Xavier, 2010. "A new instrumental method for dealing with endogenous selection," Journal of Econometrics, Elsevier, vol. 154(1), pages 1-15, January.
    12. Joachim Inkmann, 2010. "Estimating Firm Size Elasticities of Product and Process R&D," Economica, London School of Economics and Political Science, vol. 77(306), pages 384-402, April.
    13. Inkmann, J., 2005. "Inverse Probability Weighted Generalised Empirical Likelihood Estimators : Firm Size and R&D Revisited," Other publications TiSEM c39cff1f-16c1-4446-a83f-c, Tilburg University, School of Economics and Management.
    14. Nail Kashaev, 2022. "Estimation of Parametric Binary Outcome Models with Degenerate Pure Choice-Based Data with Application to COVID-19-Positive Tests from British Columbia," University of Western Ontario, Departmental Research Report Series 20225, University of Western Ontario, Department of Economics.
    15. Bryan S. Graham & Cristine Campos de Xavier Pinto & Daniel Egel, 2016. "Efficient Estimation of Data Combination Models by the Method of Auxiliary-to-Study Tilting (AST)," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(2), pages 288-301, April.
    16. Chen, Xiaohong, 2007. "Large Sample Sieve Estimation of Semi-Nonparametric Models," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 6, chapter 76, Elsevier.
    17. Cattaneo, Matias D., 2010. "Efficient semiparametric estimation of multi-valued treatment effects under ignorability," Journal of Econometrics, Elsevier, vol. 155(2), pages 138-154, April.
    18. Nevo, Aviv, 2003. "Using Weights to Adjust for Sample Selection When Auxiliary Information Is Available," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 43-52, January.
    19. Hirukawa, Masayuki & Prokhorov, Artem, 2018. "Consistent estimation of linear regression models using matched data," Journal of Econometrics, Elsevier, vol. 203(2), pages 344-358.
    20. Song Xi Chen & Denis H. Y. Leung & Jing Qin, 2008. "Improving semiparametric estimation by using surrogate data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(4), pages 803-823, September.
