IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v61y2013icp158-173.html

Estimation of a regression spline sample selection model

Author

Listed:
  • Marra, Giampiero
  • Radice, Rosalba

Abstract

An outcome of interest is often observed only for a restricted, non-randomly selected sample of the population. In such a situation, standard statistical analysis yields biased results. This issue can be addressed using sample selection models, which are based on the estimation of two regressions: a binary selection equation determining whether a particular statistical unit will be available in the outcome equation, and the outcome equation itself. Classic sample selection models assume a priori that continuous regressors have a pre-specified linear or non-linear relationship to the outcome, which can lead to erroneous conclusions. For continuous responses, methods that model covariate effects flexibly have previously been proposed, the most recent being based on a Bayesian Markov chain Monte Carlo approach. A frequentist counterpart, which has the advantage of being computationally fast, is introduced. The proposed algorithm is based on the penalized likelihood estimation framework. The construction of confidence intervals is also discussed. The empirical properties of the existing and proposed methods are studied through a simulation study. Finally, the approaches are illustrated by analyzing data on annual health expenditures from the RAND Health Insurance Experiment.
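
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the penalized regression spline estimator proposed in the article. It is a minimal illustration, under assumed values, of why an outcome regression fitted only to selected units is biased and how a classic Heckman-style correction term removes that bias. The coefficients, the error correlation `rho`, the sample size, and the helper names (`inverse_mills`, `ols`, `simulate`) are all hypothetical. For simplicity the true selection index is treated as known, whereas in practice it would be estimated from the binary selection equation.

```python
import math
import random


def inverse_mills(t):
    """Inverse Mills ratio phi(t) / Phi(t) of the standard normal."""
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return pdf / cdf


def ols(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=20000, rho=0.7, seed=0):
    """Return (naive, corrected) coefficient lists for the selected sample.

    Assumed data-generating process (illustrative values only):
      selection: unit observed if 0.5 + x + z + u > 0
      outcome:   y = 1 + 2 * x + e, with corr(u, e) = rho
    """
    rng = random.Random(seed)
    naive_rows, corrected_rows, y_obs = [], [], []
    for _ in range(n):
        x, z = rng.gauss(0, 1), rng.gauss(0, 1)
        u = rng.gauss(0, 1)  # selection-equation error
        e = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)  # outcome error
        index = 0.5 + x + z
        if index + u > 0:  # the outcome is observed only for selected units
            y_obs.append(1.0 + 2.0 * x + e)
            naive_rows.append([1.0, x])
            # E[e | selected] = rho * inverse_mills(index), so adding this
            # regressor absorbs the selection effect.
            corrected_rows.append([1.0, x, inverse_mills(index)])
    return ols(naive_rows, y_obs), ols(corrected_rows, y_obs)


if __name__ == "__main__":
    naive, corrected = simulate()
    print("naive slope on x:    ", round(naive[1], 3))
    print("corrected slope on x:", round(corrected[1], 3))
```

Under these assumptions the naive slope on `x` should fall below the true value of 2, while the corrected slope should lie close to it. The correction still imposes a linear effect of `x`, which is the kind of pre-specified functional form the abstract describes as a limitation of classic sample selection models.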

Suggested Citation

  • Marra, Giampiero & Radice, Rosalba, 2013. "Estimation of a regression spline sample selection model," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 158-173.
  • Handle: RePEc:eee:csdana:v:61:y:2013:i:c:p:158-173
    DOI: 10.1016/j.csda.2012.12.010

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947312004446
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Omori, Yasuhiro & Miyawaki, Koji, 2010. "Tobit model with covariate dependent thresholds," Computational Statistics & Data Analysis, Elsevier, vol. 54(11), pages 2736-2752, November.
    2. Newey, Whitney K & Powell, James L & Walker, James R, 1990. "Semiparametric Estimation of Selection Models: Some Empirical Results," American Economic Review, American Economic Association, vol. 80(2), pages 324-328, May.
    3. Murray D. Smith, 2003. "Modelling sample selection using Archimedean copulas," Econometrics Journal, Royal Economic Society, vol. 6(1), pages 99-123, June.
    4. Montmarquette, Claude & Mahseredjian, Sophie & Houle, Rachel, 2001. "The determinants of university dropouts: a bivariate probability model with sample selection," Economics of Education Review, Elsevier, vol. 20(5), pages 475-484, October.
    5. Terza, Joseph V., 1998. "Estimating count data models with endogenous switching: Sample selection and endogenous treatment effects," Journal of Econometrics, Elsevier, vol. 84(1), pages 129-154, May.
    6. Toomet, Ott & Henningsen, Arne, 2008. "Sample Selection Models in R: Package sampleSelection," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 27(i07).
    7. Francis Vella, 1998. "Estimating Models with Sample Selection Bias: A Survey," Journal of Human Resources, University of Wisconsin Press, vol. 33(1), pages 127-169.
    8. Siu Fai Leung & Shihti Yu, 2000. "Collinearity and Two-Step Estimation of Sample Selection Models: Problems, Origins, and Remedies," Computational Economics, Springer;Society for Computational Economics, vol. 15(3), pages 173-199, June.
    9. Mitali Das & Whitney K. Newey & Francis Vella, 2003. "Nonparametric Estimation of Sample Selection Models," Review of Economic Studies, Oxford University Press, vol. 70(1), pages 33-58.
    10. Simon N. Wood, 2004. "Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 673-686, January.
    11. Lee, Lung-Fei, 1984. "Tests for the Bivariate Normal Distribution in Econometric Models with Selectivity," Econometrica, Econometric Society, vol. 52(4), pages 843-863, July.
    12. Manuel Wiesenfarth & Thomas Kneib, 2010. "Bayesian geoadditive sample selection models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 59(3), pages 381-404, May.
    13. Lee, Lung-fei, 1994. "Semiparametric two-stage estimation of sample selection models subject to Tobit-type selection rules," Journal of Econometrics, Elsevier, vol. 61(2), pages 305-344, April.
    14. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521785167, July - De.
    15. Li, Phillip, 2011. "Estimation of sample selection models with two selection mechanisms," Computational Statistics & Data Analysis, Elsevier, vol. 55(2), pages 1099-1108, February.
    16. Ahn, H. & Powell, J.L., 1990. "Semiparametric Estimation Of Censored Selection Models With A Nonparametric Selection Mechanism," Working papers 90-33, Wisconsin Madison - Social Systems.
    17. Mealli, Fabrizia & Pacini, Barbara, 2008. "Comparing principal stratification and selection models in parametric causal inference with nonignorable missingness," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 507-516, December.
    18. Ahn, Hyungtaik & Powell, James L., 1993. "Semiparametric estimation of censored selection models with a nonparametric selection mechanism," Journal of Econometrics, Elsevier, vol. 58(1-2), pages 3-29, July.
    19. Puhani, Patrick A, 2000. "The Heckman Correction for Sample Selection and Its Critique," Journal of Economic Surveys, Wiley Blackwell, vol. 14(1), pages 53-68, February.
    20. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Publishing House "SINERGIA PRESS", vol. 31(3), pages 129-137.
    21. Sigelman, Lee & Zeng, Langche, 1999. "Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models," Political Analysis, Cambridge University Press, vol. 8(02), pages 167-182, December.
    22. Boyes, William J. & Hoffman, Dennis L. & Low, Stuart A., 1989. "An econometric analysis of the bank credit scoring problem," Journal of Econometrics, Elsevier, vol. 40(1), pages 3-14, January.
    23. van Hasselt, Martijn, 2011. "Bayesian inference in a sample selection model," Journal of Econometrics, Elsevier, vol. 165(2), pages 221-232.
    24. Yee, Thomas W., 2010. "The VGAM Package for Categorical Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i10).
    25. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521780506, July - De.
    26. Philip T. Reiss & R. Todd Ogden, 2009. "Smoothing parameter selection for a class of semiparametric linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 505-523, April.
    27. Giampiero Marra & Simon N. Wood, 2012. "Coverage Properties of Confidence Intervals for Generalized Additive Model Components," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 39(1), pages 53-74, March.
    28. Inyoung Kim & Noah D. Cohen & Raymond J. Carroll, 2003. "Semiparametric Regression Splines in Matched Case-Control Studies," Biometrics, The International Biometric Society, vol. 59(4), pages 1158-1169, December.
    29. Yulia V. Marchenko & Marc G. Genton, 2012. "A Heckman Selection-t Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(497), pages 304-317, March.

    Citations


    Cited by:

    1. Marra, Giampiero & Wyszynski, Karol, 2016. "Semi-parametric copula sample selection models for count responses," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 110-129.
    2. repec:spr:compst:v:33:y:2018:i:3:d:10.1007_s00180-017-0762-y is not listed on IDEAS
    3. Mikhail Zhelonkin & Marc G. Genton & Elvezio Ronchetti, 2016. "Robust inference in sample selection models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(4), pages 805-827, September.
    4. repec:eee:csdana:v:127:y:2018:i:c:p:1-14 is not listed on IDEAS
