IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v61y2013icp158-173.html
   My bibliography  Save this article

Estimation of a regression spline sample selection model

Author

Listed:
  • Marra, Giampiero
  • Radice, Rosalba

Abstract

It is often the case that an outcome of interest is observed for a restricted non-randomly selected sample of the population. In such a situation, standard statistical analysis yields biased results. This issue can be addressed using sample selection models which are based on the estimation of two regressions: a binary selection equation determining whether a particular statistical unit will be available in the outcome equation. Classic sample selection models assume a priori that continuous regressors have a pre-specified linear or non-linear relationship to the outcome, which can lead to erroneous conclusions. In the case of continuous response, methods in which covariate effects are modeled flexibly have been previously proposed, the most recent being based on a Bayesian Markov chain Monte Carlo approach. A frequentist counterpart which has the advantage of being computationally fast is introduced. The proposed algorithm is based on the penalized likelihood estimation framework. The construction of confidence intervals is also discussed. The empirical properties of the existing and proposed methods are studied through a simulation study. The approaches are finally illustrated by analyzing data from the RAND Health Insurance Experiment on annual health expenditures.

Suggested Citation

  • Marra, Giampiero & Radice, Rosalba, 2013. "Estimation of a regression spline sample selection model," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 158-173.
  • Handle: RePEc:eee:csdana:v:61:y:2013:i:c:p:158-173
    DOI: 10.1016/j.csda.2012.12.010
    as

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947312004446
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2012.12.010?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Omori, Yasuhiro & Miyawaki, Koji, 2010. "Tobit model with covariate dependent thresholds," Computational Statistics & Data Analysis, Elsevier, vol. 54(11), pages 2736-2752, November.
    2. Newey, Whitney K & Powell, James L & Walker, James R, 1990. "Semiparametric Estimation of Selection Models: Some Empirical Results," American Economic Review, American Economic Association, vol. 80(2), pages 324-328, May.
    3. Murray D. Smith, 2003. "Modelling sample selection using Archimedean copulas," Econometrics Journal, Royal Economic Society, vol. 6(1), pages 99-123, June.
    4. Montmarquette, Claude & Mahseredjian, Sophie & Houle, Rachel, 2001. "The determinants of university dropouts: a bivariate probability model with sample selection," Economics of Education Review, Elsevier, vol. 20(5), pages 475-484, October.
    5. Terza, Joseph V., 1998. "Estimating count data models with endogenous switching: Sample selection and endogenous treatment effects," Journal of Econometrics, Elsevier, vol. 84(1), pages 129-154, May.
    6. Toomet, Ott & Henningsen, Arne, 2008. "Sample Selection Models in R: Package sampleSelection," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 27(i07).
    7. Francis Vella, 1998. "Estimating Models with Sample Selection Bias: A Survey," Journal of Human Resources, University of Wisconsin Press, vol. 33(1), pages 127-169.
    8. Siu Fai Leung & Shihti Yu, 2000. "Collinearity and Two-Step Estimation of Sample Selection Models: Problems, Origins, and Remedies," Computational Economics, Springer;Society for Computational Economics, vol. 15(3), pages 173-199, June.
    9. Mitali Das & Whitney K. Newey & Francis Vella, 2003. "Nonparametric Estimation of Sample Selection Models," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 70(1), pages 33-58.
    10. Lee, Lung-Fei, 1984. "Tests for the Bivariate Normal Distribution in Econometric Models with Selectivity," Econometrica, Econometric Society, vol. 52(4), pages 843-863, July.
    11. Simon N. Wood, 2004. "Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 673-686, January.
    12. Manuel Wiesenfarth & Thomas Kneib, 2010. "Bayesian geoadditive sample selection models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 59(3), pages 381-404, May.
    13. Lee, Lung-fei, 1994. "Semiparametric two-stage estimation of sample selection models subject to Tobit-type selection rules," Journal of Econometrics, Elsevier, vol. 61(2), pages 305-344, April.
    14. Maria Fraga O. Martins, 2001. "Parametric and semiparametric estimation of sample selection models: an empirical application to the female labour force in Portugal," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 16(1), pages 23-39.
    15. Patrick Puhani, 2000. "The Heckman Correction for Sample Selection and Its Critique," Journal of Economic Surveys, Wiley Blackwell, vol. 14(1), pages 53-68, February.
    16. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521785167.
    17. Li, Phillip, 2011. "Estimation of sample selection models with two selection mechanisms," Computational Statistics & Data Analysis, Elsevier, vol. 55(2), pages 1099-1108, February.
    18. Ahn, Hyungtaik & Powell, James L., 1993. "Semiparametric estimation of censored selection models with a nonparametric selection mechanism," Journal of Econometrics, Elsevier, vol. 58(1-2), pages 3-29, July.
    19. Mealli, Fabrizia & Pacini, Barbara, 2008. "Comparing principal stratification and selection models in parametric causal inference with nonignorable missingness," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 507-516, December.
    20. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    21. Sigelman, Lee & Zeng, Langche, 1999. "Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models," Political Analysis, Cambridge University Press, vol. 8(2), pages 167-182, December.
    22. Boyes, William J. & Hoffman, Dennis L. & Low, Stuart A., 1989. "An econometric analysis of the bank credit scoring problem," Journal of Econometrics, Elsevier, vol. 40(1), pages 3-14, January.
    23. van Hasselt, Martijn, 2011. "Bayesian inference in a sample selection model," Journal of Econometrics, Elsevier, vol. 165(2), pages 221-232.
    24. Yee, Thomas W., 2010. "The VGAM Package for Categorical Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i10).
    25. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521780506.
    26. Philip T. Reiss & R. Todd Ogden, 2009. "Smoothing parameter selection for a class of semiparametric linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 505-523, April.
    27. Giampiero Marra & Simon N. Wood, 2012. "Coverage Properties of Confidence Intervals for Generalized Additive Model Components," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 39(1), pages 53-74, March.
    28. Inyoung Kim & Noah D. Cohen & Raymond J. Carroll, 2003. "Semiparametric Regression Splines in Matched Case-Control Studies," Biometrics, The International Biometric Society, vol. 59(4), pages 1158-1169, December.
    29. Yulia V. Marchenko & Marc G. Genton, 2012. "A Heckman Selection- t Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(497), pages 304-317, March.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Hajime Seya & Junyi Zhang & Makoto Chikaraishi & Ying Jiang, 2020. "Decisions on truck parking place and time on expressways: an analysis using digital tachograph data," Transportation, Springer, vol. 47(2), pages 555-583, April.
    2. Marra, Giampiero & Wyszynski, Karol, 2016. "Semi-parametric copula sample selection models for count responses," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 110-129.
    3. Karol Wyszynski & Giampiero Marra, 2018. "Sample selection models for count data in R," Computational Statistics, Springer, vol. 33(3), pages 1385-1412, September.
    4. Wiemann, Paul F.V. & Klein, Nadja & Kneib, Thomas, 2022. "Correcting for sample selection bias in Bayesian distributional regression models," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    5. Pierfrancesco Alaimo Di Loro & Daria Scacciatelli & Giovanna Tagliaferri, 2023. "2-step Gradient Boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(1), pages 237-270, March.
    6. Sengupta, Reshmi & Rooj, Debasis, 2019. "The effect of health insurance on hospitalization: Identification of adverse selection, moral hazard and the vulnerable population in the Indian healthcare market," World Development, Elsevier, vol. 122(C), pages 110-129.
    7. Mikhail Zhelonkin & Marc G. Genton & Elvezio Ronchetti, 2016. "Robust inference in sample selection models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(4), pages 805-827, September.
    8. Hiroaki Kaido & Kaspar Wüthrich, 2021. "Decentralization estimators for instrumental variable quantile regression models," Quantitative Economics, Econometric Society, vol. 12(2), pages 443-475, May.
    9. Wojtyś, Małgorzata & Marra, Giampiero & Radice, Rosalba, 2018. "Copula based generalized additive models for location, scale and shape with non-random sample selection," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 1-14.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marra, Giampiero & Wyszynski, Karol, 2016. "Semi-parametric copula sample selection models for count responses," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 110-129.
    2. Karol Wyszynski & Giampiero Marra, 2018. "Sample selection models for count data in R," Computational Statistics, Springer, vol. 33(3), pages 1385-1412, September.
    3. Mikhail Zhelonkin & Marc G. Genton & Elvezio Ronchetti, 2016. "Robust inference in sample selection models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(4), pages 805-827, September.
    4. Wojtyś, Małgorzata & Marra, Giampiero & Radice, Rosalba, 2018. "Copula based generalized additive models for location, scale and shape with non-random sample selection," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 1-14.
    5. Giampiero Marra & Rosalba Radice & Till Bärnighausen & Simon N. Wood & Mark E. McGovern, 2017. "A Simultaneous Equation Approach to Estimating HIV Prevalence With Nonignorable Missing Responses," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 484-496, April.
    6. Wiemann, Paul F.V. & Klein, Nadja & Kneib, Thomas, 2022. "Correcting for sample selection bias in Bayesian distributional regression models," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    7. Wojtyś, Magorzata & Marra, Giampiero & Radice, Rosalba, 2016. "Copula Regression Spline Sample Selection Models: The R Package SemiParSampleSel," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 71(i06).
    8. Emmanuel O. Ogundimu & Jane L. Hutton, 2016. "A Sample Selection Model with Skew-normal Distribution," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 172-190, March.
    9. Claudia PIGINI, 2012. "Of Butterflies and Caterpillars: Bivariate Normality in the Sample Selection Model," Working Papers 377, Universita' Politecnica delle Marche (I), Dipartimento di Scienze Economiche e Sociali.
    10. Victor Chernozhukov & Ivan Fernandez-Val & Siyi Luo, 2018. "Distribution regression with sample selection, with an application to wage decompositions in the UK," CeMMAP working papers CWP68/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Øystein Sørensen & Anders M. Fjell & Kristine B. Walhovd, 2023. "Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 456-486, June.
    12. Hasebe, Takuya & Vijverberg, Wim P., 2012. "A Flexible Sample Selection Model: A GTL-Copula Approach," IZA Discussion Papers 7003, Institute of Labor Economics (IZA).
    13. Ding, Peng, 2014. "Bayesian robust inference of sample selection using selection-t models," Journal of Multivariate Analysis, Elsevier, vol. 124(C), pages 451-464.
    14. Liu, Ruixuan & Yu, Zhengfei, 2022. "Sample selection models with monotone control functions," Journal of Econometrics, Elsevier, vol. 226(2), pages 321-342.
    15. Manuel Arellano & Stéphane Bonhomme, 2017. "Quantile Selection Models With an Application to Understanding Changes in Wage Inequality," Econometrica, Econometric Society, vol. 85, pages 1-28, January.
    16. Guilhem Bascle, 2008. "Controlling for endogeneity with instrumental variables in strategic management research," Post-Print hal-00576795, HAL.
    17. McGovern, Mark E. & Canning, David & Bärnighausen, Till, 2018. "Accounting for non-response bias using participation incentives and survey design: An application using gift vouchers," Economics Letters, Elsevier, vol. 171(C), pages 239-244.
    18. Ibáñez, Ana María & Muñoz, Juan Carlos & Verwimp, Philip, 2013. "Abandoning Coffee under the Threat of Violence and the Presence of Illicit Crops. Evidence from Colombia," Documentos CEDE Series 161356, Universidad de Los Andes, Economics Department.
    19. Wahba, Jackline & Schluter, Christian, 2009. "Illegal migration, wages and remittances- semi-parametric estimation of illegality effects," Discussion Paper Series In Economics And Econometrics 913, Economics Division, School of Social Sciences, University of Southampton.
    20. Lewbel, Arthur, 2007. "Endogenous selection or treatment model estimation," Journal of Econometrics, Elsevier, vol. 141(2), pages 777-806, December.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:csdana:v:61:y:2013:i:c:p:158-173. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/csda .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.