Semiparametric regression in Stata
Semiparametric regression introduces very general nonlinear functional forms into regression analysis. This class of models is generally used to fit a parametric model in which the functional form of a subset of the explanatory variables is unknown and/or in which the distribution of the error term cannot be assumed beforehand to be of a specific type. To fix ideas, consider the partial linear model y = zb + f(x) + e, in which the shape of the potentially nonlinear function of the predictor x is of particular interest. Two approaches to modeling f(x) are splines and fractional polynomials. This talk reviews other, more general approaches and the commands available in Stata to fit such models. The main topic of the talk will be partial linear regression models, with some brief discussion of so-called single-index and generalized additive models. Although several semiparametric regression methods have been proposed and developed in the literature, these are probably the most popular ones.
The general idea of partial linear regression models is that a dependent variable is regressed on i) a set of explanatory variables entering the model linearly and ii) a set of variables entering the model nonlinearly but without any assumed functional form. Several estimators have been proposed in the literature and are available in Stata. For example, the semipar command implements the double residual estimator introduced by Robinson (1988), which is consistent and efficient. Similarly, the plreg command fits an alternative difference-based estimator proposed by Yatchew (1998) that has statistical properties similar to those of Robinson's estimator. These estimators will be briefly compared to identify some drawbacks and pitfalls of both methods.
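To make the two estimators concrete, the following is a minimal sketch in Python (not Stata, and not the code behind semipar or plreg) applied to simulated data from y = zb + f(x) + e with a single linear regressor z. It rests on several assumptions that are not taken from the talk: a Gaussian-kernel Nadaraya-Watson smoother with a hand-picked bandwidth h for the double residual step, plain first-order differencing after sorting on x for the difference-based step, and hypothetical function names throughout.

```python
import math
import random


def kernel_smooth(x, v, h):
    """Nadaraya-Watson estimate of E[v | x] at each sample point (Gaussian kernel)."""
    fitted = []
    for xi in x:
        weights = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        fitted.append(sum(w * vj for w, vj in zip(weights, v)) / sum(weights))
    return fitted


def slope_no_constant(u, w):
    """OLS slope from regressing u on w without an intercept."""
    return sum(a * b for a, b in zip(u, w)) / sum(b * b for b in w)


def double_residual(y, z, x, h):
    """Double residual idea: purge x from y and z nonparametrically,
    then regress the y-residuals on the z-residuals."""
    y_res = [a - m for a, m in zip(y, kernel_smooth(x, y, h))]
    z_res = [a - m for a, m in zip(z, kernel_smooth(x, z, h))]
    return slope_no_constant(y_res, z_res)


def first_difference(y, z, x):
    """Difference-based idea: sort on x so that f(x) nearly cancels
    between neighbours, then regress differenced y on differenced z."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    dy = [y[j] - y[i] for i, j in zip(order, order[1:])]
    dz = [z[j] - z[i] for i, j in zip(order, order[1:])]
    return slope_no_constant(dy, dz)


# Simulated data (an assumption for illustration): b = 2, f(x) = sin(2*pi*x),
# and z correlated with x so that ignoring f(x) would matter.
rng = random.Random(0)
n = 400
true_b = 2.0
x = [rng.uniform(0.0, 1.0) for _ in range(n)]
z = [xi + rng.gauss(0.0, 1.0) for xi in x]
y = [true_b * zi + math.sin(2.0 * math.pi * xi) + rng.gauss(0.0, 0.3)
     for zi, xi in zip(z, x)]

b_robinson = double_residual(y, z, x, h=0.05)
b_yatchew = first_difference(y, z, x)
print("double residual estimate of b:", round(b_robinson, 3))
print("first-difference estimate of b:", round(b_yatchew, 3))
```

Both functions recover only the linear coefficient b; an estimate of f(x) could then be obtained by smoothing y - zb on x. Higher-order differencing with chosen weights, standard errors, and bandwidth selection are deliberately left out of this sketch.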
A natural concern of researchers is how these estimators could be modified to deal with heteroskedasticity, serial correlation, and endogeneity in cross-sectional data or how they could be adapted in the context of panel data to control for unobserved heterogeneity. As a consequence, a substantial part of the talk will be devoted to explaining i) how the plreg and semipar commands can be used to tackle these very common violations of the Gauss–Markov assumptions in cross-sectional data and ii) how the user-written xtsemipar command makes a semiparametric regression easy to fit in the context of panel data. Because it is sometimes possible to move toward purely parametric models, the talk will also describe a test proposed by Härdle and Mammen (1993), which checks whether the nonparametric fit can be satisfactorily approximated by a parametric polynomial adjustment of order p.
References listed on IDEAS
- Vincenzo Verardi & Nicolas Debarsy, 2012. "Robinson's square root of N consistent semiparametric regression estimator in Stata," Stata Journal, StataCorp LP, vol. 12(4), pages 726-735, December.
- Vincenzo Verardi & Nicolas Debarsy, 2011. "Robinson's Squareroot-of-n-consistent Semiparametric Regression Estimator in Stata," Working Papers 1110, University of Namur, Department of Economics.
- Bruffaerts, Christopher & Verardi, Vincenzo & Vermandele, Catherine, 2014. "A generalized boxplot for skewed and heavy-tailed distributions," Statistics & Probability Letters, Elsevier, vol. 95(C), pages 110-117.
- Vincenzo Verardi, 2013. "Semiparametric regression in Stata," United Kingdom Stata Users' Group Meetings 2013 14, Stata Users Group.
- Vincenzo Verardi, 2014. "Semiparametric regression in Stata," United Kingdom Stata Users' Group Meetings 2014 09, Stata Users Group.
- Adonis Yatchew, 1998. "Nonparametric Regression Techniques in Economics," Journal of Economic Literature, American Economic Association, vol. 36(2), pages 669-721, June.
- Yatchew, Adonis, 2003. "Semiparametric Regression for the Applied Econometrician," Cambridge Books, Cambridge University Press, number 9780521012263, December.
- Yatchew, Adonis, 2003. "Semiparametric Regression for the Applied Econometrician," Cambridge Books, Cambridge University Press, number 9780521812832, December.
- Hubert, M. & Vandervieren, E., 2008. "An adjusted boxplot for skewed distributions," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5186-5201, August.