Comparing penalized splines and fractional polynomials for flexible modelling of the effects of continuous predictor variables
Penalized splines (P-splines) and fractional polynomials (FPs) have emerged as powerful smoothing techniques with increasing popularity in applied research. Both approaches provide considerable flexibility, but only limited comparative evaluations of their performance and properties have been conducted to date. Extensive simulations are performed to compare FPs of degree 2 (FP2) and degree 4 (FP4) with two variants of P-splines that use generalized cross validation (GCV) and restricted maximum likelihood (REML), respectively, for smoothing parameter selection. The simulations evaluate the ability of P-splines and FPs to recover the "true" functional form of the association between an exposure and continuous, binary and survival outcomes, for linear, quadratic and more complex non-linear functions, using different sample sizes and signal-to-noise ratios. For more curved functions, FP2, the current default setting in implementations for fitting FPs in R, STATA and SAS, showed considerable bias and consistently higher mean squared error (MSE) than the spline-based estimators and FP4, which performed equally well in most simulation settings. FPs, however, are prone to artefacts due to the specific choice of the origin, while P-splines based on GCV sometimes yield wiggly estimates, in particular for small sample sizes. An application to a real dataset illustrates the different features of the two approaches.
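To make the FP2 terminology concrete, the following is a minimal sketch of a degree-2 fractional polynomial fit for a continuous outcome. It assumes the conventional FP power set, ordinary least squares, and selection of the power pair by smallest residual sum of squares; it omits the closed test (function selection) procedure used by the standard implementations and is not the simulation code of the study. All function names, the simulated "true" function, and the sample size are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

# Conventional FP power set; by convention, power 0 denotes log(x).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Return x**p, with the convention that p == 0 means log(x)."""
    return np.log(x) if p == 0 else x ** p


def fp2_design(x, p1, p2):
    """Design matrix [1, first FP term, second FP term] for an FP2 model."""
    t1 = fp_term(x, p1)
    # A repeated power uses (first term) * log(x) as the second term.
    t2 = t1 * np.log(x) if p1 == p2 else fp_term(x, p2)
    return np.column_stack([np.ones_like(x), t1, t2])


def fit_fp2(x, y):
    """Search all power pairs and keep the one with the smallest RSS."""
    best = None
    for p1, p2 in itertools.combinations_with_replacement(POWERS, 2):
        X = fp2_design(x, p1, p2)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if best is None or rss < best[0]:
            best = (rss, (p1, p2), beta)
    return best


# Simulated data (assumption): the truth is itself an FP2 with powers (0, 1).
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 5.0, size=200)  # FPs require strictly positive x
truth = np.log(x) + 0.5 * x
y = truth + rng.normal(scale=0.3, size=x.size)

rss, powers, beta = fit_fp2(x, y)
fitted = fp2_design(x, *powers) @ beta

# MSE of the estimated curve against the known true function,
# the kind of criterion used to compare estimators in a simulation.
mse = float(np.mean((fitted - truth) ** 2))
print("selected powers:", powers, "MSE vs. truth:", mse)
```

Because every term is a power or logarithm of x, the fit depends on where the origin of x lies, which is the source of the origin-related artefacts noted above; shifting x by a constant before fitting generally changes the selected powers and the fitted curve.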