
Comparing penalized splines and fractional polynomials for flexible modelling of the effects of continuous predictor variables

  • Strasak, Alexander M.
  • Umlauf, Nikolaus
  • Pfeiffer, Ruth M.
  • Lang, Stefan

    P(enalized)-splines and fractional polynomials (FPs) have emerged as powerful smoothing techniques with increasing popularity in applied research. Both approaches provide considerable flexibility, but only limited comparative evaluations of the performance and properties of the two methods have been conducted to date. Extensive simulations are performed to compare FPs of degree 2 (FP2) and degree 4 (FP4) with two variants of P-splines that use generalized cross validation (GCV) and restricted maximum likelihood (REML), respectively, for smoothing parameter selection. The ability of P-splines and FPs to recover the "true" functional form of the association between continuous, binary and survival outcomes and exposure is evaluated for linear, quadratic and more complex, non-linear functions, using different sample sizes and signal-to-noise ratios. For more curved functions, FP2, the current default setting in implementations for fitting FPs in R, STATA and SAS, showed considerable bias and consistently higher mean squared error (MSE) than spline-based estimators and FP4, which performed equally well in most simulation settings. FPs, however, are prone to artefacts arising from the specific choice of the origin, while P-splines based on GCV sometimes yield wiggly estimates, particularly for small sample sizes. Application to a real dataset illustrates the different features of the two approaches.
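    To make the FP2 terminology concrete, the following is a minimal sketch of a degree-2 fractional polynomial fit for a continuous outcome. It is not the authors' code and not the R, STATA or SAS implementations mentioned above. It assumes the conventional power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3} (with power 0 read as log x), strictly positive x, and selection of the power pair by ordinary least squares; the names FP_POWERS, fp_term, fp2_design and fit_fp2 are hypothetical.

```python
import itertools
import numpy as np

# Assumed conventional FP power set; power 0 denotes log(x).
FP_POWERS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0)


def fp_term(x, p):
    """Single FP basis term: x**p, with p == 0 meaning log(x). Requires x > 0."""
    return np.log(x) if p == 0 else x ** p


def fp2_design(x, p1, p2):
    """Design matrix [1, term1, term2]; a repeated power uses term1 * log(x)."""
    t1 = fp_term(x, p1)
    t2 = fp_term(x, p2) if p1 != p2 else t1 * np.log(x)
    return np.column_stack([np.ones_like(x), t1, t2])


def fit_fp2(x, y):
    """Return (rss, (p1, p2), beta) for the pair with the smallest residual sum of squares."""
    best = None
    for p1, p2 in itertools.combinations_with_replacement(FP_POWERS, 2):
        X = fp2_design(x, p1, p2)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if best is None or rss < best[0]:
            best = (rss, (p1, p2), beta)
    return best


# Illustrative, noise-free data generated from a known FP2 curve.
x = np.linspace(0.5, 5.0, 200)
y = 1.0 + 2.0 * x ** -1 + 3.0 * x ** 2
rss, powers, beta = fit_fp2(x, y)
print(powers, rss)
```

    Because every basis term is a power or logarithm of x itself, the fitted curve changes if x is shifted before fitting (for example, x + 1 instead of x), which is the origin dependence noted in the abstract.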


    Article provided by Elsevier in its journal Computational Statistics & Data Analysis.

    Volume (Year): 55 (2011)
    Issue (Month): 4 (April)
    Pages: 1540-1551


    Handle: RePEc:eee:csdana:v:55:y:2011:i:4:p:1540-1551

