IDEAS home Printed from https://ideas.repec.org/a/gam/jstats/v4y2021i3p42-724d627431.html

Cross-Validation, Information Theory, or Maximum Likelihood? A Comparison of Tuning Methods for Penalized Splines

Author

Listed:
  • Lauren N. Berry

    (Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
    School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
    Current address: Department of Statistics, Grand Valley State University, Allendale, MI 49401, USA.)

  • Nathaniel E. Helwig

    (Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
    School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA)

Abstract

Functional data analysis techniques, such as penalized splines, have become common tools in applied research, where penalized spline estimators are frequently used to estimate unknown functions from noisy data. The success of these estimators depends on choosing a tuning parameter that provides the correct balance between fitting and smoothing the data. Several smoothing parameter selection methods have been proposed, and they generally fall into one of three categories: cross-validation methods, information theoretic methods, or maximum likelihood methods. Despite the well-known importance of selecting an ideal smoothing parameter, there is little agreement in the literature regarding which method(s) should be considered when analyzing real data. In this paper, we address this issue by exploring the practical performance of six popular tuning methods under a variety of simulated and real data situations. Our results reveal that maximum likelihood methods outperform the popular cross-validation methods in most situations, especially in the presence of correlated errors. Furthermore, the maximum likelihood methods perform well even when the errors are non-Gaussian and/or heteroscedastic. For real data applications, we recommend comparing results from cross-validation and maximum likelihood tuning methods, given that these methods tend to perform similarly when the model is correctly specified and differently when it is incorrectly specified.
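
As an illustration of what "choosing a tuning parameter" means in practice, the following is a minimal sketch, not the authors' code. It assumes a truncated-line basis with a ridge penalty on the knot coefficients, simulated sine data, and a log-spaced grid of candidate values; all function names are hypothetical. It scores each candidate with textbook forms of ordinary cross-validation (OCV), generalized cross-validation (GCV), and AIC. The abstract does not name the six methods compared, so these three criteria are chosen only as examples of the cross-validation and information theoretic families; maximum likelihood criteria are omitted because they require a mixed-model representation of the spline.

```python
import numpy as np


def truncated_line_basis(x, knots):
    """Design matrix [1, x, (x - k_1)_+, ..., (x - k_K)_+] (illustrative basis)."""
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)


def tuning_criteria(X, P, y, lam):
    """Score one smoothing parameter for the fit minimizing ||y - Xb||^2 + lam * b'Pb."""
    n = len(y)
    # Smoother ("hat") matrix S = X (X'X + lam P)^{-1} X', so fitted values are S y.
    S = X @ np.linalg.solve(X.T @ X + lam * P, X.T)
    resid = y - S @ y
    rss = float(resid @ resid)
    edf = float(np.trace(S))   # effective degrees of freedom
    lev = np.diag(S)           # leverages, used by the leave-one-out shortcut
    return {
        "ocv": float(np.mean((resid / (1.0 - lev)) ** 2)),
        "gcv": n * rss / (n - edf) ** 2,
        "aic": float(n * np.log(rss / n) + 2.0 * edf),
        "edf": edf,
    }


def select_lambda(X, P, y, grid, criterion):
    """Return the grid value that minimizes the requested criterion."""
    scores = [tuning_criteria(X, P, y, lam)[criterion] for lam in grid]
    return float(grid[int(np.argmin(scores))])


# Simulated data (an assumption for illustration): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.3, size=n)

knots = np.linspace(0.05, 0.95, 19)
X = truncated_line_basis(x, knots)
# Penalize only the knot coefficients; the intercept and slope are unpenalized.
P = np.diag([0.0, 0.0] + [1.0] * len(knots))

grid = 10.0 ** np.linspace(-6.0, 2.0, 41)
chosen = {c: select_lambda(X, P, y, grid, c) for c in ("ocv", "gcv", "aic")}
print(chosen)
```

Running the sketch on different simulated data sets (for example, with correlated or heteroscedastic noise) is one way to see how the selected values can diverge across criteria, which is the kind of comparison the abstract describes.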

Suggested Citation

  • Lauren N. Berry & Nathaniel E. Helwig, 2021. "Cross-Validation, Information Theory, or Maximum Likelihood? A Comparison of Tuning Methods for Penalized Splines," Stats, MDPI, vol. 4(3), pages 1-24, September.
  • Handle: RePEc:gam:jstats:v:4:y:2021:i:3:p:42-724:d:627431

    Download full text from publisher

    File URL: https://www.mdpi.com/2571-905X/4/3/42/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2571-905X/4/3/42/
    Download Restriction: no

    References listed on IDEAS

    1. Yuedong Wang, 1998. "Mixed effects smoothing spline analysis of variance," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 159-174.
    2. Young‐Ju Kim & Chong Gu, 2004. "Smoothing spline Gaussian regression: more scalable computation via efficient approximation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(2), pages 337-356, May.
    3. Lee, Thomas C. M., 2003. "Smoothing parameter selection for smoothing splines: a simulation study," Computational Statistics & Data Analysis, Elsevier, vol. 42(1-2), pages 139-148, February.
    4. Krivobokova, Tatyana & Kauermann, Goran, 2007. "A Note on Penalized Spline Smoothing With Correlated Errors," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 1328-1337, December.
    5. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521785167.
    6. Zack W. Almquist & Nathaniel E. Helwig & Yun You, 2020. "Connecting Continuum of Care point-in-time homeless counts to United States Census areal units," Mathematical Population Studies, Taylor & Francis Journals, vol. 27(1), pages 46-58, January.
    7. Ruppert,David & Wand,M. P. & Carroll,R. J., 2003. "Semiparametric Regression," Cambridge Books, Cambridge University Press, number 9780521780506.
    8. Philip T. Reiss & R. Todd Ogden, 2009. "Smoothing parameter selection for a class of semiparametric linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 505-523, April.
    9. Inyoung Kim & Noah D. Cohen & Raymond J. Carroll, 2003. "Semiparametric Regression Splines in Matched Case-Control Studies," Biometrics, The International Biometric Society, vol. 59(4), pages 1158-1169, December.

    Citations

    Cited by:

    1. Nathaniel E. Helwig, 2022. "Robust Permutation Tests for Penalized Splines," Stats, MDPI, vol. 5(3), pages 1-18, September.
    2. Hain, Martin & Kargus, Tobias & Schermeyer, Hans & Uhrig-Homburg, Marliese & Fichtner, Wolf, 2022. "An electricity price modeling framework for renewable-dominant markets," Working Paper Series in Production and Energy 66, Karlsruhe Institute of Technology (KIT), Institute for Industrial Production (IIP).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Reiss Philip T. & Huang Lei, 2012. "Smoothness Selection for Penalized Quantile Regression Splines," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-27, May.
    2. Nathaniel E. Helwig, 2022. "Robust Permutation Tests for Penalized Splines," Stats, MDPI, vol. 5(3), pages 1-18, September.
    3. Dursun AYDIN & Ersin YILMAZ, 2017. "Bandwidth Selection Problem for Nonparametric Regression Model with Right-Censored Data," Romanian Statistical Review, Romanian Statistical Review, vol. 65(2), pages 81-104, June.
    4. Kauermann Goeran & Krivobokova Tatyana & Semmler Willi, 2011. "Filtering Time Series with Penalized Splines," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 15(2), pages 1-28, March.
    5. Michael Wegener & Göran Kauermann, 2017. "Forecasting in nonlinear univariate time series using penalized splines," Statistical Papers, Springer, vol. 58(3), pages 557-576, September.
    6. Morteza Amini & Mahdi Roozbeh & Nur Anisah Mohamed, 2024. "Separation of the Linear and Nonlinear Covariates in the Sparse Semi-Parametric Regression Model in the Presence of Outliers," Mathematics, MDPI, vol. 12(2), pages 1-17, January.
    7. Feng, Yuanhua & Härdle, Wolfgang Karl, 2020. "A data-driven P-spline smoother and the P-Spline-GARCH models," IRTG 1792 Discussion Papers 2020-016, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    8. Andrada Ivanescu & Ana-Maria Staicu & Fabian Scheipl & Sonja Greven, 2015. "Penalized function-on-function regression," Computational Statistics, Springer, vol. 30(2), pages 539-568, June.
    9. Blöchl, Andreas, 2014. "Penalized Splines as Frequency Selective Filters - Reducing the Excess Variability at the Margins," Discussion Papers in Economics 20687, University of Munich, Department of Economics.
    10. Øystein Sørensen & Anders M. Fjell & Kristine B. Walhovd, 2023. "Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 456-486, June.
    11. Marta Karas & Damian Brzyski & Mario Dzemidzic & Joaquín Goñi & David A. Kareken & Timothy W. Randolph & Jaroslaw Harezlak, 2019. "Brain Connectivity-Informed Regularization Methods for Regression," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(1), pages 47-90, April.
    12. Zanin, Luca, 2023. "A flexible estimation of sectoral portfolio exposure to climate transition risks in the European stock market," Journal of Behavioral and Experimental Finance, Elsevier, vol. 39(C).
    13. Roel Verbelen & Katrien Antonio & Gerda Claeskens, 2018. "Unravelling the predictive power of telematics data in car insurance pricing," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 67(5), pages 1275-1304, November.
    14. Bloechl, Andreas, 2014. "Penalized Splines, Mixed Models and the Wiener-Kolmogorov Filter," Discussion Papers in Economics 21406, University of Munich, Department of Economics.
    15. Marra, Giampiero & Radice, Rosalba, 2013. "Estimation of a regression spline sample selection model," Computational Statistics & Data Analysis, Elsevier, vol. 61(C), pages 158-173.
    16. Takuma Yoshida, 2016. "Asymptotics and smoothing parameter selection for penalized spline regression with various loss functions," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 70(4), pages 278-303, November.
    17. Simon N. Wood, 2006. "Low-Rank Scale-Invariant Tensor Product Smooths for Generalized Additive Mixed Models," Biometrics, The International Biometric Society, vol. 62(4), pages 1025-1036, December.
    18. Philip T. Reiss & Lei Huang & Pei‐Shien Wu & Huaihou Chen & Stan Colcombe, 2017. "Pointwise influence matrices for functional‐response regression," Biometrics, The International Biometric Society, vol. 73(4), pages 1092-1101, December.
    19. Park, Jun Young & Polzehl, Joerg & Chatterjee, Snigdhansu & Brechmann, André & Fiecas, Mark, 2020. "Semiparametric modeling of time-varying activation and connectivity in task-based fMRI data," Computational Statistics & Data Analysis, Elsevier, vol. 150(C).
    20. Anusha, "undated". "Evaluating reliability of some symmetric and asymmetric univariate filters," Indira Gandhi Institute of Development Research, Mumbai Working Papers 2015-030, Indira Gandhi Institute of Development Research, Mumbai, India.
