
Data driven orthogonal basis selection for functional data analysis

Authors

Listed:
  • Basna, Rani
  • Nassar, Hiba
  • Podgórski, Krzysztof

Abstract

Functional data analysis is typically performed in two steps: first, discrete observations are represented as functions, and then functional methods, such as functional principal component analysis, are applied to the represented data. Although the initial choice of a functional representation may have a significant impact on the second phase of the analysis, this issue has received little attention. Typically, a rather ad hoc choice of a standard basis, such as Fourier, wavelets, or splines, is used to transform the data. To address this problem, we present its mathematical formulation, demonstrate its importance, and propose a data-driven method of functionally representing observations. The method chooses an initial functional basis through an efficient placement of the knots. A simple machine-learning-style algorithm is used for the knot selection, and recently introduced orthogonal spline bases, called splinets, are then used to represent the data. The benefits are illustrated by examples of analyses of sparse functional data.
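
The two ingredients named in the abstract, data-driven knot placement followed by an orthogonal spline basis, can be illustrated with a minimal sketch. This is not the authors' algorithm and not the splinet construction: it assumes a greedy, regression-tree-style split criterion for the knots, uses piecewise-linear hat functions in place of higher-order splines, and orthonormalizes with a plain QR factorization. The function names greedy_knots, hat_basis, and orthonormalize are hypothetical and chosen only for this illustration.

```python
import numpy as np

def greedy_knots(t, y, n_knots, min_len=2):
    """Greedily add knots; each new knot is the split that most reduces the
    squared error of a piecewise-constant fit to y (assumed criterion)."""
    bounds = [0, len(t)]  # index boundaries of the current segments

    def sse(a, b):
        seg = y[a:b]
        return float(np.sum((seg - seg.mean()) ** 2))

    for _ in range(n_knots):
        best_gain, best_idx = -1.0, None
        for a, b in zip(bounds[:-1], bounds[1:]):
            base = sse(a, b)
            # Keep at least min_len points on each side of a split.
            for k in range(a + min_len, b - min_len + 1):
                gain = base - sse(a, k) - sse(k, b)
                if gain > best_gain:
                    best_gain, best_idx = gain, k
        if best_idx is None:
            break  # no admissible split is left
        bounds = sorted(bounds + [best_idx])
    return t[bounds[1:-1]]  # interior knots, in increasing order

def hat_basis(grid, knots):
    """Evaluate piecewise-linear hat functions with the given knots on grid."""
    nodes = np.concatenate(([grid[0]], knots, [grid[-1]]))
    # Column j interpolates the j-th unit vector over the nodes.
    return np.column_stack(
        [np.interp(grid, nodes, e) for e in np.eye(len(nodes))]
    )

def orthonormalize(B, grid):
    """Orthonormalize the columns of B in a discretized L2 inner product."""
    w = np.sqrt(np.gradient(grid))  # square roots of quadrature weights
    Q, _ = np.linalg.qr(B * w[:, None])
    return Q / w[:, None]

# Synthetic example: a noisy step function with a jump near t = 0.3.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)
signal = np.where(grid < 0.3, 0.0, 1.0) + 0.05 * rng.standard_normal(grid.size)

knots = greedy_knots(grid, signal, n_knots=5)
basis = orthonormalize(hat_basis(grid, knots), grid)

# Gram matrix of the basis under the same discretized inner product.
gram = basis.T @ (basis * np.gradient(grid)[:, None])
```

In this sketch the knots concentrate where the signal changes most, and the Gram matrix of the resulting basis is the identity up to rounding, so coefficients of a curve can be obtained by weighted inner products rather than by solving a linear system.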

Suggested Citation

  • Basna, Rani & Nassar, Hiba & Podgórski, Krzysztof, 2022. "Data driven orthogonal basis selection for functional data analysis," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
  • Handle: RePEc:eee:jmvana:v:189:y:2022:i:c:s0047259x21001469
    DOI: 10.1016/j.jmva.2021.104868

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21001469
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Molinari, Nicolas & Durand, Jean-Francois & Sabatier, Robert, 2004. "Bounded optimal knots for regression splines," Computational Statistics & Data Analysis, Elsevier, vol. 45(2), pages 159-178, March.
    2. Zhou, S. & Shen, X., 2001. "Spatially Adaptive Regression Splines and Accurate Knot Selection Schemes," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 247-259, March.
    3. D. G. T. Denison & B. K. Mallick & A. F. M. Smith, 1998. "Automatic Bayesian curve fitting," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 333-350.
    4. Yao, Fang & Müller, Hans-Georg & Wang, Jane-Ling, 2005. "Functional Data Analysis for Sparse Longitudinal Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 577-590, June.
    5. Jianhua Guo & Jianchang Hu & Bing-Yi Jing & Zhen Zhang, 2016. "Spline-Lasso in High-Dimensional Linear Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 288-297, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Botts, Carsten H. & Daniels, Michael J., 2008. "A flexible approach to Bayesian multiple curve fitting," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5100-5120, August.
    2. Janet Niekerk & Haakon Bakka & Håvard Rue, 2023. "Stable Non-Linear Generalized Bayesian Joint Models for Survival-Longitudinal Data," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 102-128, February.
    3. Johnson, Matthew S., 2007. "Modeling dichotomous item responses with free-knot splines," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4178-4192, May.
    4. Binder, Harald & Sauerbrei, Willi, 2008. "Increasing the usefulness of additive spline models by knot removal," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5305-5318, August.
    5. Anestis Antoniadis & Irène Gijbels & Mila Nikolova, 2011. "Penalized likelihood regression for generalized linear models with non-quadratic penalties," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 63(3), pages 585-615, June.
    6. Amato, Umberto & Antoniadis, Anestis & De Feis, Italia & Goude, Yannig & Lagache, Audrey, 2021. "Forecasting high resolution electricity demand data with additive models including smooth and jagged components," International Journal of Forecasting, Elsevier, vol. 37(1), pages 171-185.
    7. Wang, Jingxing & Chung, Seokhyun & AlShelahi, Abdullah & Kontar, Raed & Byon, Eunshin & Saigal, Romesh, 2021. "Look-ahead decision making for renewable energy: A dynamic “predict and store” approach," Applied Energy, Elsevier, vol. 296(C).
    8. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    9. Guangxing Wang & Sisheng Liu & Fang Han & Chong‐Zhi Di, 2023. "Robust functional principal component analysis via a functional pairwise spatial sign operator," Biometrics, The International Biometric Society, vol. 79(2), pages 1239-1253, June.
    10. Li, Pai-Ling & Chiou, Jeng-Min & Shyr, Yu, 2017. "Functional data classification using covariate-adjusted subspace projection," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 21-34.
    11. Poskitt, D.S. & Sengarapillai, Arivalzahan, 2013. "Description length and dimensionality reduction in functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 98-113.
    12. González-Rodríguez, Gil & Colubi, Ana, 2017. "On the consistency of bootstrap methods in separable Hilbert spaces," Econometrics and Statistics, Elsevier, vol. 1(C), pages 118-127.
    13. Kovács, Péter & Fekete, Andrea M., 2019. "Nonlinear least-squares spline fitting with variable knots," Applied Mathematics and Computation, Elsevier, vol. 354(C), pages 490-501.
    14. Shuxi Zeng & Elizabeth C. Lange & Elizabeth A. Archie & Fernando A. Campos & Susan C. Alberts & Fan Li, 2023. "A Causal Mediation Model for Longitudinal Mediators and Survival Outcomes with an Application to Animal Behavior," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 28(2), pages 197-218, June.
    15. J. Q. Shi & B. Wang & R. Murray-Smith & D. M. Titterington, 2007. "Gaussian Process Functional Regression Modeling for Batch Data," Biometrics, The International Biometric Society, vol. 63(3), pages 714-723, September.
    16. Gianluca Frasso & Jonathan Jaeger & Philippe Lambert, 2016. "Parameter estimation and inference in dynamic systems described by linear partial differential equations," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 100(3), pages 259-287, July.
    17. Weishampel, Anthony & Staicu, Ana-Maria & Rand, William, 2023. "Classification of social media users with generalized functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    18. Nagy, Stanislav & Ferraty, Frédéric, 2019. "Data depth for measurable noisy random functions," Journal of Multivariate Analysis, Elsevier, vol. 170(C), pages 95-114.
    19. Chenlin Zhang & Huazhen Lin & Li Liu & Jin Liu & Yi Li, 2023. "Functional data analysis with covariate‐dependent mean and covariance structures," Biometrics, The International Biometric Society, vol. 79(3), pages 2232-2245, September.
    20. Tomáš Rubín & Victor M. Panaretos, 2020. "Functional lagged regression with sparse noisy observations," Journal of Time Series Analysis, Wiley Blackwell, vol. 41(6), pages 858-882, November.
