Description length and dimensionality reduction in functional data analysis


  • Poskitt, D.S.
  • Sengarapillai, Arivalzahan


The use of description length principles to select an appropriate number of basis functions for functional data is investigated. A flexible definition of the dimension of a random function, constructed directly from the Karhunen–Loève expansion of the observed process or data generating mechanism, is provided. The results show that although the classical principal component variance decomposition technique behaves in a coherent manner, the dimension it selects will not, in general, be consistent in the conventional sense. Two description length criteria are described. Both criteria are proved to be consistent, and it is shown that in low noise settings they will identify the true finite dimension of a signal that is embedded in noise. Two examples, one from mass spectroscopy and the other from climatology, are used to illustrate the basic ideas. The application of different forms of the bootstrap for functional data is also explored and used to demonstrate the workings of the theoretical results.
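As a rough illustration of the classical variance decomposition technique mentioned in the abstract, the following Python sketch picks the smallest number of principal components whose cumulative share of variance reaches a chosen threshold. It is not the authors' description length criteria, which are not specified on this page. The function name `variance_decomposition_dimension`, the 0.95 default threshold, and the assumption that curves are observed on a common grid are all illustrative choices rather than details from the paper.

```python
import numpy as np


def variance_decomposition_dimension(curves, threshold=0.95):
    """Pick a dimension by cumulative proportion of variance explained.

    curves    : array of shape (n_curves, n_grid_points); each row is one
                function sampled on a common grid (an assumption here).
    threshold : target share of total variance, in (0, 1].

    Returns (d, eigvals): the chosen dimension and the eigenvalues of the
    sample covariance matrix in decreasing order.
    """
    X = np.asarray(curves, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2:
        raise ValueError("curves must be a 2-D array with at least two rows")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")

    # Centre the curves pointwise across the sample.
    Xc = X - X.mean(axis=0)

    # Squared singular values of the centred data, divided by (n - 1),
    # are the eigenvalues of the sample covariance matrix.
    s = np.linalg.svd(Xc, compute_uv=False)
    eigvals = s**2 / (X.shape[0] - 1)

    total = eigvals.sum()
    if total == 0.0:
        raise ValueError("all curves are identical; no variance to decompose")

    # First index at which the cumulative share reaches the threshold.
    cum_share = np.cumsum(eigvals) / total
    d = int(np.searchsorted(cum_share, threshold)) + 1

    # Guard against rounding when threshold is 1.0.
    d = min(d, eigvals.size)
    return d, eigvals
```

The dimension returned by a rule of this kind depends on the threshold chosen by the user, which is one informal way to see why such a rule need not recover a true finite dimension.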

Suggested Citation

  • Poskitt, D.S. & Sengarapillai, Arivalzahan, 2013. "Description length and dimensionality reduction in functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 98-113.
  • Handle: RePEc:eee:csdana:v:58:y:2013:i:c:p:98-113 DOI: 10.1016/j.csda.2011.03.018

    Download Restriction: Full text for ScienceDirect subscribers only.


    Cited by:

    1. Jacques, Julien & Preda, Cristian, 2014. "Model-based clustering for multivariate functional data," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 92-106.
    2. Md Atikur Rahman Khan & D.S. Poskitt, 2010. "Description Length Based Signal Detection in Singular Spectrum Analysis," Monash Econometrics and Business Statistics Working Papers 13/10, Monash University, Department of Econometrics and Business Statistics.
    3. Han Shang, 2014. "A survey of functional principal component analysis," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 98(2), pages 121-142, April.

    More about this item


    Keywords: Bootstrap; Consistency; Dimension determination; Karhunen–Loève expansion; Signal-to-noise ratio; Variance decomposition

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C22 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Time-Series Models; Dynamic Quantile Regressions; Dynamic Treatment Effect Models; Diffusion Processes

