
Cluster non‐Gaussian functional data

Authors
  • Qingzhi Zhong
  • Huazhen Lin
  • Yi Li

Abstract

Gaussian distributions are commonly assumed when clustering functional data. When the normality condition fails, the results are biased. A further challenge is that the number of clusters is often unknown a priori. This paper focuses on clustering non‐Gaussian functional data without prior information on the number of clusters. We introduce a semiparametric mixed normal transformation model to accommodate non‐Gaussian functional data, and propose a penalized approach to simultaneously estimate the parameters, the transformation function, and the number of clusters. The estimators are shown to be consistent and asymptotically normal. The practical utility of the methods is confirmed via simulations as well as an application to an Alzheimer's disease study. The proposed method yields much lower classification error than existing methods. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative database.

Suggested Citation

  • Qingzhi Zhong & Huazhen Lin & Yi Li, 2021. "Cluster non‐Gaussian functional data," Biometrics, The International Biometric Society, vol. 77(3), pages 852-865, September.
  • Handle: RePEc:bla:biomet:v:77:y:2021:i:3:p:852-865
    DOI: 10.1111/biom.13349

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13349
    Download Restriction: no

    References listed on IDEAS

    1. C. Abraham & P. A. Cornillon & E. Matzner‐Løber & N. Molinari, 2003. "Unsupervised Curve Clustering using B‐Splines," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 30(3), pages 581-595, September.
    2. Jeng‐Min Chiou & Pai‐Ling Li, 2007. "Functional clustering and identifying substructures of longitudinal data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 69(4), pages 679-699, September.
    3. Gørgens, Tue & Horowitz, Joel L., 1999. "Semiparametric estimation of a censored regression model with an unknown transformation of the dependent variable," Journal of Econometrics, Elsevier, vol. 90(2), pages 155-191, June.
    4. Kani Chen & Xingwei Tong, 2010. "Varying coefficient transformation models with censored data," Biometrika, Biometrika Trust, vol. 97(4), pages 969-976.
    5. Aurore Delaigle & Peter Hall & Tung Pham, 2019. "Clustering functional data into groups by using projections," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 271-304, April.
    6. Liu, Xueli & Yang, Mark C.K., 2009. "Simultaneous curve registration and clustering for functional data," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1361-1376, February.
    7. Floriello, Davide & Vitelli, Valeria, 2017. "Sparse clustering of functional data," Journal of Multivariate Analysis, Elsevier, vol. 154(C), pages 1-18.
    8. Xiao‐Hua Zhou & Huazhen Lin & Eric Johnson, 2008. "Non‐parametric heteroscedastic transformation regression models for skewed data with an application to health care costs," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 1029-1047, November.
    9. Chen, Xuerong & Hu, Tao & Sun, Jianguo, 2017. "Sieve maximum likelihood estimation for the proportional hazards model under informative censoring," Computational Statistics & Data Analysis, Elsevier, vol. 112(C), pages 224-234.
    10. Peter Hall & Mohammad Hosseini‐Nasab, 2006. "On properties of functional principal components analysis," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 109-126, February.
    11. Nicoleta Serban & Huijing Jiang, 2012. "Multilevel Functional Clustering Analysis," Biometrics, The International Biometric Society, vol. 68(3), pages 805-814, September.
    12. Yao, Fang & Müller, Hans-Georg & Wang, Jane-Ling, 2005. "Functional Data Analysis for Sparse Longitudinal Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 577-590, June.
    13. Shuichi Tokushige & Hiroshi Yadohisa & Koichi Inada, 2007. "Crisp and fuzzy k-means clustering algorithms for multivariate functional data," Computational Statistics, Springer, vol. 22(1), pages 1-16, April.
    14. Peter Hall & Hans‐Georg Müller & Fang Yao, 2008. "Modelling sparse generalized longitudinal observations with latent Gaussian processes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(4), pages 703-723, September.
    15. Ling Ma & Tao Hu & Jianguo Sun, 2015. "Sieve maximum likelihood regression analysis of dependent current status data," Biometrika, Biometrika Trust, vol. 102(3), pages 731-738.
    16. Julien Jacques & Cristian Preda, 2014. "Functional data clustering: a survey," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(3), pages 231-255, September.
    17. Hansheng Wang & Runze Li & Chih-Ling Tsai, 2007. "Tuning parameter selectors for the smoothly clipped absolute deviation method," Biometrika, Biometrika Trust, vol. 94(3), pages 553-568.
    18. Horowitz, Joel L, 1996. "Semiparametric Estimation of a Regression Model with an Unknown Transformation of the Dependent Variable," Econometrica, Econometric Society, vol. 64(1), pages 103-137, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jiang, Jiakun & Lin, Huazhen & Zhong, Qingzhi & Li, Yi, 2022. "Analysis of multivariate non-gaussian functional data: A semiparametric latent process approach," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    2. Kim, Joonpyo & Oh, Hee-Seok, 2020. "Pseudo-quantile functional data clustering," Journal of Multivariate Analysis, Elsevier, vol. 178(C).
    3. Julien Jacques & Cristian Preda, 2014. "Functional data clustering: a survey," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(3), pages 231-255, September.
    4. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    5. Golovkine, Steven & Klutchnikoff, Nicolas & Patilea, Valentin, 2022. "Clustering multivariate functional data using unsupervised binary trees," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    6. Adriano Zanin Zambom & Julian A. A. Collazos & Ronaldo Dias, 2019. "Functional data clustering via hypothesis testing k-means," Computational Statistics, Springer, vol. 34(2), pages 527-549, June.
    7. Li, Ting & Song, Xinyuan & Zhang, Yingying & Zhu, Hongtu & Zhu, Zhongyi, 2021. "Clusterwise functional linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    8. Li, Yehua & Qiu, Yumou & Xu, Yuhang, 2022. "From multivariate to functional data analysis: Fundamentals, recent developments, and emerging areas," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    9. Jacques, Julien & Preda, Cristian, 2014. "Model-based clustering for multivariate functional data," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 92-106.
    10. Yifan Zhu & Chongzhi Di & Ying Qing Chen, 2019. "Clustering Functional Data with Application to Electronic Medication Adherence Monitoring in HIV Prevention Trials," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(2), pages 238-261, July.
    11. Yao Luo & Isabelle Perrigne & Quang Vuong, 2018. "Structural Analysis of Nonlinear Pricing," Journal of Political Economy, University of Chicago Press, vol. 126(6), pages 2523-2568.
    12. Poskitt, D.S. & Sengarapillai, Arivalzahan, 2013. "Description length and dimensionality reduction in functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 98-113.
    13. Michael Vogt & Oliver Linton, 2015. "Classification of nonparametric regression functions in heterogeneous panels," CeMMAP working papers CWP06/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    14. Chenlin Zhang & Huazhen Lin & Li Liu & Jin Liu & Yi Li, 2023. "Functional data analysis with covariate‐dependent mean and covariance structures," Biometrics, The International Biometric Society, vol. 79(3), pages 2232-2245, September.
    15. Michael Vogt & Oliver Linton, 2017. "Classification of non-parametric regression functions in longitudinal data models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 5-27, January.
    16. Michael Vogt & Oliver Linton, 2015. "Classification of nonparametric regression functions in heterogeneous panels," CeMMAP working papers 06/15, Institute for Fiscal Studies.
    17. Fang, Kuangnan & Chen, Yuanxing & Ma, Shuangge & Zhang, Qingzhao, 2022. "Biclustering analysis of functionals via penalized fusion," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    18. Kehui Chen & Xiaoke Zhang & Alexander Petersen & Hans-Georg Müller, 2017. "Quantifying Infinite-Dimensional Data: Functional Data Analysis in Action," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(2), pages 582-604, December.
    19. Neumeyer, Natalie & Noh, Hohsuk & Van Keilegom, Ingrid, 2014. "Heteroscedastic semiparametric transformation models: estimation and testing for validity," LIDAM Discussion Papers ISBA 2014047, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    20. Aneiros, Germán & Horová, Ivana & Hušková, Marie & Vieu, Philippe, 2022. "On functional data analysis and related topics," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
