
Block tensor train decomposition for missing data estimation

Author

Listed:
  • Namgil Lee

    (Kangwon National University)

  • Jong-Min Kim

    (University of Minnesota-Morris)

Abstract

We propose a method for imputation of missing values in large-scale matrix data based on a low-rank tensor approximation technique called the block tensor train (BTT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD is computed in a low-rank BTT format, which can dramatically reduce storage and time complexity for large-scale data matrices that admit a low-rank tensor structure. An iterative soft-thresholding algorithm is implemented for missing data estimation, based on an alternating least squares method for the BTT decomposition. Experimental results on simulated data and real benchmark data demonstrate that the proposed method can accurately estimate a large number of missing values, compared with a standard matrix-based method. The R source code of the BTT-based imputation method is available at https://github.com/namgillee/BTTSoftImpute.
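The storage claim in the abstract can be illustrated with a rough entry count. The sketch below is not taken from the paper or from the BTTSoftImpute code. It assumes the usual tensor train storage formula, in which a vector of length I_1 × ... × I_N is held in N cores of size R_(n-1) × I_n × R_n with boundary ranks equal to 1, and it assumes that a block tensor train holding K vectors gives one core an extra mode of size K. The function names, mode sizes, ranks, and block position are all hypothetical choices made for illustration.

```python
from math import prod


def full_storage(mode_sizes, num_vectors):
    """Entries needed to store num_vectors dense vectors of length prod(mode_sizes)."""
    return num_vectors * prod(mode_sizes)


def block_tt_storage(mode_sizes, ranks, num_vectors, block_pos=0):
    """Entries in a block tensor train holding num_vectors vectors.

    ranks lists the internal TT-ranks R_1..R_(N-1); the boundary ranks are 1.
    The core at block_pos is assumed to carry an extra mode of size num_vectors.
    """
    if len(ranks) != len(mode_sizes) - 1:
        raise ValueError("need one internal rank per adjacent pair of cores")
    bounds = [1, *ranks, 1]
    total = 0
    for n, size in enumerate(mode_sizes):
        core = bounds[n] * size * bounds[n + 1]  # R_(n-1) * I_n * R_n
        if n == block_pos:
            core *= num_vectors  # the block core holds all K vectors
        total += core
    return total


# Illustrative values only: 10 vectors of length 2**20, all internal ranks 5.
modes = [2] * 20
ranks = [5] * 19
dense = full_storage(modes, 10)
compact = block_tt_storage(modes, ranks, 10)
print(dense, compact)
```

Under these assumed sizes the dense representation needs 10,485,760 entries, while the block tensor train needs 1,010. The saving depends entirely on the ranks staying small, which is what "admitting a low-rank tensor structure" refers to.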

Suggested Citation

  • Namgil Lee & Jong-Min Kim, 2018. "Block tensor train decomposition for missing data estimation," Statistical Papers, Springer, vol. 59(4), pages 1283-1305, December.
  • Handle: RePEc:spr:stpapr:v:59:y:2018:i:4:d:10.1007_s00362-018-1043-8
    DOI: 10.1007/s00362-018-1043-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-018-1043-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. J. Carroll & Jih-Jie Chang, 1970. "Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart-Young” decomposition," Psychometrika, Springer;The Psychometric Society, vol. 35(3), pages 283-319, September.
    2. GILLIS, Nicolas & GLINEUR, François, 2010. "Low-rank matrix approximation with weights or missing data is NP-hard," LIDAM Discussion Papers CORE 2010075, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    3. Henk Kiers, 1997. "Weighted least squares fitting using ordinary least squares algorithms," Psychometrika, Springer;The Psychometric Society, vol. 62(2), pages 251-266, June.
    4. Ledyard Tucker, 1966. "Some mathematical notes on three-mode factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 31(3), pages 279-311, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mariela González-Narváez & María José Fernández-Gómez & Susana Mendes & José-Luis Molina & Omar Ruiz-Barzola & Purificación Galindo-Villardón, 2021. "Study of Temporal Variations in Species–Environment Association through an Innovative Multivariate Method: MixSTATICO," Sustainability, MDPI, vol. 13(11), pages 1-25, May.
    2. Henk Kiers, 1991. "Hierarchical relations among three-way methods," Psychometrika, Springer;The Psychometric Society, vol. 56(3), pages 449-470, September.
    3. Willem Kloot & Pieter Kroonenberg, 1985. "External analysis with three-mode principal component models," Psychometrika, Springer;The Psychometric Society, vol. 50(4), pages 479-494, December.
    4. Elisa Frutos-Bernal & Ángel Martín del Rey & Irene Mariñas-Collado & María Teresa Santos-Martín, 2022. "An Analysis of Travel Patterns in Barcelona Metro Using Tucker3 Decomposition," Mathematics, MDPI, vol. 10(7), pages 1-17, March.
    5. Yoshio Takane & Forrest Young & Jan Leeuw, 1977. "Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features," Psychometrika, Springer;The Psychometric Society, vol. 42(1), pages 7-67, March.
    6. Serrano Cinca, C. & Mar Molinero, C. & Gallizo Larraz, J.L., 2005. "Country and size effects in financial ratios: A European perspective," Global Finance Journal, Elsevier, vol. 16(1), pages 26-47, August.
    7. Giuseppe Brandi & Ruggero Gramatica & Tiziana Di Matteo, 2019. "Unveil stock correlation via a new tensor-based decomposition method," Papers 1911.06126, arXiv.org, revised Apr 2020.
    8. Wilderjans, Tom & Ceulemans, Eva & Van Mechelen, Iven, 2009. "Simultaneous analysis of coupled data blocks differing in size: A comparison of two weighting schemes," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1086-1098, February.
    9. Andrii Babii & Eric Ghysels & Junsu Pan, 2022. "Tensor Principal Component Analysis," Papers 2212.12981, arXiv.org, revised Aug 2023.
    10. Sagarra, Marti & Mar-Molinero, Cecilio & Agasisti, Tommaso, 2017. "Exploring the efficiency of Mexican universities: Integrating Data Envelopment Analysis and Multidimensional Scaling," Omega, Elsevier, vol. 67(C), pages 123-133.
    11. Xing, Jiping & Wu, Wei & Cheng, Qixiu & Liu, Ronghui, 2022. "Traffic state estimation of urban road networks by multi-source data fusion: Review and new insights," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 595(C).
    12. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    13. Zhang, Shuang & Han, Le, 2023. "Robust tensor recovery with nonconvex and nonsmooth regularization," Applied Mathematics and Computation, Elsevier, vol. 438(C).
    14. Richard Sands & Forrest Young, 1980. "Component models for three-way data: An alternating least squares algorithm with optimal scaling features," Psychometrika, Springer;The Psychometric Society, vol. 45(1), pages 39-67, March.
    15. Michel Velden & Tammo Bijmolt, 2006. "Generalized canonical correlation analysis of matrices with missing rows: a simulation study," Psychometrika, Springer;The Psychometric Society, vol. 71(2), pages 323-331, June.
    16. Paolo Giordani & Roberto Rocci & Giuseppe Bove, 2020. "Factor Uniqueness of the Structural Parafac Model," Psychometrika, Springer;The Psychometric Society, vol. 85(3), pages 555-574, September.
    17. Alwin Stegeman & Tam Lam, 2014. "Three-Mode Factor Analysis by Means of Candecomp/Parafac," Psychometrika, Springer;The Psychometric Society, vol. 79(3), pages 426-443, July.
    18. Chen Ling & Gaohang Yu & Liqun Qi & Yanwei Xu, 2021. "T-product factorization method for internet traffic data completion with spatio-temporal regularization," Computational Optimization and Applications, Springer, vol. 80(3), pages 883-913, December.
    19. Giordani, Paolo, 2010. "Three-way analysis of imprecise data," Journal of Multivariate Analysis, Elsevier, vol. 101(3), pages 568-582, March.
    20. Zhang, Tonglin, 2020. "CP decomposition and weighted clique problem," Statistics & Probability Letters, Elsevier, vol. 161(C).
