
Goodness-of-fit test for latent block models

Author

Listed:
  • Watanabe, Chihiro
  • Suzuki, Taiji

Abstract

Latent block models are used for probabilistic biclustering, which has been shown to be an effective method for analyzing various relational data sets. However, there has been no statistical test for determining the numbers of row and column clusters in a latent block model. Recent studies have constructed test-based methods for stochastic block models, which assume that the observed matrix is square and symmetric and that the cluster assignments are the same for rows and columns. In this study, we developed a new goodness-of-fit test for latent block models that tests whether an observed data matrix fits a given set of row and column cluster numbers or instead consists of more clusters in at least one of the row and column directions. To construct the test, we used a result from random matrix theory for a sample covariance matrix. We experimentally demonstrated the effectiveness of the proposed method by showing the asymptotic behavior of the test statistic and measuring the test accuracy.
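
The abstract does not spell out the test statistic, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes Gaussian entries with a known common standard deviation `sigma`, known cluster labels, and non-empty blocks; all function and variable names are hypothetical. It removes the block means, standardizes the residuals, and compares the largest squared singular value with the centering and scaling constants from Johnstone's (2001) Tracy-Widom limit for sample covariance matrices.

```python
import numpy as np


def block_means(A, row_labels, col_labels, K, H):
    """Empirical mean of each (row cluster, column cluster) block.

    Assumes every block is non-empty (illustrative simplification).
    """
    B = np.zeros((K, H))
    for k in range(K):
        for h in range(H):
            B[k, h] = A[np.ix_(row_labels == k, col_labels == h)].mean()
    return B


def largest_eigenvalue_statistic(A, row_labels, col_labels, K, H, sigma):
    """Centered and scaled largest eigenvalue of Z^T Z for residuals Z.

    The constants follow Johnstone (2001) for an n x p matrix with
    i.i.d. standard normal entries; treating the residuals this way
    is an assumption of this sketch.
    """
    n, p = A.shape
    B = block_means(A, row_labels, col_labels, K, H)
    Z = (A - B[row_labels][:, col_labels]) / sigma
    top = np.linalg.norm(Z, 2) ** 2  # squared largest singular value
    a, b = np.sqrt(n - 1), np.sqrt(p)
    center = (a + b) ** 2
    scale = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return (top - center) / scale


# Synthetic data with a true 2 x 2 block structure (made-up parameters).
rng = np.random.default_rng(0)
n, p, sigma = 200, 150, 1.0
true_rows = np.arange(n) % 2
true_cols = np.arange(p) % 2
true_means = np.array([[0.0, 2.0], [2.0, 0.0]])
A = true_means[true_rows][:, true_cols] + sigma * rng.standard_normal((n, p))

# Fit with the correct cluster numbers, then with too few clusters.
stat_null = largest_eigenvalue_statistic(A, true_rows, true_cols, 2, 2, sigma)
one_row = np.zeros(n, dtype=int)
one_col = np.zeros(p, dtype=int)
stat_alt = largest_eigenvalue_statistic(A, one_row, one_col, 1, 1, sigma)
print(stat_null, stat_alt)
```

Under a correctly specified block structure the statistic should stay of order one, while fitting too few clusters leaves a low-rank signal in the residuals and inflates it. A rejection threshold would come from a Tracy-Widom quantile, which is not computed here.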

Suggested Citation

  • Watanabe, Chihiro & Suzuki, Taiji, 2021. "Goodness-of-fit test for latent block models," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
  • Handle: RePEc:eee:csdana:v:154:y:2021:i:c:s016794732030181x
    DOI: 10.1016/j.csda.2020.107090

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S016794732030181X
    Download Restriction: Full text for ScienceDirect subscribers only.



    References listed on IDEAS

    1. Peter J. Bickel & Purnamrita Sarkar, 2016. "Hypothesis testing for automated community detection in networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 253-273, January.
    2. Wyse, Jason & Friel, Nial & Latouche, Pierre, 2017. "Inferring structure in bipartite networks using the latent blockmodel and exact ICL," Network Science, Cambridge University Press, vol. 5(1), pages 45-69, March.
    3. Kehui Chen & Jing Lei, 2018. "Network Cross-Validation for Determining the Number of Communities in Network Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 241-251, January.
    4. Tianxi Li & Elizaveta Levina & Ji Zhu, 2020. "Network cross-validation by edge sampling," Biometrika, Biometrika Trust, vol. 107(2), pages 257-276.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jianqing Fan & Yingying Fan & Xiao Han & Jinchi Lv, 2022. "SIMPLE: Statistical inference on membership profiles in large networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 630-653, April.
    2. Mingyang Ren & Sanguo Zhang & Junhui Wang, 2023. "Consistent estimation of the number of communities via regularized network embedding," Biometrics, The International Biometric Society, vol. 79(3), pages 2404-2416, September.
    3. Yuan, Quan & Liu, Binghui, 2021. "Community detection via an efficient nonconvex optimization approach based on modularity," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    4. Yong Cai, 2022. "Linear Regression with Centrality Measures," Papers 2210.10024, arXiv.org.
    5. Tidarat Luangrungruang & Urachart Kokaew, 2022. "Adapting Fleming-Type Learning Style Classifications to Deaf Student Behavior," Sustainability, MDPI, vol. 14(8), pages 1-16, April.
    6. Thorben Funke & Till Becker, 2019. "Stochastic block models: A comparison of variants and inference methods," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-40, April.
    7. Etienne Côme & Nicolas Jouvin & Pierre Latouche & Charles Bouveyron, 2021. "Hierarchical clustering with discrete latent variable models and the integrated classification likelihood," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(4), pages 957-986, December.
    8. Lu, Hong & Sang, Xiaoshuang & Zhao, Qinghua & Lu, Jianfeng, 2020. "Community detection algorithm based on nonnegative matrix factorization and pairwise constraints," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
    9. Schlembach, Christoph & Schmidt, Sascha L. & Schreyer, Dominik & Wunderlich, Linus, 2022. "Forecasting the Olympic medal distribution – A socioeconomic machine learning model," Technological Forecasting and Social Change, Elsevier, vol. 175(C).
    10. Bergé, Laurent R. & Bouveyron, Charles & Corneli, Marco & Latouche, Pierre, 2019. "The latent topic block model for the co-clustering of textual interaction data," Computational Statistics & Data Analysis, Elsevier, vol. 137(C), pages 247-270.
    11. Alessandro Casa & Charles Bouveyron & Elena Erosheva & Giovanna Menardi, 2021. "Co-clustering of Time-Dependent Data via the Shape Invariant Model," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 626-649, October.
    12. Li Guo & Wolfgang Karl Hardle & Yubo Tao, 2018. "A Time-Varying Network for Cryptocurrencies," Papers 1802.03708, arXiv.org, revised Nov 2022.
    13. Can M. Le & Tianxi Li, 2022. "Linear regression and its inference on noisy network‐linked data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1851-1885, November.
    14. Anirban Dasgupta & Srijan Sengupta, 2022. "Scalable Estimation of Epidemic Thresholds via Node Sampling," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(1), pages 321-344, June.
    15. Deng, Jiayi & Huang, Danyang & Ding, Yi & Zhu, Yingqiu & Jing, Bingyi & Zhang, Bo, 2024. "Subsampling spectral clustering for stochastic block models in large-scale networks," Computational Statistics & Data Analysis, Elsevier, vol. 189(C).
    16. Jesús Arroyo & Elizaveta Levina, 2022. "Overlapping Community Detection in Networks via Sparse Spectral Decomposition," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(1), pages 1-35, June.
    17. Valerie Robert & Yann Vasseur & Vincent Brault, 2021. "Comparing High-Dimensional Partitions with the Co-clustering Adjusted Rand Index," Journal of Classification, Springer;The Classification Society, vol. 38(1), pages 158-186, April.
    18. Mingao Yuan & Fan Yang & Zuofeng Shang, 2022. "Hypothesis testing in sparse weighted stochastic block model," Statistical Papers, Springer, vol. 63(4), pages 1051-1073, August.
    19. C. Biernacki & J. Jacques & C. Keribin, 2023. "A Survey on Model-Based Co-Clustering: High Dimension and Estimation Challenges," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 332-381, July.
    20. Neil Hwang & Jiarui Xu & Shirshendu Chatterjee & Sharmodeep Bhattacharyya, 2022. "The Bethe Hessian and Information Theoretic Approaches for Online Change-Point Detection in Network Data," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(1), pages 283-320, June.
