Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i15p1767-d601994.html

An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks

Author

Listed:
  • Xin Xu

    (Department of Computer Science, College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

  • Yang Lu

    (Department of Computer Science, College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

  • Yupeng Zhou

    (Department of Computer Science, College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

  • Zhiguo Fu

    (Department of Computer Science, College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

  • Yanjie Fu

    (Department of Computer Science, College of Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA)

  • Minghao Yin

    (Department of Computer Science, College of Information Science and Technology, Northeast Normal University, Changchun 130117, China)

Abstract

Network representation learning aims to learn low-dimensional, compressible, and distributed representation vectors of nodes in networks. Because obtaining label information for nodes is expensive, many unsupervised network representation learning methods have been proposed, among which the random walk strategy is one of the most widely used approaches. However, existing random walk based methods face several challenges: (1) they do not sufficiently explain what network knowledge is captured in the sampled walk paths; (2) mixing different kinds of information in a network has adverse effects; (3) methods with hyper-parameters generalize poorly across different networks. This paper proposes an information-explainable random walk based unsupervised network representation learning framework named Probabilistic Accepted Walk (PAW), which obtains network representations from the perspective of the stationary distribution of networks. In the framework, we design two stationary distributions, based on nodes’ self-information and on the local information of networks, to guide the proposed random walk strategy, which learns representation vectors by sampling paths of nodes. Extensive experiments demonstrate that PAW obtains more expressive representations than six other widely used unsupervised network representation learning baselines on four real-world networks in single-label and multi-label node classification tasks.
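
The abstract does not give PAW's acceptance rule, so the following is only a minimal sketch of the general idea of a random walk that is steered toward a chosen stationary distribution. It swaps in the standard Metropolis-Hastings acceptance step, which is an assumption and not the authors' method. The function name `accepted_walk`, the toy graph `adj`, and the uniform `target` weights are all hypothetical.

```python
import random
from collections import Counter


def accepted_walk(adj, target, start, length, rng):
    """Sample a path with a Metropolis-Hastings walk (not the paper's PAW rule).

    adj:    dict mapping each node to a list of its neighbours (undirected graph)
    target: dict mapping each node to a positive, unnormalised target weight
    """
    path = [start]
    current = start
    for _ in range(length - 1):
        # Propose a uniformly random neighbour: q(i -> j) = 1 / deg(i).
        proposal = rng.choice(adj[current])
        # Acceptance ratio: target(j) * q(j -> i) / (target(i) * q(i -> j)).
        ratio = (target[proposal] * len(adj[current])) / (
            target[current] * len(adj[proposal])
        )
        if rng.random() < min(1.0, ratio):
            current = proposal  # accepted: move to the proposed node
        path.append(current)    # rejected: the walk stays where it is
    return path


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# Assumed target: uniform over nodes, unlike the degree-proportional
# stationary distribution of a plain random walk.
target = {node: 1.0 for node in adj}

rng = random.Random(0)
path = accepted_walk(adj, target, start=0, length=50000, rng=rng)
freq = Counter(path)
for node in sorted(adj):
    print(node, round(freq[node] / len(path), 3))
```

In a representation learning pipeline, paths sampled this way would typically be fed to a skip-gram style model to produce node vectors. Changing `target` changes which nodes the sampled paths emphasize, which is the role the abstract assigns to the two designed stationary distributions.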

Suggested Citation

  • Xin Xu & Yang Lu & Yupeng Zhou & Zhiguo Fu & Yanjie Fu & Minghao Yin, 2021. "An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks," Mathematics, MDPI, vol. 9(15), pages 1-14, July.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:15:p:1767-:d:601994

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/15/1767/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/15/1767/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rajbir-Singh Nirwan & Nils Bertschinger, 2018. "Applications of Gaussian Process Latent Variable Models in Finance," Papers 1806.03294, arXiv.org, revised Apr 2019.
    2. Wang, Zihan & Daeipour, Mohamad & Xu, Hongyi, 2023. "Quantification and propagation of Aleatoric uncertainties in topological structures," Reliability Engineering and System Safety, Elsevier, vol. 233(C).
    3. Francis, David C. & Kubinec, Robert, 2022. "Beyond Political Connections: A Measurement Model Approach to Estimating Firm-level Political Influence in 41 Economies," Policy Research Working Paper Series 10119, The World Bank.
    4. Martinovici, A., 2019. "Revealing attention - how eye movements predict brand choice and moment of choice," Other publications TiSEM 7dca38a5-9f78-4aee-bd81-c, Tilburg University, School of Economics and Management.
    5. Yongping Bao & Ludwig Danwitz & Fabian Dvorak & Sebastian Fehrler & Lars Hornuf & Hsuan Yu Lin & Bettina von Helversen, 2022. "Similarity and Consistency in Algorithm-Guided Exploration," CESifo Working Paper Series 10188, CESifo.
    6. Heinrich, Torsten & Yang, Jangho & Dai, Shuanping, 2020. "Growth, development, and structural change at the firm-level: The example of the PR China," MPRA Paper 105011, University Library of Munich, Germany.
    7. van Kesteren, Erik-Jan & Bergkamp, Tom, 2023. "Bayesian analysis of Formula One race results: disentangling driver skill and constructor advantage," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 19(4), pages 273-293, December.
    8. Benjamin Davies & David C. Maré, 2020. "Delineating functional labour market areas with estimable classification stabilities," Working Papers 20_08, Motu Economic and Public Policy Research.
    9. Matteo Barigozzi & Matteo Luciani, 2019. "Quasi Maximum Likelihood Estimation and Inference of Large Approximate Dynamic Factor Models via the EM algorithm," Papers 1910.03821, arXiv.org, revised Sep 2024.
    10. Dorota Toczydlowska & Gareth W. Peters & Man Chung Fung & Pavel V. Shevchenko, 2017. "Stochastic Period and Cohort Effect State-Space Mortality Models Incorporating Demographic Factors via Probabilistic Robust Principal Components," Risks, MDPI, vol. 5(3), pages 1-77, July.
    11. Xiaoyue Xi & Simon E. F. Spencer & Matthew Hall & M. Kate Grabowski & Joseph Kagaayi & Oliver Ratmann & Rakai Health Sciences Program and PANGEA‐HIV, 2022. "Inferring the sources of HIV infection in Africa from deep‐sequence data with semi‐parametric Bayesian Poisson flow models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 517-540, June.
    12. Kuschnig, Nikolas, 2021. "Bayesian Spatial Econometrics and the Need for Software," Department of Economics Working Paper Series 318, WU Vienna University of Economics and Business.
    13. Deniz Aksoy & David Carlson, 2022. "Electoral support and militants’ targeting strategies," Journal of Peace Research, Peace Research Institute Oslo, vol. 59(2), pages 229-241, March.
    14. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    15. Richard Hunt & Shelton Peiris & Neville Weber, 2022. "Estimation methods for stationary Gegenbauer processes," Statistical Papers, Springer, vol. 63(6), pages 1707-1741, December.
    16. Chen, Tao & Martin, Elaine & Montague, Gary, 2009. "Robust probabilistic PCA with missing data and contribution analysis for outlier detection," Computational Statistics & Data Analysis, Elsevier, vol. 53(10), pages 3706-3716, August.
    17. Artur F. Tomeczek & Tomasz M. Napiórkowski, 2024. "PageRank and Regression as a Two-Step Approach to Analysing a Network of Nasdaq Firms During a Recession: Insights from Minimum Spanning Tree Topology," Gospodarka Narodowa. The Polish Journal of Economics, Warsaw School of Economics, issue 3, pages 56-69.
    18. D. Fouskakis & G. Petrakos & I. Rotous, 2020. "A Bayesian longitudinal model for quantifying students’ preferences regarding teaching quality indicators," METRON, Springer;Sapienza Università di Roma, vol. 78(2), pages 255-270, August.
    19. Chen, Andrew Y. & McCoy, Jack, 2024. "Missing values handling for machine learning portfolios," Journal of Financial Economics, Elsevier, vol. 155(C).
    20. Wang, Shao-Hsuan & Huang, Su-Yun, 2022. "Perturbation theory for cross data matrix-based PCA," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
