IDEAS home Printed from https://ideas.repec.org/p/ant/wpaper/2015001.html
   My bibliography  Save this paper

Classification over bipartite graphs through projection

Author

Listed:
  • STANKOVA, Marija
  • MARTENS, David
  • PROVOST, Foster

Abstract

Many real-world large datasets correspond to bipartite graph data settings; think for example of users rating movies or people visiting locations. Although some work exists over such bigraphs, no general network-oriented methodology has been proposed yet to perform node classification. In this paper we propose a three-stage classification framework that effectively deals with the typical very large size of such datasets. First, a weighting of the top nodes is defined. Secondly, the bigraph is projected into a unipartite (homogenous) graph among the bottom nodes, where the weights of the edges are a function of the weights of the top nodes in the bigraph. Finally, relational learners/classifiers are applied to the resulting weighted unigraph. This general framework allows us to explore the design space, by applying different choices for the three stages, introducing new alternatives and mixing-and-matching to create new techniques. We present an empirical study of the predictive and run-time performances for different combinations of functions in the three stages over a large collection of bipartite datasets. There are clear differences in predictive performance with different design choices. Based on these results, we propose several specific combinations that show good accuracy and also allow for easy and fast scaling to big datasets. A comparison with a linear SVM method on the adjacency matrix of the bigraph shows the superiority of the network-oriented approach.

Suggested Citation

  • STANKOVA, Marija & MARTENS, David & PROVOST, Foster, 2015. "Classification over bipartite graphs through projection," Working Papers 2015001, University of Antwerp, Faculty of Business and Economics.
  • Handle: RePEc:ant:wpaper:2015001
    as

    Download full text from publisher

    File URL: https://repository.uantwerpen.be/docman/irua/07acff/c5909d64.pdf
    Download Restriction: no

    References listed on IDEAS

    as
    1. repec:cup:cbooks:9780511771576 is not listed on IDEAS
    2. Seierstad, Cathrine & Opsahl, Tore, 2011. "For the few not the many? The effects of affirmative action on presence, prominence, and social capital of women directors in Norway," Scandinavian Journal of Management, Elsevier, vol. 27(1), pages 44-54, March.
    3. Martens, David & Baesens, Bart & Van Gestel, Tony & Vanthienen, Jan, 2007. "Comprehensible credit scoring models using rule extraction from support vector machines," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1466-1476, December.
    4. Ramon Ferrer i Cancho & Ricard V. Solé, 2001. "The Small-World of Human Language," Working Papers 01-03-016, Santa Fe Institute.
    5. Guillaume, Jean-Loup & Latapy, Matthieu, 2006. "Bipartite graphs as models of complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 371(2), pages 795-813.
    6. Easley,David & Kleinberg,Jon, 2010. "Networks, Crowds, and Markets," Cambridge Books, Cambridge University Press, number 9780521195331, December.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. DE CNUDDE, Sofie & MOEYERSOMS, Julie & STANKOVA, Marija & TOBBACK, Ellen & JAVALY, Vinayak & MARTENS, David, 2015. "Who cares about your Facebook friends? Credit scoring for microfinance," Working Papers 2015018, University of Antwerp, Faculty of Business and Economics.
    2. TOBBACK, Ellen & MOEYERSOMS, Julie & STANKOVA, Marija & MARTENS, David, 2016. "Bankruptcy prediction for SMEs using relational data," Working Papers 2016004, University of Antwerp, Faculty of Business and Economics.
    3. TOBBACK, Ellen & MARTENS, David, 2017. "Retail credit scoring using fine-grained payment data," Working Papers 2017011, University of Antwerp, Faculty of Business and Economics.

    More about this item

    Keywords

    Bipartite graphs; Two-mode networks; Affiliation networks; Node classification; Big data;

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:ant:wpaper:2015001. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (Joeri Nys). General contact details of provider: http://edirc.repec.org/data/ftufsbe.html .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.