
Recommending research collaborations using link prediction and random forest classifiers

Authors

  • Raf Guns (University of Antwerp)
  • Ronald Rousseau (University of Antwerp; KU Leuven)

Abstract

We introduce a method to predict or recommend high-potential future (i.e., not yet realized) collaborations. The proposed method is based on a combination of link prediction and machine learning techniques. First, a weighted co-authorship network is constructed. We calculate scores for each node pair according to different measures called predictors. The resulting scores can be interpreted as indicative of the likelihood of future linkage for the given node pair. To determine the relative merit of each predictor, we train a random forest classifier on older data. The same classifier can then generate predictions for newer data. The top predictions are treated as recommendations for future collaboration. We apply the technique to research collaborations between cities in Africa, the Middle East and South-Asia, focusing on the topics of malaria and tuberculosis. Results show that the method yields accurate recommendations. Moreover, the method can be used to determine the relative strengths of each predictor.
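The scoring step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy network, the node labels, and the choice of four predictors (common neighbours, Jaccard, Adamic-Adar, and one simple weighted variant of common neighbours). The abstract does not say which predictors the authors used, so this is not their implementation.

```python
from itertools import combinations
from math import log

# Hypothetical weighted co-authorship network for an "older" period:
# (city_u, city_v, number_of_joint_papers). Toy data, not from the paper.
OLD_EDGES = [
    ("A", "B", 3), ("A", "C", 1), ("B", "C", 2),
    ("B", "D", 1), ("C", "D", 4), ("D", "E", 1),
]


def build_graph(edges):
    """Return an undirected weighted graph as {node: {neighbour: weight}}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def shared(graph, u, v):
    """Set of nodes adjacent to both u and v."""
    return graph[u].keys() & graph[v].keys()


def common_neighbours(graph, u, v):
    return len(shared(graph, u, v))


def jaccard(graph, u, v):
    union = graph[u].keys() | graph[v].keys()
    return len(shared(graph, u, v)) / len(union) if union else 0.0


def adamic_adar(graph, u, v):
    # A shared neighbour has degree >= 2, so log(degree) is always positive.
    return sum(1.0 / log(len(graph[z])) for z in shared(graph, u, v))


def weighted_common_neighbours(graph, u, v):
    # One possible weighted variant (an assumption): sum both link weights
    # along each two-step path u - z - v.
    return sum(graph[u][z] + graph[v][z] for z in shared(graph, u, v))


PREDICTORS = (common_neighbours, jaccard, adamic_adar, weighted_common_neighbours)


def score_unlinked_pairs(graph):
    """Map each not-yet-linked node pair to its vector of predictor scores."""
    return {
        (u, v): [predictor(graph, u, v) for predictor in PREDICTORS]
        for u, v in combinations(sorted(graph), 2)
        if v not in graph[u]
    }


graph = build_graph(OLD_EDGES)
features = score_unlinked_pairs(graph)
# Rank candidate pairs by a single predictor (index 3, the weighted variant).
ranking = sorted(features, key=lambda pair: features[pair][3], reverse=True)
for pair in ranking:
    print(pair, [round(score, 3) for score in features[pair]])
```

In the workflow the abstract outlines, score vectors like these, labelled by whether each pair actually became linked in newer data, would form the training set for the random forest classifier. That training step is not shown here.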

Suggested Citation

  • Raf Guns & Ronald Rousseau, 2014. "Recommending research collaborations using link prediction and random forest classifiers," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1461-1473, November.
  • Handle: RePEc:spr:scient:v:101:y:2014:i:2:d:10.1007_s11192-013-1228-9
    DOI: 10.1007/s11192-013-1228-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-013-1228-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Nelius Boshoff, 2010. "South–South research collaboration of countries in the Southern African Development Community (SADC)," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(2), pages 481-503, August.
    2. Torben Schubert & Radhamany Sooryamoorthy, 2010. "Can the centre–periphery model explain patterns of international scientific collaboration among threshold and industrialised countries? The case of South Africa and Germany," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(1), pages 181-203, April.
    3. Leo Egghe & Ronald Rousseau, 2003. "A measure for the cohesion of weighted networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 54(3), pages 193-202, February.
    4. Naoki Shibata & Yuya Kajikawa & Ichiro Sakata, 2012. "Link prediction in citation networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(1), pages 78-85, January.
    5. Frenken, Koen & Hardeman, Sjoerd & Hoekman, Jarno, 2009. "Spatial scientometrics: Towards a cumulative research program," Journal of Informetrics, Elsevier, vol. 3(3), pages 222-232.
    6. Nees Jan van Eck & Ludo Waltman, 2010. "Software survey: VOSviewer, a computer program for bibliometric mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(2), pages 523-538, August.
    7. Leo Katz, 1953. "A new status index derived from sociometric analysis," Psychometrika, Springer;The Psychometric Society, vol. 18(1), pages 39-43, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nazim Choudhury & Shahadat Uddin, 2016. "Time-aware link prediction to explore network effects on temporal knowledge evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(2), pages 745-776, August.
    2. Yan, Erjia & Guns, Raf, 2014. "Predicting and recommending collaborations: An author-, institution-, and country-level analysis," Journal of Informetrics, Elsevier, vol. 8(2), pages 295-309.
    3. Guns, Raf & Wang, Lili, 2017. "Detecting the emergence of new scientific collaboration links in Africa: A comparison of expected and realized collaboration intensities," Journal of Informetrics, Elsevier, vol. 11(3), pages 892-903.
    4. Luis Araya-Castillo & Felipe Hernández-Perlines & Hugo Moraga & Antonio Ariza-Montes, 2021. "Scientometric Analysis of Research on Socioemotional Wealth," Sustainability, MDPI, vol. 13(7), pages 1-26, March.
    5. Ángel Acevedo-Duque & Alejandro Vega-Muñoz & Guido Salazar-Sepúlveda, 2020. "Analysis of Hospitality, Leisure, and Tourism Studies in Chile," Sustainability, MDPI, vol. 12(18), pages 1-20, September.
    6. Yichi Zhang & Zhiliang Dong & Sen Liu & Peixiang Jiang & Cuizhi Zhang & Chao Ding, 2021. "Forecast of International Trade of Lithium Carbonate Products in Importing Countries and Small-Scale Exporting Countries," Sustainability, MDPI, vol. 13(3), pages 1-23, January.
    7. Sasaki, Hajime & Sakata, Ichiro, 2021. "Identifying potential technological spin-offs using hierarchical information in international patent classification," Technovation, Elsevier, vol. 100(C).
    8. Adilson Vital & Diego R. Amancio, 2022. "A comparative analysis of local similarity metrics and machine learning approaches: application to link prediction in author citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(10), pages 6011-6028, October.
    9. Bornmann, Lutz & Waltman, Ludo, 2011. "The detection of “hot regions” in the geography of science—A visualization approach by using density maps," Journal of Informetrics, Elsevier, vol. 5(4), pages 547-553.
    10. Wang, Feifei & Dong, Jiaxin & Lu, Wanzhao & Xu, Shuo, 2023. "Collaboration prediction based on multilayer all-author tripartite citation networks: A case study of gene editing," Journal of Informetrics, Elsevier, vol. 17(1).
    11. Dosso, Mafini & Cassi, Lorenzo & Mescheba, Wilfriedo, 2023. "Towards regional scientific integration in Africa? Evidence from co-publications," Research Policy, Elsevier, vol. 52(1).
    12. Nelson Casimiro Zavale & Patrício Vitorino Langa, 2018. "University-industry linkages’ literature on Sub-Saharan Africa: systematic literature review and bibliometric account," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 1-49, July.
    13. Copiello, Sergio, 2019. "Peer and neighborhood effects: Citation analysis using a spatial autoregressive model and pseudo-spatial data," Journal of Informetrics, Elsevier, vol. 13(1), pages 238-254.
    14. Zhi Li & Qinke Peng & Che Liu, 2016. "Two citation-based indicators to measure latent referential value of papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(3), pages 1299-1313, September.
    15. Sulaimon Oyeniyi Adebayo & Munish Saini, 2023. "A scientometric study for scientific research publication on gender inequality," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(6), pages 5107-5135, December.
    16. Jing Ma & Yaohui Pan & Chih-Yi Su, 2022. "Organization-oriented technology opportunities analysis based on predicting patent networks: a case of Alzheimer’s disease," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5497-5517, September.
    17. Elizabeth S. Vieira, 2022. "International research collaboration in Africa: a bibliometric and thematic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2747-2772, May.
    18. Lutz Bornmann & Loet Leydesdorff, 2011. "Which cities produce more excellent papers than can be expected? A new mapping approach, using Google Maps, based on statistical significance testing," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(10), pages 1954-1962, October.
    19. Giorgia Bondanini & Gabriele Giorgi & Antonio Ariza-Montes & Alejandro Vega-Muñoz & Paola Andreucci-Annunziata, 2020. "Technostress Dark Side of Technology in the Workplace: A Scientometric Analysis," IJERPH, MDPI, vol. 17(21), pages 1-23, October.
    20. Tofighy, Sajjad & Charkari, Nasrollah Moghadam & Ghaderi, Foad, 2022. "Link prediction in multiplex networks using intralayer probabilistic distance and interlayer co-evolving factors," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 606(C).

