
Mobility in Unsupervised Word Embeddings for Knowledge Extraction—The Scholars’ Trajectories across Research Topics

Author

Listed:
  • Gianfranco Lombardo

    (Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy
    These authors contributed equally to this work.)

  • Michele Tomaiuolo

    (Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy
    These authors contributed equally to this work.)

  • Monica Mordonini

    (Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy
    These authors contributed equally to this work.)

  • Gaia Codeluppi

    (Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy
    These authors contributed equally to this work.)

  • Agostino Poggi

    (Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy)

Abstract

In the knowledge discovery field of the Big Data domain, the analysis of geographic positioning and mobility information plays a key role. At the same time, in the Natural Language Processing (NLP) domain, pre-trained models such as BERT and word embedding algorithms such as Word2Vec have enabled a rich encoding of words that maps textual data into points of an arbitrary multi-dimensional space, in which proximity reflects an association among terms or topics. The main contribution of this paper is to show how analytical tools traditionally applied to geographic data to measure the mobility of an agent over a time interval can also be effectively applied to extract knowledge in a semantic realm, such as a semantic space of words and topics, looking for latent trajectories that can benefit from the properties of neural network latent representations. As a case study, the Scopus database was queried for works of highly cited researchers in recent years. On this basis, we performed a dynamic analysis, measuring the Radius of Gyration as an index of the mobility of researchers across scientific topics. The semantic space is built from the automatic analysis of the abstracts of each author’s papers. In particular, we evaluated two different methodologies to build the semantic space and found that Word2Vec embeddings perform better than BERT embeddings for this task. Finally, the scholars’ trajectories show some latent properties of this model, which also represent new scientific contributions of this work. These properties include (i) the correlation between scientific mobility and the achievement of scientific results, measured through the H-index; (ii) differences in the behavior of researchers working in different countries and subjects; and (iii) some interesting similarities between mobility patterns in this semantic realm and those typically observed in the case of human mobility.
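
As a rough illustration of the mobility index named in the abstract, the sketch below computes a radius of gyration over a set of points in an embedding space, using the standard unweighted definition from the human mobility literature: the root mean square distance of the visited points from their center of mass. The function name `radius_of_gyration` and the toy vectors are assumptions made for this example; the paper's exact formulation (for instance any weighting, or how abstracts are turned into vectors) is not given on this page and may differ.

```python
import numpy as np


def radius_of_gyration(points):
    """Root mean square distance of the points from their center of mass.

    `points` is an (N, D) array-like: one D-dimensional vector per visited
    position (hypothetically, one embedding per paper abstract).
    """
    pts = np.asarray(points, dtype=float)
    if pts.ndim != 2 or pts.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D array of shape (N, D)")

    # Center of mass of the trajectory (unweighted mean position).
    center = pts.mean(axis=0)

    # Squared Euclidean distance of every point from the center of mass.
    sq_dist = np.sum((pts - center) ** 2, axis=1)

    return float(np.sqrt(sq_dist.mean()))


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for abstract embeddings
    # (made-up values, not data from the paper).
    focused_author = [[0.10, 0.20, 0.30], [0.12, 0.19, 0.31], [0.11, 0.21, 0.29]]
    mobile_author = [[0.10, 0.20, 0.30], [0.90, 0.10, 0.50], [0.40, 0.80, 0.05]]

    print("focused:", radius_of_gyration(focused_author))
    print("mobile: ", radius_of_gyration(mobile_author))
```

Under this definition, an author whose abstracts cluster tightly in the semantic space gets a small radius, while an author who moves across distant topics gets a larger one.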

Suggested Citation

  • Gianfranco Lombardo & Michele Tomaiuolo & Monica Mordonini & Gaia Codeluppi & Agostino Poggi, 2022. "Mobility in Unsupervised Word Embeddings for Knowledge Extraction—The Scholars’ Trajectories across Research Topics," Future Internet, MDPI, vol. 14(1), pages 1-21, January.
  • Handle: RePEc:gam:jftint:v:14:y:2022:i:1:p:25-:d:720733

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/14/1/25/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/14/1/25/
    Download Restriction: no

    References listed on IDEAS

    1. Marta C. González & César A. Hidalgo & Albert-László Barabási, 2009. "Understanding individual human mobility patterns," Nature, Nature, vol. 458(7235), pages 238-238, March.

    Citations

    Cited by:

    1. Esra Gündoğan & Mehmet Kaya & Ali Daud, 2023. "Deep learning for journal recommendation system of research papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 461-481, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jeong-Hui Park & Eunhye Yoo & Youngdeok Kim & Jung-Min Lee, 2021. "What Happened Pre- and during COVID-19 in South Korea? Comparing Physical Activity, Sleep Time, and Body Weight Status," IJERPH, MDPI, vol. 18(11), pages 1-13, May.
    2. Matteo Böhm & Mirco Nanni & Luca Pappalardo, 2022. "Gross polluters and vehicle emissions reduction," Nature Sustainability, Nature, vol. 5(8), pages 699-707, August.
    3. Su, Rongxiang & Xiao, Jingyi & McBride, Elizabeth C. & Goulias, Konstadinos G., 2021. "Understanding senior's daily mobility patterns in California using human mobility motifs," Journal of Transport Geography, Elsevier, vol. 94(C).
    4. Robert Stewart & Marie Urban & Samantha Duchscherer & Jason Kaufman & April Morton & Gautam Thakur & Jesse Piburn & Jessica Moehl, 2016. "A Bayesian machine learning model for estimating building occupancy from open source data," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 81(3), pages 1929-1956, April.
    5. Arroyo Arroyo,Fatima & Fernandez Gonzalez,Marta & Matekenya,Dunstan & Espinet Alegre,Xavier, 2021. "Using Mobile Data to Understand Urban Mobility Patterns in Freetown, Sierra Leone," Policy Research Working Paper Series 9519, The World Bank.
    6. David Kofoed Wind & Piotr Sapiezynski & Magdalena Anna Furman & Sune Lehmann, 2016. "Inferring Stop-Locations from WiFi," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-15, February.
    7. Zhou, Xingang & Yeh, Anthony G.O. & Yue, Yang, 2018. "Spatial variation of self-containment and jobs-housing balance in Shenzhen using cellphone big data," Journal of Transport Geography, Elsevier, vol. 68(C), pages 102-108.
    8. Maxime Lenormand & Miguel Picornell & Oliva G Cantú-Ros & Antònia Tugores & Thomas Louail & Ricardo Herranz & Marc Barthelemy & Enrique Frías-Martínez & José J Ramasco, 2014. "Cross-Checking Different Sources of Mobility Information," PLOS ONE, Public Library of Science, vol. 9(8), pages 1-10, August.
    9. Miotti, Marco & Needell, Zachary A. & Jain, Rishee K., 2023. "The impact of urban form on daily mobility demand and energy use: Evidence from the United States," Applied Energy, Elsevier, vol. 339(C).
    10. Zheng Yan & Wenqian Robertson & Yaosheng Lou & Tom W. Robertson & Sung Yong Park, 2021. "Finding leading scholars in mobile phone behavior: a mixed-method analysis of an emerging interdisciplinary field," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9499-9517, December.
    11. Huang, Feihu & Qiao, Shaojie & Peng, Jian & Guo, Bing & Xiong, Xi & Han, Nan, 2019. "A movement model for air passengers based on trip purpose," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 525(C), pages 798-808.
    12. Duan, Zhengyu & Zhao, Haoran & Li, Zhenming, 2023. "Non-linear effects of built environment and socio-demographics on activity space," Journal of Transport Geography, Elsevier, vol. 111(C).
    13. Vanky, Anthony & Courtney, Theodore & Verma, Santosh & Ratti, Carlo, 2016. "One to Many: Opportunities to Understanding Collective Behaviors in Urban Environments Through Individual's Passively-Collected Locative Data," SocArXiv f7mpd, Center for Open Science.
    14. Shanshan Wan & Zhuo Chen & Cheng Lyu & Ruofan Li & Yuntao Yue & Ying Liu, 2022. "Research on disaster information dissemination based on social sensor networks," International Journal of Distributed Sensor Networks, vol. 18(3), pages 15501329221, March.
    15. Elisa Frutos-Bernal & Ángel Martín del Rey & Irene Mariñas-Collado & María Teresa Santos-Martín, 2022. "An Analysis of Travel Patterns in Barcelona Metro Using Tucker3 Decomposition," Mathematics, MDPI, vol. 10(7), pages 1-17, March.
    16. Zhai, Wei & Bai, Xueyin & Peng, Zhong-ren & Gu, Chaolin, 2019. "From edit distance to augmented space-time-weighted edit distance: Detecting and clustering patterns of human activities in Puget Sound region," Journal of Transport Geography, Elsevier, vol. 78(C), pages 41-55.
    17. Johannes Stübinger & Lucas Schneider, 2020. "Understanding Smart City—A Data-Driven Literature Review," Sustainability, MDPI, vol. 12(20), pages 1-23, October.
    18. Khajehnejad, Moein, 2019. "Efficiency of long-range navigation on Treelike fractals," Chaos, Solitons & Fractals, Elsevier, vol. 122(C), pages 102-110.
    19. Xingang Zhou & Anthony GO Yeh & Weifeng Li & Yang Yue, 2018. "A commuting spectrum analysis of the jobs–housing balance and self-containment of employment with mobile phone location big data," Environment and Planning B, vol. 45(3), pages 434-451, May.
    20. Chaogui Kang & Yu Liu & Diansheng Guo & Kun Qin, 2015. "A Generalized Radiation Model for Human Mobility: Spatial Scale, Searching Direction and Trip Constraint," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-11, November.
