Printed from https://ideas.repec.org/p/osf/osfxxx/h29qv.html

Cross-domain Visual Exploration of Academic Corpora via the Latent Meaning of User-authored Keywords

Author

Listed:
  • Benito-Santos, Alejandro
  • Theron, Roberto

Abstract

Scholars now dedicate a substantial amount of their work to querying and browsing increasingly large collections of research papers on the Internet. In parallel, the recent surge of interdisciplinary approaches in science requires scholars to acquire competencies in new fields for which they may lack the vocabulary needed to formulate adequate queries. This problem, together with information overload, poses new challenges in natural language processing (NLP) and visualization design that call for a rapid response from the scientific community. In this respect, we report on a novel visualization scheme that enables the exploration of research paper collections through the analysis of semantic proximity relationships found in author-assigned keywords. Our proposal replaces traditional string queries with a bag-of-words (BoW) extracted from a user-generated auxiliary corpus that captures the intentionality of the research. Building on previous work, we combine recent advances in NLP with visual network analysis techniques to offer scholars a perspective on the target corpus that better fits their research needs. To highlight the advantages of our proposal, we conduct two experiments employing a collection of visualization research papers and an auxiliary cross-domain BoW. We showcase how our visualization can maximize the effectiveness of a browsing session by enhancing the language acquisition task, which allows an effective extraction of knowledge that is in line with the users’ prior expectations.

Suggested Citation

  • Benito-Santos, Alejandro & Theron, Roberto, 2019. "Cross-domain Visual Exploration of Academic Corpora via the Latent Meaning of User-authored Keywords," OSF Preprints h29qv, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:h29qv
    DOI: 10.31219/osf.io/h29qv

    Download full text from publisher

    File URL: https://osf.io/download/5cd03e16103390001a9caba9/
    Download Restriction: no


    Citations

    Cited by:

    1. Lu Huang & Yijie Cai & Erdong Zhao & Shengting Zhang & Yue Shu & Jiao Fan, 2022. "Measuring the interdisciplinarity of Information and Library Science interactions using citation analysis and semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6733-6761, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ding, Ying, 2011. "Community detection: Topological vs. topical," Journal of Informetrics, Elsevier, vol. 5(4), pages 498-514.
    2. Marie Katsurai & Shunsuke Ono, 2019. "TrendNets: mapping emerging research trends from dynamic co-word networks via sparse representation," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1583-1598, December.
    3. Andrej Kastrin & Dimitar Hristovski, 2021. "Scientometric analysis and knowledge mapping of literature-based discovery (1986–2020)," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1415-1451, February.
    4. Choudhury, Nazim & Faisal, Fahim & Khushi, Matloob, 2020. "Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction," Journal of Informetrics, Elsevier, vol. 14(3).
    5. Seyedmohammadreza Hosseini & Hamed Baziyad & Rasoul Norouzi & Sheida Jabbedari Khiabani & Győző Gidófalvi & Amir Albadvi & Abbas Alimohammadi & Seyedehsan Seyedabrishami, 2021. "Mapping the intellectual structure of GIS-T field (2008–2019): a dynamic co-word analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 2667-2688, April.
    6. Jason Portenoy & Jevin D. West, 2020. "Constructing and evaluating automated literature review systems," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 3233-3251, December.
    7. Jung, Sukhwan & Yoon, Wan Chul, 2020. "An alternative topic model based on Common Interest Authors for topic evolution analysis," Journal of Informetrics, Elsevier, vol. 14(3).
    8. Shaikh Moksadur Rahman, 2020. "Relationship between Job Satisfaction and Turnover Intention: Evidence from Bangladesh," Asian Business Review, Asian Business Consortium, vol. 10(2), pages 99-108.
    9. Irina Wedel & Michael Palk & Stefan Voß, 2022. "A Bilingual Comparison of Sentiment and Topics for a Product Event on Twitter," Information Systems Frontiers, Springer, vol. 24(5), pages 1635-1646, October.
    10. Miao, Lijuan & Sun, Zhanli & Ren, Yanjun & Schierhorn, Florian & Müller, Daniel, 2021. "Grassland greening on the Mongolian Plateau despite higher grazing intensity," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 32(2), pages 792-802.
    11. Roberto Cerchione & Emilio Esposito & Maria Rosaria Spadaro, 2015. "The Spread of Knowledge Management in SMEs: A Scenario in Evolution," Sustainability, MDPI, vol. 7(8), pages 1-23, July.
    12. Wang Kai, 2019. "Towards a Taxonomy of Idea Generation Techniques," Foundations of Management, Sciendo, vol. 11(1), pages 65-80, January.
    13. Bridgelall, Raj & Stubbing, Edward, 2021. "Forecasting the effects of autonomous vehicles on land use," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    14. Provan, David J. & Woods, David D. & Dekker, Sidney W.A. & Rae, Andrew J., 2020. "Safety II professionals: How resilience engineering can transform safety practice," Reliability Engineering and System Safety, Elsevier, vol. 195(C).
    15. Bevilacqua, Maurizio & Ciarapica, Filippo Emanuele, 2018. "Human factor risk management in the process industry: A case study," Reliability Engineering and System Safety, Elsevier, vol. 169(C), pages 149-159.
    16. Geeraert, Joke & Rocha, Luis E.C. & Vandeviver, Christophe, 2024. "The impact of violent behavior on co-offender selection: Evidence of behavioral homophily," Journal of Criminal Justice, Elsevier, vol. 94(C).
    17. Kirathimo Muruga & Tatjana Vasiljeva, 2021. "Physicians' Dual Practice: A Theoretical Approach," Central European Business Review, Prague University of Economics and Business, vol. 2021(5), pages 1-20.
    18. Naveena Prakasam & Louisa Huxtable-Thomas, 2021. "Reddit: Affordances as an Enabler for Shifting Loyalties," Information Systems Frontiers, Springer, vol. 23(3), pages 723-751, June.
    19. Colin Jerolmack & Alexandra K. Murphy, 2019. "The Ethical Dilemmas and Social Scientific Trade-offs of Masking in Ethnography," Sociological Methods & Research, vol. 48(4), pages 801-827, November.
    20. Valeriy Makarov & Albert Bakhtizin, 2014. "The Estimation Of The Regions’ Efficiency Of The Russian Federation Including The Intellectual Capital, The Characteristics Of Readiness For Innovation, Level Of Well-Being, And Quality Of Life," Economy of region, Centre for Economic Security, Institute of Economics of Ural Branch of Russian Academy of Sciences, vol. 1(4), pages 9-30.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.