
Machine learning misclassification of academic publications reveals non-trivial interdependencies of scientific disciplines

Authors

  • Alexey Lyutov (Jacobs University)
  • Yilmaz Uygun (Jacobs University)
  • Marc-Thorsten Hütt (Jacobs University)

Abstract

Exploring the production of knowledge with quantitative methods is the foundation of scientometrics. In an application of machine learning to scientometrics, we consider the classification problem of mapping academic publications to the subcategories of a multidisciplinary journal, and hence to scientific disciplines, based on the information contained in the abstract. In contrast to standard classification tasks, we are not interested in maximizing accuracy; rather, we ask whether the failures of an automatic classification are systematic and contain information about the system under investigation. These failures can be represented as a 'misclassification network' inter-relating scientific disciplines. Here we show that this misclassification network (1) gives a markedly different pattern of interdependencies among scientific disciplines than common 'maps of science', (2) reveals a statistical association between misclassification and citation frequencies, and (3) allows disciplines to be classified as 'method lenders' and 'content explorers', based on their in-degree out-degree asymmetry. More generally, in a wide range of machine learning applications, misclassification networks have the potential to extract systemic information from failed classifications, making it possible to visualize and quantitatively assess those aspects of a complex system that are not machine learnable.
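
As an illustration of the idea in the abstract, the following minimal Python sketch builds a directed misclassification network from a classifier's outputs and computes a weighted in-degree/out-degree asymmetry per discipline. It is not the authors' implementation. The edge direction (true discipline to predicted discipline), the unnormalized edge weights, the function names, and the toy labels are all assumptions made for this example; the abstract does not specify them, nor which sign of the asymmetry would correspond to 'method lenders' or 'content explorers'.

```python
from collections import Counter, defaultdict


def misclassification_network(true_labels, predicted_labels):
    """Count directed edges (true -> predicted) over misclassified items only.

    Assumption: edges point from the true discipline to the predicted one.
    """
    edges = Counter()
    for true, predicted in zip(true_labels, predicted_labels):
        if true != predicted:
            edges[(true, predicted)] += 1
    return edges


def degree_asymmetry(edges):
    """Return {node: (out_weight, in_weight, out_weight - in_weight)}."""
    out_weight = defaultdict(int)
    in_weight = defaultdict(int)
    for (source, target), weight in edges.items():
        out_weight[source] += weight
        in_weight[target] += weight
    nodes = set(out_weight) | set(in_weight)
    return {n: (out_weight[n], in_weight[n], out_weight[n] - in_weight[n])
            for n in nodes}


# Hypothetical toy data: six abstracts with true and predicted subcategories.
true_labels = ["physics", "physics", "biology",
               "biology", "chemistry", "chemistry"]
predicted_labels = ["physics", "chemistry", "physics",
                    "biology", "physics", "chemistry"]

edges = misclassification_network(true_labels, predicted_labels)
asymmetry = degree_asymmetry(edges)

for node in sorted(asymmetry):
    out_w, in_w, diff = asymmetry[node]
    print(f"{node}: out={out_w} in={in_w} asymmetry={diff}")
```

In a real study the counts would typically be normalized by the number of publications per subcategory before comparing disciplines; that step is omitted here to keep the sketch short.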

Suggested Citation

  • Alexey Lyutov & Yilmaz Uygun & Marc-Thorsten Hütt, 2021. "Machine learning misclassification of academic publications reveals non-trivial interdependencies of scientific disciplines," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1173-1186, February.
  • Handle: RePEc:spr:scient:v:126:y:2021:i:2:d:10.1007_s11192-020-03789-8
    DOI: 10.1007/s11192-020-03789-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-020-03789-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jiang, Hanchen & Qiang, Maoshan & Lin, Peng, 2016. "A topic modeling based bibliometric exploration of hydropower research," Renewable and Sustainable Energy Reviews, Elsevier, vol. 57(C), pages 226-237.
    2. Frank Havemann & Jochen Gläser & Michael Heinz, 2017. "Memetic search for overlapping topics based on a local evaluation of link communities," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1089-1118, May.
    3. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    4. Wu, Lingfei & Kittur, Aniket & Youn, Hyejin & Milojević, Staša & Leahey, Erin & Fiore, Stephen M. & Ahn, Yong-Yeol, 2022. "Metrics and mechanisms: Measuring the unmeasurable in the science of science," Journal of Informetrics, Elsevier, vol. 16(2).
    5. Juste Raimbault, 2019. "Exploration of an interdisciplinary scientific landscape," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 617-641, May.
    6. Yan, Erjia & Ding, Ying & Cronin, Blaise & Leydesdorff, Loet, 2013. "A bird's-eye view of scientific trading: Dependency relations among fields of science," Journal of Informetrics, Elsevier, vol. 7(2), pages 249-264.
    7. Andrea Bonaccorsi & Nicola Melluso & Francesco Alessandro Massucci, 2022. "Exploring the antecedents of interdisciplinarity at the European Research Council: a topic modeling approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 6961-6991, December.
    8. Keisuke Okamura, 2019. "Interdisciplinarity revisited: evidence for research impact and dynamism," Palgrave Communications, Palgrave Macmillan, vol. 5(1), pages 1-9, December.
    9. Christian Weismayer & Ilona Pezenka, 2017. "Identifying emerging research fields: a longitudinal latent semantic keyword analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(3), pages 1757-1785, December.
    10. Silva, F.N. & Rodrigues, F.A. & Oliveira, O.N. & da F. Costa, L., 2013. "Quantifying the interdisciplinarity of scientific journals and fields," Journal of Informetrics, Elsevier, vol. 7(2), pages 469-477.
    11. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    12. Li Hou & Qiang Wu & Yundong Xie, 2022. "Does early publishing in top journals really predict long-term scientific success in the business field?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6083-6107, November.
    13. Balland, Pierre-Alexandre & Boschma, Ron, 2022. "Do scientific capabilities in specific domains matter for technological diversification in European regions?," Research Policy, Elsevier, vol. 51(10).
    14. Gao, Qiang & Liang, Zhentao & Wang, Ping & Hou, Jingrui & Chen, Xiuxiu & Liu, Manman, 2021. "Potential index: Revealing the future impact of research topics based on current knowledge networks," Journal of Informetrics, Elsevier, vol. 15(3).
    15. María Pinto & Rosaura Fernández-Pascual & David Caballero-Mariscal & Dora Sales, 2020. "Information literacy trends in higher education (2006–2019): visualizing the emerging field of mobile information literacy," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1479-1510, August.
    16. Giovanni Abramo & Ciriaco Andrea D'Angelo & Flavia Costa, 2012. "Identifying interdisciplinarity through the disciplinary classification of coauthors of scientific publications," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(11), pages 2206-2222, November.
    17. Juan Miguel Campanario, 2018. "Are leaders really leading? Journals that are first in Web of Science subject categories in the context of their groups," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 111-130, April.
    18. Jianhua Hou, 2017. "Exploration into the evolution and historical roots of citation analysis by referenced publication year spectroscopy," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1437-1452, March.
    19. Silva, F.N. & Viana, M.P. & Travençolo, B.A.N. & Costa, L. da F., 2011. "Investigating relationships within and between category networks in Wikipedia," Journal of Informetrics, Elsevier, vol. 5(3), pages 431-438.
    20. John McLevey & Alexander V. Graham & Reid McIlroy-Young & Pierson Browne & Kathryn S. Plaisance, 2018. "Interdisciplinarity and insularity in the diffusion of knowledge: an analysis of disciplinary boundaries between philosophy of science and the sciences," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 331-349, October.
