
Identification and causal analysis of predatory open access journals based on interpretable machine learning

Author

Listed:
  • Jinhong Wu (Wuhan Textile University)
  • Tianye Liu (Wuhan Textile University)
  • Keliang Mu (Wuhan Textile University)
  • Lei Zhou (Wuhan Textile University)

Abstract

Predatory journals are a recent phenomenon that has drawn the academic community's attention over the last decade. As the open access (OA) movement has gained momentum, the indiscriminate growth of predatory journals has significantly harmed academic communication, scholarly publishing, and the effective use of scientific resources. This rampant growth threatens the healthy development of the OA movement and undermines the integrity of research and the research ecosystem. Identifying predatory journals among the massive number of OA journals would help scholars avoid losses in money, reputation, academic influence, and career advancement. Traditional identification methods have relied heavily on the knowledge of domain experts; however, many predatory journals exhibit latent and covert characteristics, and OA journals are proliferating so rapidly that experts cannot single out the predatory ones from the vast OA landscape. This paper proposes an interpretable machine learning model for early warning of predatory OA journals, which identifies predatory journals through an ensemble of multiple machine learning algorithms. Specifically, the proposed methodology first constructs an early warning indicator system for OA journals and integrates multiple machine learning algorithms to compute each journal's early warning value. The SHAP interpretability framework is then introduced to analyze, in a novel way, the causal factors behind the early warning risks. To verify the accuracy of the model's causal factors, we conduct a comparative case-study analysis of domestic and foreign medical OA journals. The empirical analysis conducted in this study demonstrates the efficacy of the ensemble algorithm in accurately identifying the risk of predatory OA journals.
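The abstract describes a two-step pipeline: an ensemble of classifiers produces an early warning value per journal, and SHAP then attributes that value to individual indicators. The sketch below illustrates that pattern in Python under stated assumptions: the indicator names, the toy data, the choice of base learners (random forest, gradient boosting, logistic regression), and the probability-averaging rule are all hypothetical stand-ins, since the paper's actual indicator system and algorithm mix are not reproduced here.

```python
# Minimal sketch of an ensemble early-warning score plus SHAP attribution.
# All feature names and model choices below are illustrative assumptions.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical journal-level warning indicators (not the paper's actual system).
FEATURES = ["apc_usd", "review_days", "self_citation_rate",
            "editorial_board_size", "issues_per_year"]

def ensemble_warning_scores(X_train, y_train, X_new):
    """Average the predicted predatory-risk probability of several base learners."""
    models = [
        RandomForestClassifier(n_estimators=200, random_state=0),
        GradientBoostingClassifier(random_state=0),
        LogisticRegression(max_iter=1000),
    ]
    for m in models:
        m.fit(X_train, y_train)
    # Early warning value = mean probability of the "predatory" class.
    probs = np.column_stack([m.predict_proba(X_new)[:, 1] for m in models])
    return probs.mean(axis=1), models

# Toy synthetic data standing in for a labelled sample of OA journals.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((400, len(FEATURES))), columns=FEATURES)
y = (X["self_citation_rate"] + 0.5 * X["apc_usd"] > 1.0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores, models = ensemble_warning_scores(X_tr, y_tr, X_te)

# SHAP attribution on the tree-based ensemble member: which indicators drive
# the warning values of the held-out journals.
explainer = shap.TreeExplainer(models[1])   # the GradientBoostingClassifier
shap_values = explainer.shap_values(X_te)   # shape: (n_journals, n_features)
mean_abs = np.abs(shap_values).mean(axis=0)
for name, value in sorted(zip(FEATURES, mean_abs), key=lambda t: -t[1]):
    print(f"{name}: {value:.3f}")
```

In this reading, the per-feature mean |SHAP| ranking plays the role of the paper's "causal factor" analysis: it shows which warning indicators push a journal's risk score up or down, although SHAP values are attributions rather than strict causal effects.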

Suggested Citation

  • Jinhong Wu & Tianye Liu & Keliang Mu & Lei Zhou, 2024. "Identification and causal analysis of predatory open access journals based on interpretable machine learning," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(4), pages 2131-2158, April.
  • Handle: RePEc:spr:scient:v:129:y:2024:i:4:d:10.1007_s11192-024-04969-6
    DOI: 10.1007/s11192-024-04969-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-024-04969-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-024-04969-6?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As access to this document is restricted, you may want to search for a different version of it.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mingkun Wei, 2020. "Research on impact evaluation of open access journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 1027-1049, February.
    2. Wolfgang Glänzel & Henk F. Moed, 2013. "Opinion paper: thoughts and facts on bibliometric indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(1), pages 381-394, July.
    3. Fiorenzo Franceschini & Domenico Maisano & Luca Mastrogiacomo, 2014. "The citer-success-index: a citer-based indicator to select a subset of elite papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 963-983, November.
    4. Laura Vana & Ronald Hochreiter & Kurt Hornik, 2016. "Computing a journal meta-ranking using paired comparisons and adaptive lasso estimators," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(1), pages 229-251, January.
    5. Dašić Predrag, 2015. "State and Analysis of Scientific Journals in the Field of “Economic Sciences” for the Period 1995-2014," Economic Themes, Sciendo, vol. 53(4), pages 547-581, December.
    6. Zhou, Ping & Leydesdorff, Loet, 2011. "Fractional counting of citations in research evaluation: A cross- and interdisciplinary assessment of the Tsinghua University in Beijing," Journal of Informetrics, Elsevier, vol. 5(3), pages 360-368.
    7. Henk F. Moed, 2016. "Comprehensive indicator comparisons intelligible to non-experts: the case of two SNIP versions," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(1), pages 51-65, January.
    8. Zahid Halim & Shafaq Khan, 2019. "A data science-based framework to categorize academic journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 393-423, April.
    9. Franceschini, Fiorenzo & Maisano, Domenico, 2014. "Sub-field normalization of the IEEE scientific journals based on their connection with Technical Societies," Journal of Informetrics, Elsevier, vol. 8(3), pages 508-533.
    10. Waltman, Ludo & van Eck, Nees Jan & van Leeuwen, Thed N. & Visser, Martijn S., 2013. "Some modifications to the SNIP journal impact indicator," Journal of Informetrics, Elsevier, vol. 7(2), pages 272-285.
    11. Lin Feng & Jian Zhou & Sheng-Lan Liu & Ning Cai & Jie Yang, 2020. "Analysis of journal evaluation indicators: an experimental study based on unsupervised Laplacian score," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 233-254, July.
    12. Fiorenzo Franceschini & Domenico Maisano & Luca Mastrogiacomo, 2015. "Influence of omitted citations on the bibliometric statistics of the major Manufacturing journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(3), pages 1083-1122, June.
    13. Chen, Kun & Ren, Xian-tong & Yang, Guo-liang, 2021. "A novel approach for assessing academic journals: Application of integer DEA model for management science and operations research field," Journal of Informetrics, Elsevier, vol. 15(3).
    14. Bouyssou, Denis & Marchant, Thierry, 2016. "Ranking authors using fractional counting of citations: An axiomatic approach," Journal of Informetrics, Elsevier, vol. 10(1), pages 183-199.
    15. David A. Pendlebury, 2019. "Charting a path between the simple and the false and the complex and unusable: Review of Henk F. Moed, Applied Evaluative Informetrics [in the series Qualitative and Quantitative Analysis of Scientifi," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 549-560, April.
    16. Rosenthal, Edward C. & Weiss, Howard J., 2017. "A data envelopment analysis approach for ranking journals," Omega, Elsevier, vol. 70(C), pages 135-147.
    17. Xiang Li & Chengli Zhao & Zhaolong Hu & Caixia Yu & Xiaojun Duan, 2022. "Revealing the character of journals in higher-order citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6315-6338, November.
    18. Deming Lin & Tianhui Gong & Wenbin Liu & Martin Meyer, 2020. "An entropy-based measure for the evolution of h index research," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2283-2298, December.
    19. Dejian Yu & Wanru Wang & Shuai Zhang & Wenyu Zhang & Rongyu Liu, 2017. "A multiple-link, mutually reinforced journal-ranking model to measure the prestige of journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(1), pages 521-542, April.
    20. Lina Xu & Steven Dellaportas & Zhiqiang Yang & Jin Wang, 2023. "More on the relationship between interdisciplinary accounting research and citation impact," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 63(4), pages 4779-4803, December.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.