Printed from https://ideas.repec.org/a/gam/jpubli/v9y2021i2p15-d530934.html

The Effect of Article Characteristics on Citation Number in a Diachronic Dataset of the Biomedical Literature on Chronic Inflammation: An Analysis by Ensemble Machines

Author

Listed:
  • Carlo Galli

    (Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy)

  • Stefano Guizzardi

    (Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy)

Abstract

Citations are core metrics for gauging the relevance of scientific literature, so identifying features that can predict a high citation count is of primary importance. For the present study, we generated a dataset of 121,640 publications on chronic inflammation from the Scopus database, published between 1951 and 2021, with data such as title, authors, journal, publication date, document type, access type and citation count. We then computed title length, author count, title sentiment score, and the number of colons, semicolons and question marks in the title, and used these data as predictors in Gradient Boosting, Bagging and Random Forest regressors and classifiers. We were able to train these machines on the data, and Gradient Boosting achieved an F1 score of 0.552 on classification. The models agreed that document type, access type and number of authors were the best predictors, followed by title length.
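The derived predictors named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `title_features`, the choice to measure title length in characters, and the `;` author separator are all assumptions made for illustration. The title sentiment score is left out because the abstract does not say how it was computed.

```python
def title_features(title: str, authors: str, author_sep: str = ";") -> dict:
    """Compute simple title and author features of the kind listed in the abstract.

    Assumptions (not taken from the paper): title length is counted in
    characters, and author names are separated by `author_sep`.
    """
    # Split the author field and drop empty fragments left by stray separators.
    names = [part.strip() for part in authors.split(author_sep) if part.strip()]
    return {
        "title_length": len(title),            # characters (assumed unit)
        "n_colons": title.count(":"),
        "n_semicolons": title.count(";"),
        "n_question_marks": title.count("?"),
        "n_authors": len(names),
    }


if __name__ == "__main__":
    # Hypothetical record, used only to show the call.
    features = title_features(
        "Chronic inflammation: friend or foe?",
        "Galli C.; Guizzardi S.",
    )
    print(features)
```

Each record then becomes one row of numeric predictors that, together with categorical fields such as document type and access type, could be passed to an ensemble regressor or classifier.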

Suggested Citation

  • Carlo Galli & Stefano Guizzardi, 2021. "The Effect of Article Characteristics on Citation Number in a Diachronic Dataset of the Biomedical Literature on Chronic Inflammation: An Analysis by Ensemble Machines," Publications, MDPI, vol. 9(2), pages 1-11, April.
  • Handle: RePEc:gam:jpubli:v:9:y:2021:i:2:p:15-:d:530934

    Download full text from publisher

    File URL: https://www.mdpi.com/2304-6775/9/2/15/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2304-6775/9/2/15/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Feng Guo & Chao Ma & Qingling Shi & Qingqing Zong, 2018. "Succinct effect or informative effect: the relationship between title length and the number of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1531-1539, September.
    2. Lakshmi Balachandran Nair & Michael Gibbert, 2016. "What makes a ‘good’ title and (how) does it matter for citations? A review and general model of article title attributes in management science," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1331-1359, June.
    3. Clemens Blümel & Stephan Gauch, 2021. "Introduction to special issue: quantitative studies of science in Germany," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9641-9647, December.
    4. Kong, Ling & Wang, Dongbo, 2020. "Comparison of citations and attention of cover and non-cover papers," Journal of Informetrics, Elsevier, vol. 14(4).
    5. Vanclay, Jerome K., 2013. "Factors affecting citation rates in environmental science," Journal of Informetrics, Elsevier, vol. 7(2), pages 265-271.
    6. Sato, Ryoma & Yamada, Makoto & Kashima, Hisashi, 2022. "Poincare: Recommending Publication Venues via Treatment Effect Estimation," Journal of Informetrics, Elsevier, vol. 16(2).
    7. Sergio Copiello, 2019. "The open access citation premium may depend on the openness and inclusiveness of the indexing database, but the relationship is controversial because it is ambiguous where the open access boundary lie," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 995-1018, November.
    8. Ruth Zárate-Rueda & Yolima Ivonne Beltrán-Villamizar & Daniella Murallas-Sánchez, 2021. "Social representations of socioenvironmental dynamics in extractive ecosystems and conservation practices with sustainable development: a bibliometric analysis," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 23(11), pages 16428-16453, November.
    9. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    10. Andrea Fronzetti Colladon & Ciriaco Andrea D’Angelo & Peter A. Gloor, 2020. "Predicting the future success of scientific publications through social network and semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 357-377, July.
    11. Iman Tahamtan & Askar Safipour Afshar & Khadijeh Ahamdzadeh, 2016. "Factors affecting number of citations: a comprehensive review of the literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1195-1225, June.
    12. Shengzhi Huang & Jiajia Qian & Yong Huang & Wei Lu & Yi Bu & Jinqing Yang & Qikai Cheng, 2022. "Disclosing the relationship between citation structure and future impact of a publication," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(7), pages 1025-1042, July.
    13. Ruan, Xuanmin & Zhu, Yuanyang & Li, Jiang & Cheng, Ying, 2020. "Predicting the citation counts of individual papers via a BP neural network," Journal of Informetrics, Elsevier, vol. 14(3).
    14. Dehdarirad, Tahereh & Nasini, Stefano, 2017. "Research impact in co-authorship networks: a two-mode analysis," Journal of Informetrics, Elsevier, vol. 11(2), pages 371-388.
    15. Hale Turhan Damar & Ozlem Bilik & Guzin Ozdagoglu & Askin Ozdagoglu & Muhammet Damar, 2018. "Evaluating the nursing academicians in Turkey in the scope of Web of Science: scientometrics of original articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 539-562, April.
    16. Fan, Lingxu & Guo, Lei & Wang, Xinhua & Xu, Liancheng & Liu, Fangai, 2022. "Does the author’s collaboration mode lead to papers’ different citation impacts? An empirical analysis based on propensity score matching," Journal of Informetrics, Elsevier, vol. 16(4).
    17. Yunxue Cui & Yongzhen Wang & Xiaozhong Liu & Xianwen Wang & Xuhong Zhang, 2023. "Multidimensional scholarly citations: Characterizing and understanding scholars' citation behaviors," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(1), pages 115-127, January.
    18. Gian Maria Campedelli, 2021. "Where are we? Using Scopus to map the literature at the intersection between artificial intelligence and research on crime," Journal of Computational Social Science, Springer, vol. 4(2), pages 503-530, November.
    19. Fosso Wamba, Samuel & Bawack, Ransome Epie & Guthrie, Cameron & Queiroz, Maciel M. & Carillo, Kevin Daniel André, 2021. "Are we preparing for a good AI society? A bibliometric review and research agenda," Technological Forecasting and Social Change, Elsevier, vol. 164(C).
    20. Saarela, Mirka & Kärkkäinen, Tommi, 2020. "Can we automate expert-based journal rankings? Analysis of the Finnish publication indicator," Journal of Informetrics, Elsevier, vol. 14(2).
