IDEAS home Printed from https://ideas.repec.org/a/gam/jpubli/v9y2021i2p15-d530934.html

The Effect of Article Characteristics on Citation Number in a Diachronic Dataset of the Biomedical Literature on Chronic Inflammation: An Analysis by Ensemble Machines

Author

Listed:
  • Carlo Galli

    (Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy)

  • Stefano Guizzardi

    (Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy)

Abstract

Citations are a core metric for gauging the relevance of scientific literature, so identifying features that can predict a high citation count is of primary importance. For the present study, we generated a dataset of 121,640 publications on chronic inflammation, published between 1951 and 2021, from the Scopus database. The dataset contains data such as titles, authors, journal, publication date, document type, access type and citation count. From these records we further computed title length, author count, title sentiment score, and the number of colons, semicolons and question marks in the title, and we used these features as predictors in Gradient Boosting, Bagging and Random Forest regressors and classifiers. We were able to train these models on the data, and Gradient Boosting achieved an F1 score of 0.552 on classification. The models agreed that document type, access type and number of authors were the best predictors, followed by title length.
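The title-derived predictors named in the abstract, and the F1 score used to report classification performance, can be illustrated with a minimal sketch. This is not the authors' code: the names `TitleFeatures`, `extract_features` and `f1_score` are hypothetical, title length is assumed to mean both character and word counts (the abstract does not specify), authors are assumed to arrive as an already-split list, and the sentiment score is omitted because the abstract does not say how it was computed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TitleFeatures:
    """Hypothetical container for the per-article predictors."""
    title_length_chars: int
    title_length_words: int
    n_colons: int
    n_semicolons: int
    n_question_marks: int
    author_count: int


def extract_features(title: str, authors: list[str]) -> TitleFeatures:
    """Compute simple title and author features for one record.

    Assumption: title length is measured in characters and in
    whitespace-separated words; the source does not state which.
    """
    return TitleFeatures(
        title_length_chars=len(title),
        title_length_words=len(title.split()),
        n_colons=title.count(":"),
        n_semicolons=title.count(";"),
        n_question_marks=title.count("?"),
        author_count=len(authors),
    )


def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    title = (
        "The Effect of Article Characteristics on Citation Number in a "
        "Diachronic Dataset of the Biomedical Literature on Chronic "
        "Inflammation: An Analysis by Ensemble Machines"
    )
    print(extract_features(title, ["Carlo Galli", "Stefano Guizzardi"]))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Feature rows of this kind, combined with encoded document type and access type, could then be passed to an ensemble learner; the choice of library and hyperparameters is left open here because the abstract does not describe them.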

Suggested Citation

  • Carlo Galli & Stefano Guizzardi, 2021. "The Effect of Article Characteristics on Citation Number in a Diachronic Dataset of the Biomedical Literature on Chronic Inflammation: An Analysis by Ensemble Machines," Publications, MDPI, vol. 9(2), pages 1-11, April.
  • Handle: RePEc:gam:jpubli:v:9:y:2021:i:2:p:15-:d:530934

    Download full text from publisher

    File URL: https://www.mdpi.com/2304-6775/9/2/15/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2304-6775/9/2/15/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Feng Guo & Chao Ma & Qingling Shi & Qingqing Zong, 2018. "Succinct effect or informative effect: the relationship between title length and the number of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1531-1539, September.
    2. Kong, Ling & Wang, Dongbo, 2020. "Comparison of citations and attention of cover and non-cover papers," Journal of Informetrics, Elsevier, vol. 14(4).
    3. Ruth Zárate-Rueda & Yolima Ivonne Beltrán-Villamizar & Daniella Murallas-Sánchez, 2021. "Social representations of socioenvironmental dynamics in extractive ecosystems and conservation practices with sustainable development: a bibliometric analysis," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 23(11), pages 16428-16453, November.
    4. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    5. Yunxue Cui & Yongzhen Wang & Xiaozhong Liu & Xianwen Wang & Xuhong Zhang, 2023. "Multidimensional scholarly citations: Characterizing and understanding scholars' citation behaviors," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(1), pages 115-127, January.
    6. Saarela, Mirka & Kärkkäinen, Tommi, 2020. "Can we automate expert-based journal rankings? Analysis of the Finnish publication indicator," Journal of Informetrics, Elsevier, vol. 14(2).
    7. Petr Praus, 2019. "High-ranked citations percentage as an indicator of publications quality," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 319-329, July.
    8. Jiang, Zhuoren & Lin, Tianqianjin & Huang, Cui, 2023. "Deep representation learning of scientific paper reveals its potential scholarly impact," Journal of Informetrics, Elsevier, vol. 17(1).
    9. Elizabeth S. Vieira, 2023. "The influence of research collaboration on citation impact: the countries in the European Innovation Scoreboard," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(6), pages 3555-3579, June.
    10. Chompunuch Saravudecha & Duangruthai Na Thungfai & Chananthida Phasom & Sodsri Gunta-in & Aorrakanya Metha & Peangkobfah Punyaphet & Tippawan Sookruay & Wannachai Sakuludomkan & Nut Koonrungsesomboon, 2023. "Hybrid Gold Open Access Citation Advantage in Clinical Medicine: Analysis of Hybrid Journals in the Web of Science," Publications, MDPI, vol. 11(2), pages 1-9, March.
    11. Tehmina Amjad & Nafeesa Shahid & Ali Daud & Asma Khatoon, 2022. "Citation burst prediction in a bibliometric network," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2773-2790, May.
    12. Xuechun Xiang & Jing Li, 2020. "A diachronic comparative study of research article titles in linguistics and literature journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 847-866, February.
    13. Zhijun LI & Jinfen XU, 2019. "The evolution of research article titles: the case of Journal of Pragmatics 1978–2018," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1619-1634, December.
    14. Li Hou & Qiang Wu & Yundong Xie, 2022. "Does early publishing in top journals really predict long-term scientific success in the business field?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6083-6107, November.
    15. Antonio Perianes-Rodríguez & Carlos Olmeda-Gómez, 2019. "Effects of journal choice on the visibility of scientific publications: a comparison between subscription-based and full Open Access models," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1737-1752, December.
    16. Lakshmi Balachandran Nair & Michael Gibbert, 2016. "What makes a ‘good’ title and (how) does it matter for citations? A review and general model of article title attributes in management science," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1331-1359, June.
    17. Clemens Blümel & Stephan Gauch, 2021. "Introduction to special issue: quantitative studies of science in Germany," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9641-9647, December.
    18. Vanclay, Jerome K., 2013. "Factors affecting citation rates in environmental science," Journal of Informetrics, Elsevier, vol. 7(2), pages 265-271.
    19. Sato, Ryoma & Yamada, Makoto & Kashima, Hisashi, 2022. "Poincare: Recommending Publication Venues via Treatment Effect Estimation," Journal of Informetrics, Elsevier, vol. 16(2).
    20. Sergio Copiello, 2019. "The open access citation premium may depend on the openness and inclusiveness of the indexing database, but the relationship is controversial because it is ambiguous where the open access boundary lies," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 995-1018, November.
