Printed from https://ideas.repec.org/a/eee/infome/v16y2022i2s1751157722000128.html

On predicting research grants productivity via machine learning

Authors

  • Tohalino, Jorge A.V.
  • Amancio, Diego R.

Abstract

Understanding the reasons associated with successful proposals is of paramount importance to improve evaluation processes. In this context, we analyzed whether bibliometric features are able to predict the success of research grants. We extracted features aiming at characterizing the academic history of Brazilian researchers, including research topics, affiliations, number of publications and visibility. The extracted features were then used to predict grants productivity via machine learning in three major research areas, namely Medicine, Dentistry and Veterinary Medicine. We found that research subject and publication history play a role in predicting productivity. In addition, institution-based features turned out to be relevant when combined with other features. While the best results outperformed text-based attributes, the evaluated features were not highly discriminative. Our findings indicate that predicting grants success, at least with the considered set of bibliometric features, is not a trivial task.

Suggested Citation

  • Tohalino, Jorge A.V. & Amancio, Diego R., 2022. "On predicting research grants productivity via machine learning," Journal of Informetrics, Elsevier, vol. 16(2).
  • Handle: RePEc:eee:infome:v:16:y:2022:i:2:s1751157722000128
    DOI: 10.1016/j.joi.2022.101260

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157722000128
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2022.101260?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



    Citations

    Cited by:

    1. Hren, Darko & Pina, David G. & Norman, Christopher R. & Marušić, Ana, 2022. "What makes or breaks competitive research proposals? A mixed-methods analysis of research grant evaluation reports," Journal of Informetrics, Elsevier, vol. 16(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jorge A. V. Tohalino & Laura V. C. Quispe & Diego R. Amancio, 2021. "Analyzing the relationship between text features and grants productivity," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4255-4275, May.
    2. Corrêa, Edilson A. & Marinho, Vanessa Q. & Amancio, Diego R., 2020. "Semantic flow in language networks discriminates texts by genre and publication date," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 557(C).
    3. Ana C. M. Brito & Filipi N. Silva & Diego R. Amancio, 2023. "Analyzing the influence of prolific collaborations on authors productivity and visibility," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(4), pages 2471-2487, April.
    4. Xian Li & Ronald Rousseau & Liming Liang & Fangjie Xi & Yushuang Lü & Yifan Yuan & Xiaojun Hu, 2022. "Is low interdisciplinarity of references an unexpected characteristic of Nobel Prize winning research?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 2105-2122, April.
    5. Brito, Ana C.M. & Silva, Filipi N. & de Arruda, Henrique F. & Comin, Cesar H. & Amancio, Diego R. & Costa, Luciano da F., 2021. "Classification of abrupt changes along viewing profiles of scientific articles," Journal of Informetrics, Elsevier, vol. 15(2).
    6. Andrea Bonaccorsi & Nicola Melluso & Francesco Alessandro Massucci, 2022. "Exploring the antecedents of interdisciplinarity at the European Research Council: a topic modeling approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 6961-6991, December.
    7. Brito, Ana C.M. & Silva, Filipi N. & Amancio, Diego R., 2021. "Associations between author-level metrics in subsequent time periods," Journal of Informetrics, Elsevier, vol. 15(4).
    8. Shiji Chen & Clément Arsenault & Yves Gingras & Vincent Larivière, 2015. "Exploring the interdisciplinary evolution of a discipline: the case of Biochemistry and Molecular Biology," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(2), pages 1307-1323, February.
    9. Gregorio González-Alcaide, 2021. "Bibliometric studies outside the information science and library science field: uncontainable or uncontrollable?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6837-6870, August.
    10. Shiji Chen & Yanhui Song & Fei Shu & Vincent Larivière, 2022. "Interdisciplinarity and impact: the effects of the citation time window," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2621-2642, May.
    11. Edson Melo Souza & Jose Eduardo Storopoli & Wonder Alexandre Luz Alves, 2022. "Scientific Contribution List Categories Investigation: a comparison between three mainstream medical journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2249-2276, May.
    12. Jeong, Yoo Kyung & Xie, Qing & Yan, Erjia & Song, Min, 2020. "Examining drug and side effect relation using author–entity pair bipartite networks," Journal of Informetrics, Elsevier, vol. 14(1).
    13. Samreen Ayaz & Nayyer Masood & Muhammad Arshad Islam, 2018. "Predicting scientific impact based on h-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 993-1010, March.
    14. Adilson Vital & Diego R. Amancio, 2022. "A comparative analysis of local similarity metrics and machine learning approaches: application to link prediction in author citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(10), pages 6011-6028, October.
    15. Shiyun Wang & Yaxue Ma & Jin Mao & Yun Bai & Zhentao Liang & Gang Li, 2023. "Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(2), pages 150-167, February.
    16. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    17. Fei Shu & Jesse David Dinneen & Shiji Chen, 2022. "Measuring the disparity among scientific disciplines using Library of Congress Subject Headings," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3613-3628, June.
    18. Andrea Fronzetti Colladon & Ciriaco Andrea D’Angelo & Peter A. Gloor, 2020. "Predicting the future success of scientific publications through social network and semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 357-377, July.
    19. Chen, Shiji & Qiu, Junping & Arsenault, Clément & Larivière, Vincent, 2021. "Exploring the interdisciplinarity patterns of highly cited papers," Journal of Informetrics, Elsevier, vol. 15(1).
    20. Ferraz de Arruda, Henrique & Reia, Sandro Martinelli & Silva, Filipi Nascimento & Amancio, Diego Raphael & da Fontoura Costa, Luciano, 2022. "Finding contrasting patterns in rhythmic properties between prose and poetry," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 598(C).
