Printed from https://ideas.repec.org/a/eee/infome/v11y2017i1p152-163.html

Can we use Google Scholar to identify highly-cited documents?

Author

Listed:
  • Martin-Martin, Alberto
  • Orduna-Malea, Enrique
  • Harzing, Anne-Wil
  • Delgado López-Cózar, Emilio

Abstract

The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950–2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1000 per year). The strong correlation between a document’s citations and its position in the search results (r=−0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively. This, combined with Google Scholar’s unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research relating to the identification of the most influential scientific documents. We find evidence, however, that Google Scholar ranks documents whose language (or geographical web domain) matches the user’s interface language higher than would be expected based on citations. Nonetheless, this language effect and other factors related to Google Scholar’s operation, i.e. the proper identification of versions and the date of publication, only have an incidental impact. They do not compromise the ability of Google Scholar to identify highly-cited papers.
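
The two figures in the abstract (the sample size and the rank/citation correlation) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say which correlation coefficient or data transformation was used, so a plain Pearson coefficient is assumed here, and the data are synthetic. The names `pearson_r`, `sample_size`, and `synthetic_results` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def sample_size(first_year, last_year, per_year):
    """Documents collected when each year (inclusive) contributes per_year results."""
    return (last_year - first_year + 1) * per_year


def synthetic_results(n, seed=0):
    """Hypothetical result list: citations tend to fall as rank position grows.

    This is invented data for illustration, not Google Scholar output.
    """
    rng = random.Random(seed)
    positions = list(range(1, n + 1))
    citations = [max(0.0, (n - p) * 10 + rng.gauss(0, 500)) for p in positions]
    return positions, citations


if __name__ == "__main__":
    # 1950 to 2013 inclusive is 64 years, each contributing 1000 documents.
    print(sample_size(1950, 2013, 1000))

    positions, citations = synthetic_results(1000)
    # A negative coefficient means higher-cited documents sit nearer the top
    # (smaller position numbers), which is the pattern the abstract reports.
    print(round(pearson_r(positions, citations), 2))
```

The sign matters more than the magnitude in this sketch: position 1 is the top result, so a ranking that favours highly-cited documents produces a negative coefficient, as with the r=−0.67 reported above.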

Suggested Citation

  • Martin-Martin, Alberto & Orduna-Malea, Enrique & Harzing, Anne-Wil & Delgado López-Cózar, Emilio, 2017. "Can we use Google Scholar to identify highly-cited documents?," Journal of Informetrics, Elsevier, vol. 11(1), pages 152-163.
  • Handle: RePEc:eee:infome:v:11:y:2017:i:1:p:152-163
    DOI: 10.1016/j.joi.2016.11.008

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S175115771630298X
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Judit Bar-Ilan, 2010. "Citations to the “Introduction to informetrics” indexed by WOS, Scopus and Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(3), pages 495-506, March.
    2. Anne-Wil Harzing, 2013. "A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1057-1075, March.
    3. Anne-Wil Harzing & Satu Alakangas, 2016. "Google Scholar, Scopus and the Web of Science: a longitudinal and cross-disciplinary comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(2), pages 787-804, February.
    4. Declan Butler, 2004. "Science searches shift up a gear as Google starts Scholar engine," Nature, Nature, vol. 432(7016), pages 423-423, November.
    5. Joost C. F. Winter & Amir A. Zadpoor & Dimitra Dodou, 2014. "The expansion of Google Scholar versus Web of Science: a longitudinal study," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 1547-1565, February.
    6. Isidro F. Aguillo, 2012. "Is Google Scholar useful for bibliometrics? A webometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 343-351, May.
    7. Kayvan Kousha & Mike Thelwall & Somayeh Rezaie, 2011. "Assessing the citation impact of books: The role of Google Books, Google Scholar, and Scopus," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(11), pages 2147-2164, November.
    8. Anne-Wil Harzing, 2014. "A longitudinal study of Google Scholar coverage between 2012 and 2013," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(1), pages 565-575, January.
    9. Jim Giles, 2005. "Start your engines," Nature, Nature, vol. 438(7068), pages 554-555, December.
    10. Kayvan Kousha & Mike Thelwall, 2008. "Sources of Google Scholar citations outside the Science Citation Index: A comparison between four science disciplines," Scientometrics, Springer;Akadémiai Kiadó, vol. 74(2), pages 273-294, February.
    11. Bornmann, Lutz & Marx, Werner & Schier, Hermann & Rahm, Erhard & Thor, Andreas & Daniel, Hans-Dieter, 2009. "Convergent validity of bibliometric Google Scholar data in the field of chemistry—Citation counts for papers that were accepted by Angewandte Chemie International Edition or rejected but published els," Journal of Informetrics, Elsevier, vol. 3(1), pages 27-35.
    12. Lokman I. Meho & Kiduk Yang, 2007. "Impact of data sources on citation counts and rankings of LIS faculty: Web of science versus scopus and google scholar," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 58(13), pages 2105-2125, November.
    13. Bar-Ilan, Judit & Levene, Mark & Lin, Ayelet, 2007. "Some measures for comparing citation databases," Journal of Informetrics, Elsevier, vol. 1(1), pages 26-34.
    14. Franceschini, Fiorenzo & Maisano, Domenico & Mastrogiacomo, Luca, 2016. "Empirical analysis and classification of database errors in Scopus and Web of Science," Journal of Informetrics, Elsevier, vol. 10(4), pages 933-953.
    15. Richard Van Noorden, 2014. "Online collaboration: Scientists and the social network," Nature, Nature, vol. 512(7513), pages 126-129, August.
    16. Kayvan Kousha & Mike Thelwall & Somayeh Rezaie, 2011. "Assessing the citation impact of books: The role of Google Books, Google Scholar, and Scopus," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(11), pages 2147-2164, November.

    Citations



    Cited by:

    1. Jialiang Lin & Yao Yu & Yu Zhou & Zhiyang Zhou & Xiaodong Shi, 2020. "How many preprints have actually been printed and why: a case study of computer science preprints on arXiv," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 555-574, July.
    2. Michael Gusenbauer, 2019. "Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 177-214, January.
    3. Martín-Martín, Alberto & Orduna-Malea, Enrique & Delgado López-Cózar, Emilio, 2018. "Author-level metrics in the new academic profile platforms: The online behaviour of the Bibliometrics community," Journal of Informetrics, Elsevier, vol. 12(2), pages 494-509.
    4. Maria Siranova & Karol Zelenak, 2023. "Every crisis does matter: Comparing the databases of financial crisis events," Review of International Economics, Wiley Blackwell, vol. 31(2), pages 652-686, May.
    5. Martín-Martín, Alberto & Costas, Rodrigo & van Leeuwen, Thed & Delgado López-Cózar, Emilio, 2018. "Evidence of open access of scientific publications in Google Scholar: A large-scale analysis," Journal of Informetrics, Elsevier, vol. 12(3), pages 819-841.
    6. Nataliya N. Matveeva & Oleg V. Poldin, 2017. "How Network Characteristics of Researchers Relate to Their Citation Indicators – a Co-Authorship Network Analysis Based on Google Scholar," HSE Working papers WP BRP 44/EDU/2017, National Research University Higher School of Economics.
    7. Weiwei Yan & Yin Zhang & Wendy Bromfield, 2018. "Analyzing the follower–followee ratio to determine user characteristics and institutional participation differences among research universities on ResearchGate," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 299-316, April.
    8. Manjula Wijewickrema, 2021. "Authors’ perception on abstracting and indexing databases in different subject domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 3063-3089, April.
    9. Alberto Martín-Martín & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2018. "A novel method for depicting academic disciplines through Google Scholar Citations: The case of Bibliometrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 1251-1273, March.
    10. Bernd W. Wirtz & Jan C. Weyerer & Marcel Becker & Wilhelm M. Müller, 2022. "Open government data: A systematic literature review of empirical research," Electronic Markets, Springer;IIM University of St. Gallen, vol. 32(4), pages 2381-2404, December.
    11. Casey Eaton & Amanda Banks & Kristin Weger & Bryan Mesmer & Robert Moreland, 2023. "Understanding perceived influencers on project outcomes and quantifying disciplinary similarities in academic literature," Systems Research and Behavioral Science, Wiley Blackwell, vol. 40(3), pages 460-487, May.
    12. Alberto Martín-Martín & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2018. "Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 2175-2188, September.
    13. de Carvalho, Gustavo Dambiski Gomes & Sokulski, Carla Cristiane & da Silva, Wesley Vieira & de Carvalho, Hélio Gomes & de Moura, Rafael Vignoli & de Francisco, Antonio Carlos & da Veiga, Claudimar Per, 2020. "Bibliometrics and systematic reviews: A comparison between the Proknow-C and the Methodi Ordinatio," Journal of Informetrics, Elsevier, vol. 14(3).
    14. Bo-Christer Björk & Sari Kanto-Karvonen & J. Tuomas Harviainen, 2020. "How Frequently Are Articles in Predatory Open Access Journals Cited," Publications, MDPI, vol. 8(2), pages 1-12, March.
    15. Thelwall, Mike, 2018. "Dimensions: A competitor to Scopus and the Web of Science?," Journal of Informetrics, Elsevier, vol. 12(2), pages 430-435.
    16. Andrea Potgieter & Chris Rensleigh, 2019. "There's a Google Scholar Alert for that: An integrative review methodology exploring mobile app features through Leximancer," Proceedings of Business and Management Conferences 8511148, International Institute of Social and Economic Sciences.
    17. Vivek Kumar Singh & Satya Swarup Srichandan & Hiran H. Lathabai, 2022. "ResearchGate and Google Scholar: how much do they differ in publications, citations and different metrics and why?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(3), pages 1515-1542, March.
    18. Alberto Martín-Martín & Mike Thelwall & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2021. "Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 871-906, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Halevi, Gali & Moed, Henk & Bar-Ilan, Judit, 2017. "Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation—Review of the Literature," Journal of Informetrics, Elsevier, vol. 11(3), pages 823-834.
    2. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    3. Moed, Henk F. & Bar-Ilan, Judit & Halevi, Gali, 2016. "A new methodology for comparing Google Scholar and Scopus," Journal of Informetrics, Elsevier, vol. 10(2), pages 533-551.
    4. Sergio Copiello, 2019. "The open access citation premium may depend on the openness and inclusiveness of the indexing database, but the relationship is controversial because it is ambiguous where the open access boundary lie," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 995-1018, November.
    5. Joost C. F. Winter & Amir A. Zadpoor & Dimitra Dodou, 2014. "The expansion of Google Scholar versus Web of Science: a longitudinal study," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 1547-1565, February.
    6. Anne-Wil Harzing, 2013. "A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1057-1075, March.
    7. Martín-Martín, Alberto & Orduna-Malea, Enrique & Thelwall, Mike & Delgado López-Cózar, Emilio, 2018. "Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories," Journal of Informetrics, Elsevier, vol. 12(4), pages 1160-1177.
    8. Antonio Cavacini, 2015. "What is the best database for computer science journal articles?," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2059-2071, March.
    9. Hamid R. Jamali & Majid Nabavi, 2015. "Open access and sources of full-text articles in Google Scholar in different subject fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 1635-1651, December.
    10. Michael Gusenbauer, 2019. "Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 177-214, January.
    11. Alberto Martín-Martín & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2018. "Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 2175-2188, September.
    12. Enrique Orduna-Malea & Juan M. Ayllón & Alberto Martín-Martín & Emilio Delgado López-Cózar, 2015. "Methods for estimating the size of Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 931-949, September.
    13. Enrique Orduna-Malea & Selenay Aytac & Clara Y. Tran, 2019. "Universities through the eyes of bibliographic databases: a retroactive growth comparison of Google Scholar, Scopus and Web of Science," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 433-450, October.
    14. Michael Gusenbauer, 2022. "Search where you will find most: Comparing the disciplinary coverage of 56 bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2683-2745, May.
    15. Cristòfol Rovira & Lluís Codina & Frederic Guerrero-Solé & Carlos Lopezosa, 2019. "Ranking by Relevance and Citation Counts, a Comparative Study: Google Scholar, Microsoft Academic, WoS and Scopus," Future Internet, MDPI, vol. 11(9), pages 1-21, September.
    16. Cristòfol Rovira & Lluís Codina & Carlos Lopezosa, 2021. "Language Bias in the Google Scholar Ranking Algorithm," Future Internet, MDPI, vol. 13(2), pages 1-17, January.
    17. Vivek Kumar Singh & Satya Swarup Srichandan & Hiran H. Lathabai, 2022. "ResearchGate and Google Scholar: how much do they differ in publications, citations and different metrics and why?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(3), pages 1515-1542, March.
    18. Zhang, Chengzhi & Zhou, Qingqing, 2020. "Assessing books’ depth and breadth via multi-level mining on tables of contents," Journal of Informetrics, Elsevier, vol. 14(2).
    19. Massimo Franceschet, 2010. "A comparison of bibliometric indicators for computer science scholars and journals on Web of Science and Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(1), pages 243-258, April.
    20. John Mingers & Martin Meyer, 2017. "Normalizing Google Scholar data for use in research evaluation," Scientometrics, Springer;Akadémiai Kiadó, vol. 112(2), pages 1111-1121, August.
