
Can we use Google Scholar to identify highly-cited documents?

Authors

  • Martin-Martin, Alberto
  • Orduna-Malea, Enrique
  • Harzing, Anne-Wil
  • Delgado López-Cózar, Emilio

Abstract

The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950–2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1000 per year). The strong correlation between a document’s citations and its position in the search results (r = −0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively. This, combined with Google Scholar’s unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research relating to the identification of the most influential scientific documents. We find evidence, however, that Google Scholar ranks documents whose language (or geographical web domain) matches the user’s interface language higher than would be expected based on citations. Nonetheless, this language effect and other factors related to Google Scholar’s operation, i.e. the proper identification of versions and the date of publication, have only an incidental impact. They do not compromise the ability of Google Scholar to identify highly-cited papers.
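
As an illustration of the kind of statistic reported in the abstract, the following minimal sketch computes a rank correlation between search-result position and citation count. The abstract gives r without naming the coefficient, so the use of Spearman's rank correlation here is an assumption, and the `positions` and `citations` values are invented for the example rather than taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data (NOT from the paper): position 1 is the top search result.
positions = [1, 2, 3, 4, 5, 6, 7, 8]
citations = [950, 1200, 610, 700, 320, 410, 150, 90]

rho = spearman(positions, citations)
print(f"rank correlation between position and citations: {rho:.2f}")
```

A negative coefficient means that documents with more citations tend to appear at smaller position numbers, that is, nearer the top of the results, which is the direction of the relationship the abstract describes.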

Suggested Citation

  • Martin-Martin, Alberto & Orduna-Malea, Enrique & Harzing, Anne-Wil & Delgado López-Cózar, Emilio, 2017. "Can we use Google Scholar to identify highly-cited documents?," Journal of Informetrics, Elsevier, vol. 11(1), pages 152-163.
  • Handle: RePEc:eee:infome:v:11:y:2017:i:1:p:152-163
    DOI: 10.1016/j.joi.2016.11.008

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S175115771630298X
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2016.11.008?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Bornmann, Lutz & Marx, Werner & Schier, Hermann & Rahm, Erhard & Thor, Andreas & Daniel, Hans-Dieter, 2009. "Convergent validity of bibliometric Google Scholar data in the field of chemistry—Citation counts for papers that were accepted by Angewandte Chemie International Edition or rejected but published els," Journal of Informetrics, Elsevier, vol. 3(1), pages 27-35.
    2. Bar-Ilan, Judit & Levene, Mark & Lin, Ayelet, 2007. "Some measures for comparing citation databases," Journal of Informetrics, Elsevier, vol. 1(1), pages 26-34.
    3. Alberto Martín-Martín & Enrique Orduna-Malea & Juan M. Ayllón & Emilio Delgado López-Cózar, 2016. "Back to the past: on the shoulders of an academic search engine giant," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1477-1487, June.
    4. Kayvan Kousha & Mike Thelwall & Somayeh Rezaie, 2011. "Assessing the citation impact of books: The role of Google Books, Google Scholar, and Scopus," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(11), pages 2147-2164, November.
    5. Franceschini, Fiorenzo & Maisano, Domenico & Mastrogiacomo, Luca, 2016. "Empirical analysis and classification of database errors in Scopus and Web of Science," Journal of Informetrics, Elsevier, vol. 10(4), pages 933-953.
    6. Anne-Wil Harzing & Satu Alakangas, 2016. "Google Scholar, Scopus and the Web of Science: a longitudinal and cross-disciplinary comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(2), pages 787-804, February.

    Citations

    Cited by:

    1. Jialiang Lin & Yao Yu & Yu Zhou & Zhiyang Zhou & Xiaodong Shi, 2020. "How many preprints have actually been printed and why: a case study of computer science preprints on arXiv," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 555-574, July.
    2. Martín-Martín, Alberto & Costas, Rodrigo & van Leeuwen, Thed & Delgado López-Cózar, Emilio, 2018. "Evidence of open access of scientific publications in Google Scholar: A large-scale analysis," Journal of Informetrics, Elsevier, vol. 12(3), pages 819-841.
    3. Manjula Wijewickrema, 2021. "Authors’ perception on abstracting and indexing databases in different subject domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 3063-3089, April.
    4. Alberto Martín-Martín & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2018. "A novel method for depicting academic disciplines through Google Scholar Citations: The case of Bibliometrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 1251-1273, March.
    5. Thelwall, Mike, 2018. "Dimensions: A competitor to Scopus and the Web of Science?," Journal of Informetrics, Elsevier, vol. 12(2), pages 430-435.
    6. Andrea Potgieter & Chris Rensleigh, 2019. "There's a Google Scholar Alert for that: An integrative review methodology exploring mobile app features through Leximancer," Proceedings of Business and Management Conferences 8511148, International Institute of Social and Economic Sciences.
    7. Alberto Martín-Martín & Mike Thelwall & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2021. "Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 871-906, January.
    8. Michael Gusenbauer, 2019. "Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 177-214, January.
    9. Martín-Martín, Alberto & Orduna-Malea, Enrique & Delgado López-Cózar, Emilio, 2018. "Author-level metrics in the new academic profile platforms: The online behaviour of the Bibliometrics community," Journal of Informetrics, Elsevier, vol. 12(2), pages 494-509.
    10. Nataliya N. Matveeva & Oleg V. Poldin, 2017. "How Network Characteristics of Researchers Relate to Their Citation Indicators – a Co-Authorship Network Analysis Based on Google Scholar," HSE Working papers WP BRP 44/EDU/2017, National Research University Higher School of Economics.
    11. Weiwei Yan & Yin Zhang & Wendy Bromfield, 2018. "Analyzing the follower–followee ratio to determine user characteristics and institutional participation differences among research universities on ResearchGate," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 299-316, April.
    12. Marko Zdravkovic & Joana Berger-Estilita & Bogdan Zdravkovic & David Berger, 2020. "Scientific quality of COVID-19 and SARS CoV-2 publications in the highest impact medical journals during the early phase of the pandemic: A case control study," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-16, November.
    13. Alberto Martín-Martín & Enrique Orduna-Malea & Emilio Delgado López-Cózar, 2018. "Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 2175-2188, September.
    14. de Carvalho, Gustavo Dambiski Gomes & Sokulski, Carla Cristiane & da Silva, Wesley Vieira & de Carvalho, Hélio Gomes & de Moura, Rafael Vignoli & de Francisco, Antonio Carlos & da Veiga, Claudimar Per, 2020. "Bibliometrics and systematic reviews: A comparison between the Proknow-C and the Methodi Ordinatio," Journal of Informetrics, Elsevier, vol. 14(3).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gordana Budimir & Sophia Rahimeh & Sameh Tamimi & Primož Južnič, 2021. "Comparison of self-citation patterns in WoS and Scopus databases based on national scientific production in Slovenia (1996–2020)," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(3), pages 2249-2267, March.
    2. Halevi, Gali & Moed, Henk & Bar-Ilan, Judit, 2017. "Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation—Review of the Literature," Journal of Informetrics, Elsevier, vol. 11(3), pages 823-834.
    3. Anne-Wil Harzing, 2013. "A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1057-1075, March.
    4. Sergio Copiello, 2019. "The open access citation premium may depend on the openness and inclusiveness of the indexing database, but the relationship is controversial because it is ambiguous where the open access boundary lie," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 995-1018, November.
    5. Massimo Aria & Michelangelo Misuraca & Maria Spano, 2020. "Mapping the Evolution of Social Research and Data Science on 30 Years of Social Indicators Research," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 149(3), pages 803-831, June.
    6. Houqiang Yu & Xueting Cao & Tingting Xiao & Zhenyi Yang, 2020. "How accurate are policy document mentions? A first look at the role of altmetrics database," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 1517-1540, November.
    7. Gerson Pech & Catarina Delgado, 2020. "Percentile and stochastic-based approach to the comparison of the number of citations of articles indexed in different bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(1), pages 223-252, April.
    8. Nisar Ahmad & Amjad Naveed & Shabbir Ahmad & Irfan Butt, 2020. "Banking Sector Performance, Profitability, And Efficiency: A Citation‐Based Systematic Literature Review," Journal of Economic Surveys, Wiley Blackwell, vol. 34(1), pages 185-218, February.
    9. Gerson Pech & Catarina Delgado, 2020. "Assessing the publication impact using citation data from both Scopus and WoS databases: an approach validated in 15 research fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 909-924, November.
    10. Joost C. F. Winter & Amir A. Zadpoor & Dimitra Dodou, 2014. "The expansion of Google Scholar versus Web of Science: a longitudinal study," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 1547-1565, February.
    11. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    12. Houqiang Yu & Xueting Cao & Tingting Xiao & Zhenyi Yang, 0. "How accurate are policy document mentions? A first look at the role of altmetrics database," Scientometrics, Springer;Akadémiai Kiadó, vol. 0, pages 1-24.
    13. Massimo Franceschet, 2010. "A comparison of bibliometric indicators for computer science scholars and journals on Web of Science and Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(1), pages 243-258, April.
    14. Marjolaine Gautret & Stefano Messori & André Jestin & Marina Bagni & Alain Boissy, 2017. "Development of a semi-automatic bibliometric system for publications on animal health and welfare: a methodological study," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(2), pages 803-823, November.
    15. Moed, Henk F. & Bar-Ilan, Judit & Halevi, Gali, 2016. "A new methodology for comparing Google Scholar and Scopus," Journal of Informetrics, Elsevier, vol. 10(2), pages 533-551.
    16. Daniel Torres-Salinas & Nicolás Robinson-García & Álvaro Cabezas-Clavijo & Evaristo Jiménez-Contreras, 2014. "Analyzing the citation characteristics of books: edited books, book series and publisher types in the book citation index," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(3), pages 2113-2127, March.
    17. Aurelia Magdalena Pisoschi & Claudia Gabriela Pisoschi, 2016. "Is open access the solution to increase the impact of scientific journals?," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 1075-1095, November.
    18. Gaviria-Marin, Magaly & Merigó, José M. & Baier-Fuentes, Hugo, 2019. "Knowledge management: A global examination based on bibliometric analysis," Technological Forecasting and Social Change, Elsevier, vol. 140(C), pages 194-220.
    19. Isidro F. Aguillo & Judit Bar-Ilan & Mark Levene & José Luis Ortega, 2010. "Comparing university rankings," Scientometrics, Springer;Akadémiai Kiadó, vol. 85(1), pages 243-256, October.
    20. Vivek Kumar Singh & Prashasti Singh & Mousumi Karmakar & Jacqueline Leta & Philipp Mayr, 2021. "The journal coverage of Web of Science, Scopus and Dimensions: A comparative analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(6), pages 5113-5142, June.
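
The description at the top of this list (items that cite the same works and are cited by the same works) can be sketched as a simple overlap score. This is only an illustration: the page does not say how the two signals are combined, so the unweighted sum below, the function names, and the sample data are all assumptions.

```python
def relatedness(item, other, references, cited_by):
    """Count works both items cite plus works that cite both items."""
    shared_refs = len(references.get(item, set()) & references.get(other, set()))
    shared_citers = len(cited_by.get(item, set()) & cited_by.get(other, set()))
    # Assumption: both signals are weighted equally.
    return shared_refs + shared_citers


def most_related(item, references, cited_by, top_n=20):
    """Return up to top_n other items, highest overlap score first."""
    candidates = (set(references) | set(cited_by)) - {item}
    scored = [(relatedness(item, c, references, cited_by), c) for c in candidates]
    scored = [(score, c) for score, c in scored if score > 0]
    # Sort by descending score, breaking ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in scored[:top_n]]


# Hypothetical citation data: item -> set of works it cites / is cited by.
references = {"A": {"r1", "r2", "r3"}, "B": {"r2", "r3"}, "C": {"r3"}, "D": {"r9"}}
cited_by = {"A": {"c1", "c2"}, "B": {"c1"}, "C": {"c1", "c2"}, "D": set()}

print(most_related("A", references, cited_by))
```

Items that share neither references nor citing works with the target (such as `"D"` in the sample data) are left out of the result.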
