Printed from https://ideas.repec.org/a/spr/scient/v93y2012i3d10.1007_s11192-012-0810-x.html

Ten challenges in modeling bibliographic data for bibliometric analysis

Author

Listed:
  • Alfio Ferrara (Università degli Studi di Milano)
  • Silvia Salini (Università degli Studi di Milano)

Abstract

The complexity and variety of bibliographic data are growing, and efforts to define new methodologies and techniques for bibliometric analysis are intensifying. In this complex scenario, one of the most crucial issues is the quality of data and the capability of bibliometric analysis to cope with multiple data dimensions. Although the problem of enforcing a multidimensional approach to the analysis and management of bibliographic data is not new, a reference design pattern and a specific conceptual model for multidimensional analysis of bibliographic data are still missing. In this paper, we discuss ten of the most relevant challenges for bibliometric analysis when dealing with multidimensional data, and we propose a reference data model that, according to different goals, can help analysis designers and bibliographic experts work with large collections of bibliographic data.

Suggested Citation

  • Alfio Ferrara & Silvia Salini, 2012. "Ten challenges in modeling bibliographic data for bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(3), pages 765-785, December.
  • Handle: RePEc:spr:scient:v:93:y:2012:i:3:d:10.1007_s11192-012-0810-x
    DOI: 10.1007/s11192-012-0810-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-012-0810-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Jean-Francois Molinari & Alain Molinari, 2008. "A new methodology for ranking scientific institutions," Scientometrics, Springer;Akadémiai Kiadó, vol. 75(1), pages 163-174, April.
    2. Mallig, Nicolai, 2010. "A relational database for bibliometric analysis," Journal of Informetrics, Elsevier, vol. 4(4), pages 564-580.
    3. Hamish Coates, 2007. "Universities on the Catwalk: Models for Performance Ranking in Australia," Higher Education Management and Policy, OECD Publishing, vol. 19(2), pages 1-17.
    4. Michael Greenacre, 2008. "Correspondence analysis of raw data," Economics Working Papers 1112, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2009.
    5. Mallig, Nicolai, 2010. "A relational database for bibliometric analysis," Discussion Papers "Innovation Systems and Policy Analysis" 22, Fraunhofer Institute for Systems and Innovation Research (ISI).
    6. Emil Hudomalj & Gaj Vidmar, 2003. "OLAP and bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 58(3), pages 609-622, November.
    7. Marco Geraci & M. Degli Esposti, 2011. "Where do Italian universities stand? An in-depth statistical analysis of national and international rankings," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(3), pages 667-681, June.
    8. Dietmar Wolfram, 2006. "Applications of SQL for informetric frequency distribution processing," Scientometrics, Springer;Akadémiai Kiadó, vol. 67(2), pages 301-313, May.
    9. Scott Deerwester & Susan T. Dumais & George W. Furnas & Thomas K. Landauer & Richard Harshman, 1990. "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 41(6), pages 391-407, September.
    10. Yu, Hairong & Davis, Mari & Wilson, Concepción S. & Cole, Fletcher T.H., 2008. "Object-relational data modelling for informetric databases," Journal of Informetrics, Elsevier, vol. 2(3), pages 240-251.
    11. Massimo Franceschet, 2009. "A cluster analysis of scholar and journal bibliometric indicators," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 60(10), pages 1950-1964, October.
    12. Teh, Yee Whye & Jordan, Michael I. & Beal, Matthew J. & Blei, David M., 2006. "Hierarchical Dirichlet Processes," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1566-1581, December.
    13. Lokman I. Meho & Kiduk Yang, 2007. "Impact of data sources on citation counts and rankings of LIS faculty: Web of science versus scopus and google scholar," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 58(13), pages 2105-2125, November.
    14. Benito Bonito, Mónica & Romera Ayllón, María Rosario, 2011. "Improving quality assessment of composite indicators in university rankings: a case study of French and German universities of excellence," DES - Working Papers. Statistics and Econometrics. WS ws112015, Universidad Carlos III de Madrid. Departamento de Estadística.
    15. Ron S. Kenett & Silvia Salini, 2011. "Modern analysis of customer satisfaction surveys: comparison of models and integrated analysis," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 27(5), pages 465-475, September.
    16. Wolfgang Glänzel & András Schubert, 2003. "A new classification scheme of science fields and subfields designed for scientometric evaluation purposes," Scientometrics, Springer;Akadémiai Kiadó, vol. 56(3), pages 357-367, March.
    17. J. Hubert, 1977. "Bibliometric models for journal productivity," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 4(1), pages 441-473, January.
    18. M. Benito & R. Romera, 2011. "Improving quality assessment of composite indicators in university rankings: a case study of French and German universities of excellence," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(1), pages 153-176, October.
    19. Harvey Goldstein & David J. Spiegelhalter, 1996. "League Tables and Their Limitations: Statistical Issues in Comparisons of Institutional Performance," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 159(3), pages 385-409, May.

    Citations

    Cited by:

    1. Jeong, Yujin & Park, Inchae & Yoon, Byungun, 2019. "Identifying emerging Research and Business Development (R&BD) areas based on topic modeling and visualization with intellectual property right data," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 655-672.
    2. Chyi-Kwei Yau & Alan Porter & Nils Newman & Arho Suominen, 2014. "Clustering scientific documents with topic modeling," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 767-786, September.
    3. Sabine Loudcher & Wararat Jakawat & Edmundo Pavel Soriano Morales & Cécile Favre, 2015. "Combining OLAP and information networks for bibliographic data analysis: a survey," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 471-487, May.
    4. Massimo FLORIO & Francesco GIFFONI, 2019. "L’impatto sociale della produzione di scienza su larga scala: come governarlo?," Departmental Working Papers 2019-05, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
    5. Chen, Guo & Xiao, Lu, 2016. "Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods," Journal of Informetrics, Elsevier, vol. 10(1), pages 212-223.
    6. Francesca De Battisti & Silvia Salini, 2013. "Robust analysis of bibliometric data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(2), pages 269-283, June.
    7. Francesca De Battisti & Alfio Ferrara & Silvia Salini, 2015. "A decade of research in statistics: a topic model approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 413-433, May.
    8. Bornmann, Lutz, 2019. "Does the normalized citation impact of universities profit from certain properties of their published documents – such as the number of authors and the impact factor of the publishing journals? A mult," Journal of Informetrics, Elsevier, vol. 13(1), pages 170-184.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guillaume Cabanac, 2012. "Shaping the landscape of research in information systems from the perspective of editorial boards: A scientometric study of 77 leading journals," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(5), pages 977-996, May.
    2. Guillaume Cabanac, 2012. "Shaping the landscape of research in information systems from the perspective of editorial boards: A scientometric study of 77 leading journals," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(5), pages 977-996, May.
    3. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
    4. Guillaume Cabanac, 2013. "Experimenting with the partnership ability φ-index on a million computer scientists," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(1), pages 1-9, July.
    5. John Panaretos & Chrisovaladis Malesios, 2009. "Assessing scientific research performance and impact with single indices," Scientometrics, Springer;Akadémiai Kiadó, vol. 81(3), pages 635-670, December.
    6. Gagolewski, Marek, 2011. "Bibliometric impact assessment with R and the CITAN package," Journal of Informetrics, Elsevier, vol. 5(4), pages 678-692.
    7. Parul Khurana & Kiran Sharma, 2022. "Impact of h-index on author’s rankings: an improvement to the h-index for lower-ranked authors," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4483-4498, August.
    8. Loizides, Orestis-Stavros & Koutsakis, Polychronis, 2017. "On evaluating the quality of a computer science/computer engineering conference," Journal of Informetrics, Elsevier, vol. 11(2), pages 541-552.
    9. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    10. Giménez, Víctor & Thieme, Claudio & Prior, Diego & Tortosa-Ausina, Emili, 2022. "Evaluation and determinants of preschool effectiveness in Chile," Socio-Economic Planning Sciences, Elsevier, vol. 81(C).
    11. Petridis, Konstantinos & Malesios, Chrisovalantis & Arabatzis, Garyfallos & Thanassoulis, Emmanuel, 2013. "Efficiency analysis of forestry journals: Suggestions for improving journals’ quality," Journal of Informetrics, Elsevier, vol. 7(2), pages 505-521.
    12. Ruiz, Francisco & El Gibari, Samira & Cabello, José M. & Gómez, Trinidad, 2020. "MRP-WSCI: Multiple reference point based weak and strong composite indicators," Omega, Elsevier, vol. 95(C).
    13. Tokmachev, Andrey M., 2023. "Hidden scales in statistics of citation indicators," Journal of Informetrics, Elsevier, vol. 17(1).
    14. Rosalia Castellano & Antonella Rocca, 2015. "Assessing the gender gap in labour market index: volatility of results and reliability," International Journal of Social Economics, Emerald Group Publishing Limited, vol. 42(8), pages 749-772, August.
    15. Javier Ruiz-Castillo, 2012. "The evaluation of citation distributions," SERIEs: Journal of the Spanish Economic Association, Springer;Spanish Economic Association, vol. 3(1), pages 291-310, March.
    16. Mallig, Nicolai, 2010. "A relational database for bibliometric analysis," Journal of Informetrics, Elsevier, vol. 4(4), pages 564-580.
    17. Fernanda Morillo & Ignacio Santabárbara & Javier Aparicio, 2013. "The automatic normalisation challenge: detailed addresses identification," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(3), pages 953-966, June.
    18. Elio Atenógenes Villaseñor & Ricardo Arencibia-Jorge & Humberto Carrillo-Calvet, 2017. "Multiparametric characterization of scientometric performance profiles assisted by neural networks: a study of Mexican higher education institutions," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(1), pages 77-104, January.
    19. Cova, Tânia F.G.G. & Jarmelo, Susana & Formosinho, Sebastião J. & de Melo, J. Sérgio Seixas & Pais, Alberto A.C.C., 2015. "Unsupervised characterization of research institutions with task-force estimation," Journal of Informetrics, Elsevier, vol. 9(1), pages 59-68.
    20. Bornmann, Lutz & Mutz, Rüdiger & Hug, Sven E. & Daniel, Hans-Dieter, 2011. "A multilevel meta-analysis of studies reporting correlations between the h index and 37 different h index variants," Journal of Informetrics, Elsevier, vol. 5(3), pages 346-359.
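
    The relatedness criterion stated at the top of this list (items that cite the same works and are cited by the same works) can be illustrated with a small sketch. This is not the procedure IDEAS actually uses, which the page does not describe. It is a minimal illustration that assumes relatedness is the number of shared references plus the number of shared citing items. The function names `related_scores` and `most_related`, the equal weighting of the two counts, and the toy citation map are all hypothetical.

```python
from collections import defaultdict


def related_scores(cites, target):
    """Score each other item by shared references plus shared citing items.

    `cites` maps an item id to the set of item ids it cites.
    Equal weighting of the two counts is an assumption of this sketch.
    """
    # Invert the citation map: item -> set of items that cite it.
    cited_by = defaultdict(set)
    for source, refs in cites.items():
        for ref in refs:
            cited_by[ref].add(source)

    items = set(cites) | set(cited_by)
    target_refs = cites.get(target, set())
    target_citers = cited_by.get(target, set())

    scores = {}
    for other in items - {target}:
        # Works cited by both the target and the other item.
        shared_refs = len(target_refs & cites.get(other, set()))
        # Works that cite both the target and the other item.
        shared_citers = len(target_citers & cited_by.get(other, set()))
        if shared_refs or shared_citers:
            scores[other] = shared_refs + shared_citers
    return scores


def most_related(cites, target, top=3):
    """Return up to `top` item ids, highest score first, ties broken by id."""
    scores = related_scores(cites, target)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top]


# Hypothetical toy citation map (not data from this page).
toy_cites = {
    "A": {"R1", "R2", "R3"},
    "B": {"R1", "R2"},
    "C": {"R3"},
    "D": {"A", "B"},
    "E": {"A", "C"},
}

print(most_related(toy_cites, "A"))
```

    In the toy map, item B shares two references with A and one citing item (D), while C shares one reference and one citing item (E), so B ranks ahead of C.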
