Printed from https://ideas.repec.org/p/mil/wpdepa/2011-36.html

Robust analysis of bibliometric data

Authors

  • Francesca DE BATTISTI
  • Silvia SALINI

Abstract

The aim of this work is to reconstruct the research profile of Italian statisticians by querying bibliometric databases. We highlight the need for multiple sources in order to convey a truer picture, and show how the data can be combined into a classification or an index of overall productivity that takes all sources and metrics into account. The data matrix contains, for each author, a set of metrics from a variety of databases; it is sparse (there are many zeros). Furthermore, the variables are leptokurtic and positively skewed. To apply the classical techniques of multivariate analysis, the data must first be transformed; alternatively, robust analysis techniques must be used. In the paper we focus on this type of bibliometric data, describing their main characteristics and problems, and present a robust approach to their analysis.
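
The abstract notes that the metrics are sparse, leptokurtic, and positively skewed, so they must be transformed or handled robustly. As a minimal illustration only, and not the authors' method (the abstract does not name the robust technique used), the Python sketch below computes moment-based skewness and excess kurtosis for a hypothetical citation-count vector, applies a log(1 + x) transform, and flags outliers with median/MAD z-scores. The data, the function names, and the 3.5 cutoff are all assumptions made for the example.

```python
import math
import statistics


def skewness(xs):
    """Moment-based skewness; positive values indicate a long right tail."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3; positive values indicate leptokurtosis."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0


def robust_z(xs):
    """Z-scores based on the median and the MAD instead of mean and sd.

    The factor 1.4826 makes the MAD consistent with the standard
    deviation under normality.
    """
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        raise ValueError("MAD is zero; too many tied values for this scale")
    return [(x - med) / (1.4826 * mad) for x in xs]


# Hypothetical citation counts for ten authors (illustrative, not real data).
citations = [0, 0, 0, 1, 2, 3, 5, 8, 13, 250]

# Option 1: transform first. log(1 + x) is defined at zero and compresses
# the right tail, so it suits sparse count data.
logged = [math.log1p(c) for c in citations]

# Option 2: keep the raw scale but use robust location and scale.
CUTOFF = 3.5  # assumed threshold for flagging an observation
outliers = [i for i, z in enumerate(robust_z(citations)) if abs(z) > CUTOFF]

print("skewness raw / logged:", skewness(citations), skewness(logged))
print("excess kurtosis raw:", excess_kurtosis(citations))
print("flagged author indices:", outliers)
```

When more than half of the values in a column are zero, the MAD itself is zero and the robust z-score is undefined, which is why the sketch raises an error in that case rather than dividing by zero; very sparse columns would need a different robust scale.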

Suggested Citation

  • Francesca DE BATTISTI & Silvia SALINI, 2011. "Robust analysis of bibliometric data," Departmental Working Papers 2011-36, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
  • Handle: RePEc:mil:wpdepa:2011-36

    Download full text from publisher

    File URL: http://wp.demm.unimi.it/files/wp/2011/DEMM-2011_036wp.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Marco Geraci & M. Degli Esposti, 2011. "Where do Italian universities stand? An in-depth statistical analysis of national and international rankings," Scientometrics, Springer;Akadémiai Kiadó, vol. 87(3), pages 667-681, June.
    2. Baccini, Alberto & Barabesi, Lucio, 2011. "Seats at the table: The network of the editorial boards in information and library science," Journal of Informetrics, Elsevier, vol. 5(3), pages 382-391.
    3. Filzmoser, Peter & Maronna, Ricardo & Werner, Mark, 2008. "Outlier identification in high dimensions," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1694-1711, January.
    4. Giulia Rivellini & Ester Rizzi & Susanna Zaccarin, 2006. "The science network in Italian population research: An analysis according to the social network perspective," Scientometrics, Springer;Akadémiai Kiadó, vol. 67(3), pages 407-418, June.
    5. Atkinson, A.C. & Riani, M., 2007. "Exploratory tools for clustering multivariate data," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 272-285, September.

    Citations

    Cited by:

    1. Andrea Cerioli & Domenico Perrotta, 2014. "Robust clustering around regression lines with high density regions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 5-26, March.
    2. Cerioli, Andrea & Farcomeni, Alessio & Riani, Marco, 2014. "Strong consistency and robustness of the Forward Search estimator of multivariate location and scatter," Journal of Multivariate Analysis, Elsevier, vol. 126(C), pages 167-183.
    3. Lorna Wildgaard, 2015. "A comparison of 17 author-level bibliometric indicators for researchers in Astronomy, Environmental Science, Philosophy and Public Health in Web of Science and Google Scholar," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 873-906, September.
    4. Waleed M. Sweileh & Sa’ed H. Zyoud & Samah W. Al-Jabi & Ansam F. Sawalha, 2014. "Bibliometric analysis of diabetes mellitus research output from Middle Eastern Arab countries during the period (1996–2012)," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 819-832, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francesca De Battisti & Silvia Salini, 2013. "Robust analysis of bibliometric data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(2), pages 269-283, June.
    2. G. Zioutas & C. Chatzinakos & T. D. Nguyen & L. Pitsoulis, 2017. "Optimization techniques for multivariate least trimmed absolute deviation estimation," Journal of Combinatorial Optimization, Springer, vol. 34(3), pages 781-797, October.
    3. repec:cte:wsrepe:ws1450804 is not listed on IDEAS
    4. Manuel Goyanes & Luis de-Marcos, 2020. "Academic influence and invisible colleges through editorial board interlocking in communication sciences: a social network analysis of leading journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 791-811, May.
    5. Guillaume Cabanac, 2012. "Shaping the landscape of research in information systems from the perspective of editorial boards: A scientometric study of 77 leading journals," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(5), pages 977-996, May.
    6. Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
    7. Van Aelst, S. & Vandervieren, E. & Willems, G., 2012. "A Stahel–Donoho estimator based on huberized outlyingness," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 531-542.
    8. Viviana Egidi & Giulia Rivellini & Michele Antonio Salvatore & Silvia D'Angelo, 2018. "A network approach to studying cause-of-death interrelations," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 38(16), pages 373-400.
    9. Chung, Hee Cheol & Ahn, Jeongyoun, 2021. "Subspace rotations for high-dimensional outlier detection," Journal of Multivariate Analysis, Elsevier, vol. 183(C).
    10. Luigi Aldieri & Gennaro Guida & Maxim Kotsemir & Concetto Paolo Vinci, 2019. "An investigation of impact of research collaboration on academic performance in Italy," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(4), pages 2003-2040, July.
    11. Jan Kalina & Jan Tichavský, 2022. "The minimum weighted covariance determinant estimator for high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(4), pages 977-999, December.
    12. Marco Riani & Anthony C. Atkinson & Andrea Cerioli, 2009. "Finding an unknown number of multivariate outliers," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 447-466, April.
    13. P. Navarro-Esteban & J. A. Cuesta-Albertos, 2021. "High-dimensional outlier detection using random projections," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(4), pages 908-934, December.
    14. D. Rosadi & P. Filzmoser, 2019. "Robust second-order least-squares estimation for regression models with autoregressive errors," Statistical Papers, Springer, vol. 60(1), pages 105-122, February.
    15. Boente, Graciela & Pires, Ana M. & Rodrigues, Isabel M., 2010. "Detecting influential observations in principal components and common principal components," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 2967-2975, December.
    16. Jack Jewson & David Rossell, 2022. "General Bayesian loss function selection and the use of improper models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1640-1665, November.
    17. Giovanni Abramo & Ciriaco Andrea D’Angelo & Flavia Costa, 2019. "A gender analysis of top scientists’ collaboration behavior: evidence from Italy," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 405-418, August.
    18. Ayanendranath Basu & Abhik Ghosh & Maria Jaenada & Leandro Pardo, 2024. "Robust adaptive LASSO in high-dimensional logistic regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 33(5), pages 1217-1249, November.
    19. Erkuş, Ekin Can & Purutçuoğlu, Vilda, 2021. "Outlier detection and quasi-periodicity optimization algorithm: Frequency domain based outlier detection (FOD)," European Journal of Operational Research, Elsevier, vol. 291(2), pages 560-574.
    20. Luis de-Marcos & Manuel Goyanes & Adrián Domínguez-Díaz, 2024. "Mapping science through editorial board interlocking: connections and distance between fields of knowledge and institutional affiliations," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 3385-3406, June.
    21. Radek Zdeněk & Jana Lososová, 2018. "An analysis of editorial board members’ publication output in agricultural economics and policy journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 563-578, October.
