
Robust analysis of bibliometric data


  • Francesca DE BATTISTI


  • Silvia SALINI



This work aims to reconstruct the research profile of Italian statisticians as it emerges from querying bibliometric databases. We highlight the need for multiple sources to convey a more accurate picture, and how the data can be combined into a classification or an index of overall productivity that takes all sources and metrics into account. For each author, the data matrix contains a set of metrics drawn from a variety of databases; it is a sparse matrix (it contains many zeros). Furthermore, the variables are leptokurtic and positively skewed. To apply the classical techniques of multivariate analysis, the data must either be transformed first or be analysed with robust techniques instead. The paper focuses on this type of bibliometric data, describing its main characteristics and problems, and presents a robust approach to its analysis.
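The two options named in the abstract, transforming the data or using robust techniques, can be illustrated on a single variable. The following is a minimal sketch and not the authors' method: the citation counts are made up, and the function names, the log(1 + x) transform, and the median/MAD standardization are illustrative assumptions chosen because they are common ways of handling sparse, positively skewed counts.

```python
import math
import statistics


def skewness(xs):
    """Third standardized moment (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3; positive means leptokurtic."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def robust_z(xs):
    """Standardize with the median and the MAD instead of mean and sd.

    The factor 1.4826 makes the MAD consistent with the standard
    deviation under normality. If more than half of the values are
    identical (very sparse data), the MAD is zero and this scaling
    cannot be used.
    """
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        raise ValueError("MAD is zero; too many tied values")
    scale = 1.4826 * mad
    return [(x - med) / scale for x in xs]


# Hypothetical citation counts for 12 authors (made-up data).
citations = [0, 0, 0, 1, 2, 3, 5, 8, 13, 40, 120, 900]

# Option 1: transform first. log1p is defined at zero, which matters
# for a sparse matrix with many zero entries.
logged = [math.log1p(c) for c in citations]

# Option 2: keep the raw scale but use robust location and scale.
z = robust_z(citations)

print("skewness raw / logged:", skewness(citations), skewness(logged))
print("excess kurtosis raw:", excess_kurtosis(citations))
print("largest robust z-score:", max(z))
```

On data of this shape the transform reduces the skewness, while the robust scores keep the bulk of the authors on a comparable scale and make the extreme values stand out. A multivariate analysis would need robust estimates of location and scatter for the whole matrix, which this one-variable sketch does not attempt.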

Suggested Citation

  • Francesca DE BATTISTI & Silvia SALINI, 2011. "Robust analysis of bibliometric data," Departmental Working Papers 2011-36, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
  • Handle: RePEc:mil:wpdepa:2011-36






