
Overlap in bibliographic databases

Author

Listed:
  • William W. Hood
  • Concepción S. Wilson

Abstract

Bibliographic databases contain surrogates for a particular subset of the complete set of literature; some databases are very narrow in scope, while others are multidisciplinary. These databases overlap in their coverage of the literature to a greater or lesser extent. The topic of Fuzzy Set Theory is examined to determine the overlap of coverage in the databases that index this topic. About 63% of records in the dataset are unique to a single database; the remaining 37% are duplicated in two to 12 databases. The overlap distribution is found to conform to a Lotka‐type plot. The records with maximum overlap are identified; however, further work is needed to determine the significance of the high level of overlap in these records. The unique records are plotted using a Bradford‐type form of data presentation and are found to conform (visually) to a hyperbolic distribution. The extent and causes of intra‐database duplication (records duplicated within a single database) are also examined. Finally, the overlap among the top databases in the dataset was examined, and a high correlation was found between overlapping records and overlapping DIALOG OneSearch categories.
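The kind of overlap count the abstract reports (the share of records found in only one database versus those found in several) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the database names, record keys, and function names below are hypothetical, and the sketch assumes that duplicate records have already been matched to a common key, which in practice is the hard part.

```python
from collections import Counter


def overlap_distribution(db_records):
    """Return a Counter mapping k -> number of distinct records
    that appear in exactly k databases.

    db_records: dict of database name -> list of record keys
    (hypothetical input shape, assumed for this sketch).
    """
    membership = Counter()
    for records in db_records.values():
        # set() collapses intra-database duplicates so that each
        # database contributes at most one count per record.
        membership.update(set(records))
    return Counter(membership.values())


def unique_share(distribution):
    """Fraction of distinct records found in exactly one database."""
    total = sum(distribution.values())
    return distribution[1] / total if total else 0.0


# Toy data, invented for illustration only (not from the study).
toy = {
    "DB_A": ["r1", "r2", "r3", "r3"],  # "r3" is duplicated within DB_A
    "DB_B": ["r2", "r4"],
    "DB_C": ["r2", "r3", "r5"],
}

dist = overlap_distribution(toy)
for k in sorted(dist):
    print(f"records in exactly {k} database(s): {dist[k]}")
print(f"share unique to one database: {unique_share(dist):.0%}")
```

Plotting the counts in `dist` against `k` on log-log axes would give the kind of view in which a Lotka-type pattern could be inspected visually.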

Suggested Citation

  • William W. Hood & Concepción S. Wilson, 2003. "Overlap in bibliographic databases," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 54(12), pages 1091-1103, October.
  • Handle: RePEc:bla:jamist:v:54:y:2003:i:12:p:1091-1103
    DOI: 10.1002/asi.10301

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.10301
    Download Restriction: no


    Citations


    Cited by:

    1. Mehmet Ali Abdulhayoglu & Bart Thijs, 2018. "Use of locality sensitive hashing (LSH) algorithm to match Web of Science and Scopus," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 1229-1245, August.
    2. William W. Hood & Concepción S. Wilson, 2003. "Informetric studies using databases: Opportunities and challenges," Scientometrics, Springer;Akadémiai Kiadó, vol. 58(3), pages 587-608, November.
    3. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
