Word co-occurrences on Webpages as a measure of the relatedness of organizations: A new Webometrics concept

Author

Listed:
  • Vaughan, Liwen
  • You, Justin

Abstract

Web hyperlink analysis has been a key topic of Webometric research. However, inlink data collection from commercial search engines has been limited to only one source in recent years, which is not a promising prospect for the future development of the field. We need to tap into other Web data sources and to develop new methods. Toward this end, we propose a new Webometrics concept that is based on words rather than inlinks on Webpages. We propose that word co-occurrences on Webpages can be a measure of the relatedness of organizations. Word co-occurrence data can be collected from both general search engines and blog search engines, which expands data sources greatly. The proposed concept is tested in a group of companies in the LTE and WiMax sectors of the telecommunications industry. Data on the co-occurrences of company names on Webpages were collected from Google and Google Blog. The co-occurrence matrices were analyzed using MDS. The resulting MDS maps were compared with industry reality and with the MDS maps from co-link analysis. Results show that Web co-word analysis could potentially be as useful as Web co-link analysis. Google Blog seems to be a better source than Google for co-word data collection.
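
The analysis pipeline the abstract describes (company-name co-occurrence counts, then a co-occurrence matrix, then an MDS map) can be illustrated with a minimal sketch. Everything specific below is an assumption rather than the authors' procedure: the abstract does not say how counts were normalized or which MDS variant was used, so this sketch swaps in a Jaccard-style normalization and classical (Torgerson) MDS. The company labels "A" to "D" and all counts are invented toy values, not data from the paper.

```python
import numpy as np

def jaccard_similarity(cooc, totals):
    """Normalize raw co-occurrence counts to [0, 1] (assumed normalization).

    cooc[i, j] = number of pages mentioning both name i and name j (symmetric);
    totals[i]  = number of pages mentioning name i.
    """
    cooc = np.asarray(cooc, dtype=float)
    totals = np.asarray(totals, dtype=float)
    union = totals[:, None] + totals[None, :] - cooc
    sim = np.divide(cooc, union, out=np.zeros_like(cooc), where=union > 0)
    np.fill_diagonal(sim, 1.0)  # each name is fully similar to itself
    return sim

def classical_mds(dissim, n_components=2):
    """Classical (Torgerson) MDS: place points so that their Euclidean
    distances approximate the given dissimilarities."""
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering  # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy values invented for illustration only (hypothetical companies A-D).
names = ["A", "B", "C", "D"]
totals = [100, 120, 90, 110]
cooc = np.array([
    [0, 60, 5, 4],
    [60, 0, 6, 5],
    [5, 6, 0, 55],
    [4, 5, 55, 0],
])

sim = jaccard_similarity(cooc, totals)
coords = classical_mds(1.0 - sim)  # dissimilarity = 1 - similarity
for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

In this toy input, A and B co-occur often, as do C and D, so each pair should land closer together on the map than names from different pairs. Comparing such a map against known industry groupings is the kind of check the abstract reports.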

Suggested Citation

  • Vaughan, Liwen & You, Justin, 2010. "Word co-occurrences on Webpages as a measure of the relatedness of organizations: A new Webometrics concept," Journal of Informetrics, Elsevier, vol. 4(4), pages 483-491.
  • Handle: RePEc:eee:infome:v:4:y:2010:i:4:p:483-491
    DOI: 10.1016/j.joi.2010.04.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157710000386
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations



    Cited by:

    1. Yang, Chao & Huang, Cui & Su, Jun, 2018. "An improved SAO network-based method for technology trend analysis: A case study of graphene," Journal of Informetrics, Elsevier, vol. 12(1), pages 271-286.
    2. Thelwall, Mike & Sud, Pardeep, 2012. "Webometric research with the Bing Search API 2.0," Journal of Informetrics, Elsevier, vol. 6(1), pages 44-52.
    3. Patrick Kenekayoro & Kevan Buckley & Mike Thelwall, 2015. "Clustering research group website homepages," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2023-2039, March.
    4. Rongying Zhao & Bikun Chen, 2014. "Applying author co-citation analysis to user interaction analysis: a case study on instant messaging groups," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 985-997, November.
    5. David Gunnarsson Lorentzen, 2014. "Webometrics benefitting from web mining? An investigation of methods and applications of two research fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(2), pages 409-445, May.
    6. Liwen Vaughan & Esteban Romero-Frías, 2012. "Exploring Web keyword analysis as an alternative to link analysis: a multi-industry case," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(1), pages 217-232, October.
    7. Krzysztof Janc, 2015. "Visibility and Connections among Cities in Digital Space," Journal of Urban Technology, Taylor & Francis Journals, vol. 22(4), pages 3-21, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. van Eck, N.J.P. & Waltman, L., 2009. "How to Normalize Co-Occurrence Data? An Analysis of Some Well-Known Similarity Measures," ERIM Report Series Research in Management ERS-2009-001-LIS, Erasmus Research Institute of Management (ERIM).
    2. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
    3. Jose Luis Ortega & Isidro Aguillo & Viv Cothey & Andrea Scharnhorst, 2008. "Maps of the academic web in the European Higher Education Area — an exploration of visual web indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 74(2), pages 295-308, February.
    4. Loet Leydesdorff & Dieter Franz Kogler & Bowen Yan, 2017. "Mapping patent classifications: portfolio and statistical analysis, and the comparison of strengths and weaknesses," Scientometrics, Springer;Akadémiai Kiadó, vol. 112(3), pages 1573-1591, September.
    5. Thelwall, Mike & Sud, Pardeep, 2012. "Webometric research with the Bing Search API 2.0," Journal of Informetrics, Elsevier, vol. 6(1), pages 44-52.
    6. Simone Belli & Carlos Gonzalo-Penela, 2020. "Science, research, and innovation infospheres in Google results of the Ibero-American countries," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 635-653, May.
    7. Raphaël Maucuer & Alexandre Renaud & Sébastien Ronteau & Laurent Muzellec, 2022. "What can we learn from marketers? A bibliometric analysis of the marketing literature on business model research," Post-Print hal-03718522, HAL.
    8. Jesper W. Schneider & Birger Larsen & Peter Ingwersen, 2009. "A comparative study of first and all-author co-citation counting, and two different matrix generation approaches applied for author co-citation analyses," Scientometrics, Springer;Akadémiai Kiadó, vol. 80(1), pages 103-130, July.
    9. Jimi Adams & Ryan Light, 2014. "Mapping Interdisciplinary Fields: Efficiencies, Gaps and Redundancies in HIV/AIDS Research," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-13, December.
    10. Georg Groh & Christoph Fuchs, 2011. "Multi-modal social networks for modeling scientific fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(2), pages 569-590, November.
    11. Wei-Feng Tung & Ting-Yu Lee, 2013. "Rank-mediated collaborative tagging recommendation service using video-tag relationship prediction," Information Systems Frontiers, Springer, vol. 15(4), pages 627-635, September.
    12. Hao Wang & Sanhong Deng & Xinning Su, 2016. "A study on construction and analysis of discipline knowledge structure of Chinese LIS based on CSSCI," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1725-1759, December.
    13. Judit Bar-Ilan, 2001. "Data collection methods on the Web for infometric purposes — A review and analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 50(1), pages 7-32, January.
    14. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    15. Carlos Sánchez‐Camacho & Rocío Carranza & David Martín‐Consuegra & Estrella Díaz, 2022. "Evolution, trends and future research lines in corporate social responsibility and tourism: A bibliometric analysis and science mapping," Sustainable Development, John Wiley & Sons, Ltd., vol. 30(3), pages 462-476, June.
    16. Shakibian, Hadi & Charkari, Nasrollah Moghadam, 2018. "Statistical similarity measures for link prediction in heterogeneous complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 501(C), pages 248-263.
    17. Zhao, Dangzhi & Strotmann, Andreas, 2008. "Comparing all-author and first-author co-citation analyses of information science," Journal of Informetrics, Elsevier, vol. 2(3), pages 229-239.
    18. Guangtong Li & L. Siddharth & Jianxi Luo, 2023. "Embedding knowledge graph of patent metadata to measure knowledge proximity," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(4), pages 476-490, April.
    19. Wolfram, Dietmar & Zhao, Yuehua, 2014. "A comparison of journal similarity across six disciplines using citing discipline analysis," Journal of Informetrics, Elsevier, vol. 8(4), pages 840-853.
    20. Chaoqun Ni & Cassidy R. Sugimoto & Jiepu Jiang, 2013. "Venue-author-coupling: A measure for identifying disciplines through author communities," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 64(2), pages 265-279, February.

