
Exploration of Topic Classification in the Tourism Field with Text Mining Technology—A Case Study of the Academic Journal Papers

Author

Listed:
  • I-Cheng Chang

    (Department of Environmental Engineering, National Ilan University, Yilan 260007, Taiwan)

  • Jeou-Shyan Horng

    (Department of Food and Beverage, Shih Chien University, Taipei 104336, Taiwan)

  • Chih-Hsing Liu

    (Department of Tourism Management, National Kaohsiung University of Science and Technology, Kaohsiung 811532, Taiwan)

  • Sheng-Fang Chou

    (Department of Hospitality Management, Ming Chuan University, Taoyuan 333321, Taiwan)

  • Tai-Yi Yu

    (Department of Risk Management and Insurance, Ming Chuan University, Taipei 111005, Taiwan)

Abstract

This study collects abstracts of SSCI tourism journal papers published between 2010 and 2019 from the Web of Science (WoS) database and uses a novel topic classification method to explore the vocabulary characteristics of the classified articles. The abstracts in the corpus are weighted by Term Frequency–Inverse Document Frequency (TF–IDF). A hierarchical K-means cluster analysis is then performed to classify the articles automatically, and co-word analysis techniques are used to show the characteristics of feature words for distinct clusters, titles, and the consistency of the classified articles. Based on the results for 5783 abstracts, the cluster analysis groups the articles into six categories: travel, culture, sustainability, model, behavior, and hotel. A cross-check method is applied to assess the consistency of the topic classifications, the titles and keywords of the three documents with the smallest distances in each category are listed, and a strategic diagram is used to present the features of the distinct categories.
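
The pipeline described in the abstract (TF–IDF weighting, K-means clustering, listing the documents with the smallest distances in each cluster, and co-word counting) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain K-means rather than the hierarchical variant the paper reports, one common TF–IDF formula chosen as an assumption, a deterministic farthest-first initialisation picked only for reproducibility, and six invented toy "abstracts" in place of the 5783 real ones. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Return (vocab, vectors): one L2-normalised TF-IDF vector per document."""
    tokens = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    df = Counter(t for doc in tokens for t in set(doc))
    # idf = ln(N / df) + 1 is one common variant (an assumption, not the paper's formula).
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iter=100):
    """Plain K-means with farthest-first initialisation; returns (labels, centroids)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(distance(v, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: distance(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def nearest_documents(vectors, labels, centroids, top=3):
    """For each cluster, indices of the `top` documents closest to its centroid."""
    result = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        result[c] = sorted(members, key=lambda i: distance(vectors[i], centroid))[:top]
    return result


def coword_counts(docs):
    """Count how many documents each unordered word pair co-occurs in."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc.lower().split())), 2))
    return pairs


# Invented toy abstracts, for illustration only.
docs = [
    "hotel service quality guest",
    "hotel guest satisfaction service",
    "guest loyalty hotel brand",
    "sustainable tourism environment policy",
    "environment impact sustainable development",
    "sustainable policy community environment",
]
vocab, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
for cluster, members in nearest_documents(vectors, labels, centroids).items():
    cluster_docs = [docs[i] for i in members]
    print(cluster, cluster_docs, coword_counts(cluster_docs).most_common(2))
```

On real abstracts, stop-word removal and stemming would normally precede the weighting step, and the number of clusters would have to be chosen by some criterion; neither is shown here.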

Suggested Citation

  • I-Cheng Chang & Jeou-Shyan Horng & Chih-Hsing Liu & Sheng-Fang Chou & Tai-Yi Yu, 2022. "Exploration of Topic Classification in the Tourism Field with Text Mining Technology—A Case Study of the Academic Journal Papers," Sustainability, MDPI, vol. 14(7), pages 1-21, March.
  • Handle: RePEc:gam:jsusta:v:14:y:2022:i:7:p:4053-:d:782341

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/14/7/4053/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/14/7/4053/
    Download Restriction: no

    References listed on IDEAS

    1. Dudensing, Rebekka M. & Hughes, David W. & Shields, Martin, 2011. "Perceptions of tourism promotion and business challenges: A survey-based comparison of tourism businesses and promotion organizations," Tourism Management, Elsevier, vol. 32(6), pages 1453-1462.
    2. Ian Sutherland & Kiattipoom Kiatkawsin, 2020. "Determinants of Guest Experience in Airbnb: A Topic Modeling Approach Using LDA," Sustainability, MDPI, vol. 12(8), pages 1-16, April.
    3. Ying Yang & Mingzhi Wu & Lei Cui, 2012. "Integration of three visualization methods based on co-word analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 90(2), pages 659-673, February.
    4. Mercedes Jiménez-García & José Ruiz-Chico & Antonio Rafael Peña-Sánchez & José Antonio López-Sánchez, 2020. "A Bibliometric Analysis of Sports Tourism and Sustainability (2002–2019)," Sustainability, MDPI, vol. 12(7), pages 1-18, April.
    5. Angels Niñerola & Maria-Victòria Sánchez-Rebull & Ana-Beatriz Hernández-Lara, 2019. "Tourism Research on Sustainability: A Bibliometric Analysis," Sustainability, MDPI, vol. 11(5), pages 1-17, March.
    6. Daoyan Guo & Hong Chen & Ruyin Long & Hui Lu & Qianyi Long, 2017. "A Co-Word Analysis of Organizational Constraints for Maintaining Sustainability," Sustainability, MDPI, vol. 9(10), pages 1-19, October.
    7. Nuria Rodríguez-López & M. Isabel Diéguez-Castrillón & Ana Gueimonde-Canto, 2019. "Sustainability and Tourism Competitiveness in Protected Areas: State of Art and Future Lines of Research," Sustainability, MDPI, vol. 11(22), pages 1-32, November.
    8. Christina Katsikari & Leonidas Hatzithomas & Thomas Fotiadis & Dimitrios Folinas, 2020. "Push and Pull Travel Motivation: Segmentation of the Greek Market for Social Media Marketing in Tourism," Sustainability, MDPI, vol. 12(11), pages 1-18, June.
    9. Xiang, Zheng & Du, Qianzhou & Ma, Yufeng & Fan, Weiguo, 2017. "A comparative analysis of major online review platforms: Implications for social media analytics in hospitality and tourism," Tourism Management, Elsevier, vol. 58(C), pages 51-65.
    10. Susan (Sixue) Jia, 2018. "Leisure Motivation and Satisfaction: A Text Mining of Yoga Centres, Yoga Consumers, and Their Interactions," Sustainability, MDPI, vol. 10(12), pages 1-17, November.
    11. de la Hoz-Correa, Andrea & Muñoz-Leiva, Francisco & Bakucz, Márta, 2018. "Past themes and future trends in medical tourism research: A co-word analysis," Tourism Management, Elsevier, vol. 65(C), pages 200-211.
    12. Cobo, M.J. & López-Herrera, A.G. & Herrera-Viedma, E. & Herrera, F., 2011. "An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the Fuzzy Sets Theory field," Journal of Informetrics, Elsevier, vol. 5(1), pages 146-166.

    Citations

    Cited by:

    1. Nala Alahmari & Rashid Mehmood & Ahmed Alzahrani & Tan Yigitcanlar & Juan M. Corchado, 2023. "Autonomous and Sustainable Service Economies: Data-Driven Optimization of Design and Operations through Discovery of Multi-Perspective Parameters," Sustainability, MDPI, vol. 15(22), pages 1-44, November.
    2. Barbara Mazza, 2023. "A Theoretical Model of Strategic Communication for the Sustainable Development of Sport Tourism," Sustainability, MDPI, vol. 15(9), pages 1-19, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Weisheng Chiu & Thomas Chun Man Fan & Sang-Back Nam & Ping-Hung Sun, 2021. "Knowledge Mapping and Sustainable Development of eSports Research: A Bibliometric and Visualized Analysis," Sustainability, MDPI, vol. 13(18), pages 1-17, September.
    2. Yucheng Zhang & Zhiling Wang & Lin Xiao & Lijun Wang & Pei Huang, 2023. "Discovering the evolution of online reviews: A bibliometric review," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-22, December.
    3. Batista-Canino, Rosa M. & Santana-Hernández, Lidia & Medina-Brito, Pino, 2024. "A holistic literature review on entrepreneurial Intention: A scientometric approach," Journal of Business Research, Elsevier, vol. 174(C).
    4. Paúl Carrión-Mero & Néstor Montalván-Burbano & Fernando Morante-Carballo & Adolfo Quesada-Román & Boris Apolo-Masache, 2021. "Worldwide Research Trends in Landslide Science," IJERPH, MDPI, vol. 18(18), pages 1-24, September.
    5. Susan (Sixue) Jia, 2021. "Analyzing Restaurant Customers’ Evolution of Dining Patterns and Satisfaction during COVID-19 for Sustainable Business Insights," Sustainability, MDPI, vol. 13(9), pages 1-15, April.
    6. Oscar Morell-Santandreu & Cristina Santandreu-Mascarell & Julio García-Sabater, 2020. "Sustainability and Kaizen: Business Model Trends in Healthcare," Sustainability, MDPI, vol. 12(24), pages 1-28, December.
    7. Wang, Chao & Lim, Ming K & Zhao, Longfeng & Tseng, Ming-Lang & Chien, Chen-Fu & Lev, Benjamin, 2020. "The evolution of Omega-The International Journal of Management Science over the past 40 years: A bibliometric overview," Omega, Elsevier, vol. 93(C).
    8. Labib, Tahmid & Mustafa, Saadman Sakib & Khan, Abdul Mahidud, 2022. "Bibliometric Analysis on Tourism in Bangladesh," MPRA Paper 117365, University Library of Munich, Germany.
    9. Shome, Samik & Hassan, M. Kabir & Verma, Sushma & Panigrahi, Tushar Ranjan, 2023. "Impact investment for sustainable development: A bibliometric analysis," International Review of Economics & Finance, Elsevier, vol. 84(C), pages 770-800.
    10. Barbara Mazza, 2023. "A Theoretical Model of Strategic Communication for the Sustainable Development of Sport Tourism," Sustainability, MDPI, vol. 15(9), pages 1-19, April.
    11. Seyedmohammadreza Hosseini & Hamed Baziyad & Rasoul Norouzi & Sheida Jabbedari Khiabani & Győző Gidófalvi & Amir Albadvi & Abbas Alimohammadi & Seyedehsan Seyedabrishami, 2021. "Mapping the intellectual structure of GIS-T field (2008–2019): a dynamic co-word analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 2667-2688, April.
    12. Mostafa, Mohamed M., 2022. "Five decades of catastrophe theory research: Geographical atlas, knowledge structure and historical roots," Chaos, Solitons & Fractals, Elsevier, vol. 159(C).
    13. Matteo Lascialfari & Marie-Benoît Magrini & Guillaume Cabanac, 2022. "Unpacking research lock-in through a diachronic analysis of topic cluster trajectories in scholarly publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6165-6189, November.
    14. Seungju Nam & Hyun Cheol Lee, 2019. "A Text Analytics-Based Importance Performance Analysis and Its Application to Airline Service," Sustainability, MDPI, vol. 11(21), pages 1-24, November.
    15. Luis Miguel López-Bonilla & María del Carmen Reyes-Rodríguez & Jesús Manuel López-Bonilla, 2020. "Golf Tourism and Sustainability: Content Analysis and Directions for Future Research," Sustainability, MDPI, vol. 12(9), pages 1-18, April.
    16. Ramon Saura, Jose & Reyes-Menendez, Ana & Palos-Sanchez, Pedro & Filipe, Ferrão, 2019. "Discovering UGC Communities To Drive Marketing Strategies: Leveraging Data Visualization," Journal of Tourism, Sustainability and Well-being, Cinturs - Research Centre for Tourism, Sustainability and Well-being, University of Algarve, vol. 7(3), pages 261-272.
    17. Muñoz Leiva, Francisco & Rodríguez López, María Eugenia & García Martí, Bárbara, 2022. "Discovering prominent themes of the application of eye tracking technology in marketing research," Cuadernos de Gestión, Universidad del País Vasco - Instituto de Economía Aplicada a la Empresa (IEAE).
    18. Cristina Bernini & Silvia Emili & Laura Vici, 2021. "Are mass tourists sensitive to sustainability?," Tourism Economics, vol. 27(7), pages 1375-1397, November.
    19. Alina-Cerasela Aluculesei & Puiu Nistoreanu & Daniel Avram & Bogdan Gabriel Nistoreanu, 2021. "Past and Future Trends in Medical Spas: A Co-Word Analysis," Sustainability, MDPI, vol. 13(17), pages 1-20, August.
    20. Deming Lin & Tianhui Gong & Wenbin Liu & Martin Meyer, 2020. "An entropy-based measure for the evolution of h index research," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2283-2298, December.
