
Topic Modeling for Analyzing Open-Ended Survey Responses

Authors

  • Pietsch, Andra-Selina
  • Lessmann, Stefan

Abstract

Open-ended responses are widely used in market research studies. Processing of such responses requires labor-intensive human coding. This paper focuses on unsupervised topic models and tests their ability to automate the analysis of open-ended responses. Since state-of-the-art topic models struggle with the shortness of open-ended responses, the paper considers three novel short text topic models: Latent Feature Latent Dirichlet Allocation, Biterm Topic Model and Word Network Topic Model. The models are fitted and evaluated on a set of real-world open-ended responses provided by a market research company. Multiple components such as topic coherence and document classification are quantitatively and qualitatively evaluated to appraise whether topic models can replace human coding. The results suggest that topic models are a viable alternative for open-ended response coding. However, their usefulness is limited when a correct one-to-one mapping of responses and topics or the exact topic distribution is needed.

Suggested Citation

  • Pietsch, Andra-Selina & Lessmann, Stefan, 2018. "Topic Modeling for Analyzing Open-Ended Survey Responses," IRTG 1792 Discussion Papers 2018-054, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
  • Handle: RePEc:zbw:irtgdp:2018054

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/230765/1/irtg1792dp2018-054.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Grün, Bettina & Hornik, Kurt, 2011. "topicmodels: An R Package for Fitting Topic Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i13).
    2. Margaret E. Roberts & Brandon M. Stewart & Dustin Tingley & Christopher Lucas & Jetson Leder‐Luis & Shana Kushner Gadarian & Bethany Albertson & David G. Rand, 2014. "Structural Topic Models for Open‐Ended Survey Responses," American Journal of Political Science, John Wiley & Sons, vol. 58(4), pages 1064-1082, October.

    Citations

    Cited by:

    1. Gaurav, Kumar & Ghosh, Sayantari & Bhattacharya, Saumik & Singh, Yatindra Nath, 2019. "Ensuring the Spread of Referral Marketing Campaigns: A Quantitative Treatment," SocArXiv 6spnr, Center for Open Science.
    2. Ziwen Liu & Scott Allan Orr & Pakhee Kumar & Josep Grau-Bove, 2023. "Measuring the impact of COVID-19 on heritage sites in the UK using social media data," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-13, December.
    3. Yatracos, Yannis G., 2018. "Residual's Influence Index (RINFIN), Bad Leverage and Unmasking in High Dimensional L2-Regression," IRTG 1792 Discussion Papers 2018-060, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    4. Evgeny Nikulchev & Dmitry Ilin & Anastasiya Silaeva & Pavel Kolyasnikov & Vladimir Belov & Andrey Runtov & Pavel Pushkin & Nikolay Laptev & Anna Alexeenko & Shamil Magomedov & Alexander Kosenkov & Ily, 2020. "Digital Psychological Platform for Mass Web-Surveys," Data, MDPI, vol. 5(4), pages 1-16, October.
    5. Tobias Wekhof & Sébastien Houde, 2023. "Using narratives to infer preferences in understanding the energy efficiency gap," Nature Energy, Nature, vol. 8(9), pages 965-977, September.
    6. Valter Martins Vairinhos & Luís Agonia Pereira & Florinda Matos & Helena Nunes & Carmen Patino & Purificación Galindo-Villardón, 2022. "Framework for Classroom Student Grading with Open-Ended Questions: A Text-Mining Approach," Mathematics, MDPI, vol. 10(21), pages 1-20, November.
    7. Yen, Ju-Chun & Wang, Tawei, 2021. "Stock price relevance of voluntary disclosures about blockchain technology and cryptocurrencies," International Journal of Accounting Information Systems, Elsevier, vol. 40(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ulrich Fritsche & Johannes Puckelwald, 2018. "Deciphering Professional Forecasters’ Stories - Analyzing a Corpus of Textual Predictions for the German Economy," Macroeconomics and Finance Series 201804, University of Hamburg, Department of Socioeconomics.
    2. Sanders, James & Lisi, Giulio & Schonhardt-Bailey, Cheryl, 2018. "Themes and topics in parliamentary oversight hearings: a new direction in textual data analysis," LSE Research Online Documents on Economics 87624, London School of Economics and Political Science, LSE Library.
    3. Sandra Wankmüller, 2023. "A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis," Journal of Computational Social Science, Springer, vol. 6(1), pages 91-163, April.
    4. Savin, Ivan & Ott, Ingrid & Konop, Chris, 2022. "Tracing the evolution of service robotics: Insights from a topic modeling approach," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    5. Imran Ali & Devika Kannan, 2022. "Mapping research on healthcare operations and supply chain management: a topic modelling-based literature review," Annals of Operations Research, Springer, vol. 315(1), pages 29-55, August.
    6. Sumeet Sahay & Hemant Kumar Kaushik & Shikha Singh, 2023. "Discovering themes and trends in electricity supply chain area research," OPSEARCH, Springer;Operational Research Society of India, vol. 60(3), pages 1525-1560, September.
    7. Everett, Jeff & Shiraz Rahaman, Abu & Neu, Dean & Saxton, Gregory, 2024. "Letters to the editor, institutional experimentation, and the public accounting professional," Critical Perspectives on Accounting, Elsevier, vol. 99(C).
    8. Minchul Lee & Min Song, 2020. "Incorporating citation impact into analysis of research trends," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1191-1224, August.
    9. Marcel Fratzscher & Tobias Heidland & Lukas Menkhoff & Lucio Sarno & Maik Schmeling, 2023. "Foreign Exchange Intervention: A New Database," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(4), pages 852-884, December.
    10. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    11. Li Tang & Jennifer Kuzma & Xi Zhang & Xinyu Song & Yin Li & Hongxu Liu & Guangyuan Hu, 2023. "Synthetic biology and governance research in China: a 40-year evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5293-5310, September.
    12. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    13. Kübler, Raoul V. & Manke, Kai & Pauwels, Koen, 2025. "I like, I share, I vote: Mapping the dynamic system of political marketing," Journal of Business Research, Elsevier, vol. 186(C).
    14. Daoud, Adel & Kohl, Sebastian, 2016. "How much do sociologists write about economic topics? Using big data to test some conventional views in economic sociology, 1890 to 2014," MPIfG Discussion Paper 16/7, Max Planck Institute for the Study of Societies.
    15. Cardinale, Roberto & Cardinale, Ivano & Zupic, Ivan, 2024. "The EU's vulnerability to gas price and supply shocks: The role of mismatches between policy beliefs and changing international gas markets," Energy Economics, Elsevier, vol. 131(C).
    16. Shr-Wei Kao & Pin Luarn, 2020. "Topic Modeling Analysis of Social Enterprises: Twitter Evidence," Sustainability, MDPI, vol. 12(8), pages 1-20, April.
    17. Savin, Ivan & Drews, Stefan & van den Bergh, Jeroen, 2021. "Free associations of citizens and scientists with economic and green growth: A computational-linguistics analysis," Ecological Economics, Elsevier, vol. 180(C).
    18. Hsia-Ching Chang, 2016. "The Synergy of Scientometric Analysis and Knowledge Mapping with Topic Models: Modelling the Development Trajectories of Information Security and Cyber-Security Research," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 15(04), pages 1-33, December.
    19. Vishnu Baburajan & João de Abreu e Silva & Francisco Camara Pereira, 2022. "Open vs Closed-ended questions in attitudinal surveys -- comparing, combining, and interpreting using natural language processing," Papers 2205.01317, arXiv.org.
    20. Susumu Nagayama & Hitoshi Mitsuhashi, 2022. "Explosive and implosive root concepts: An analysis of music moods rooted by two influential rap artists," PLOS ONE, Public Library of Science, vol. 17(7), pages 1-25, July.

    More about this item

    JEL classification:

    • C00 - Mathematical and Quantitative Methods - - General - - - General
