
Prospecting the Effect of Topic Modeling in Information Retrieval

Authors

  • Aakanksha Sharaff (National Institute of Technology, Raipur, India)
  • Jitesh Kumar Dewangan (Samsung Research Institute, Noida, India)
  • Dilip Singh Sisodia (National Institute of Technology, Raipur, India)

Abstract

Enormous volumes of records and data are gathered every day, and organizing them is a challenging task. Topic modeling provides a way to categorize documents, but the high dimensionality of a corpus degrades the output of a topic model, which makes it important to apply feature selection or an information retrieval step for dimensionality reduction. Efficient topic modeling also requires removing unrelated words, which can otherwise produce spurious co-occurrences. This paper proposes a framework for generating topics with better coherence, in which term frequency-inverse document frequency (TF-IDF) and a parsimonious language model (PLM) are used for the information retrieval task. PLM extracts the important information and expels general words from the corpus, whereas TF-IDF re-estimates the weight of each word in the corpus. The work improves the topic coherence measure, yielding a better correlation between the actual topics and the topics generated from PLM.
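
The two weighting steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which is not given in this record. It assumes one common TF-IDF variant (relative term frequency times log(N/df)) and the usual EM formulation of a parsimonious language model, in which a document model is mixed with a fixed corpus model. The function names, the toy corpus, and the values `lam=0.1`, `iters=50`, and `threshold=1e-4` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def parsimonious_lm(doc, corpus_probs, lam=0.1, iters=50, threshold=1e-4):
    """Estimate P(t|D) by EM so that mass moves away from corpus-general terms.

    lam is the assumed mixture weight of the document model; terms whose
    probability falls below threshold are dropped and the rest renormalized.
    """
    tf = Counter(doc)
    total = sum(tf.values())
    p_doc = {t: c / total for t, c in tf.items()}  # start from the MLE
    for _ in range(iters):
        # E-step: share of each term's count explained by the document model
        e = {
            t: tf[t] * lam * p / ((1 - lam) * corpus_probs[t] + lam * p)
            for t, p in p_doc.items()
        }
        norm = sum(e.values())
        # M-step, then prune low-probability terms and renormalize
        p_doc = {t: v / norm for t, v in e.items() if v / norm >= threshold}
        kept = sum(p_doc.values())
        p_doc = {t: p / kept for t, p in p_doc.items()}
    return p_doc


# Toy corpus (hypothetical): "the" plays the role of a general word.
docs = [
    ["the", "topic", "model", "the"],
    ["the", "corpus", "the", "word"],
    ["the", "lda", "topic"],
]
counts = Counter(t for d in docs for t in d)
n_tokens = sum(counts.values())
corpus_probs = {t: c / n_tokens for t, c in counts.items()}

weights = tf_idf(docs)                         # "the" gets weight 0.0 in every document
plm = parsimonious_lm(docs[0], corpus_probs)   # mass shifts from "the" toward "model"
```

In this toy corpus "the" occurs in every document, so its TF-IDF weight is zero, and the parsimonious estimate for the first document assigns "the" less probability than its raw relative frequency while "model", which is specific to that document, gains probability. A topic model would then be fitted on the reduced or re-weighted vocabulary; that step is outside this sketch.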

Suggested Citation

  • Aakanksha Sharaff & Jitesh Kumar Dewangan & Dilip Singh Sisodia, 2021. "Prospecting the Effect of Topic Modeling in Information Retrieval," International Journal on Semantic Web and Information Systems (IJSWIS), IGI Global, vol. 17(3), pages 18-34, July.
  • Handle: RePEc:igg:jswis0:v:17:y:2021:i:3:p:18-34

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IJSWIS.2021070102
    Download Restriction: no