
Topic Classification for Short Texts

In: Advances in Information Systems Development

Author

Listed:
  • Dan Claudiu Neagu

    (Cicada Technologies
    Babes-Bolyai University)

  • Andrei Bogdan Rus

    (Cicada Technologies
    Technical University)

  • Mihai Grec

    (Cicada Technologies)

  • Mihai Boroianu

    (Cicada Technologies)

  • Gheorghe Cosmin Silaghi

    (Babes-Bolyai University)

Abstract

In the context of TV and social media surveillance, constructing models to automate topic identification of short texts is a key task. This paper builds models worth considering for practical use, employing a Top-K multinomial classification methodology. We describe the full data processing pipeline, discussing dataset selection, text preprocessing, feature extraction, and model selection and learning, including hyperparameter optimization. We test and compare popular methods, including standard machine learning, deep learning, and a fine-tuned BERT model for topic classification.
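
The abstract does not define how Top-K classification is scored. A minimal sketch, assuming the common reading in which a prediction counts as correct when the true topic is among the k highest-scored classes, might look like the following. The topic names, scores, and function names are hypothetical and are not taken from the chapter.

```python
from typing import Sequence


def top_k_labels(scores: Sequence[float], labels: Sequence[str], k: int) -> list[str]:
    """Return the k labels with the highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in ranked[:k]]


def top_k_accuracy(
    all_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of texts whose true topic is among the k best-scored topics."""
    hits = sum(
        true in top_k_labels(scores, labels, k)
        for scores, true in zip(all_scores, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical topic set and per-class scores for three short texts
# (illustrative values only, not data from the chapter).
TOPICS = ["politics", "sports", "economy", "health"]
SCORES = [
    [0.50, 0.10, 0.30, 0.10],  # true topic: politics
    [0.20, 0.35, 0.40, 0.05],  # true topic: sports
    [0.40, 0.30, 0.20, 0.10],  # true topic: health
]
TRUE = ["politics", "sports", "health"]

if __name__ == "__main__":
    for k in (1, 2):
        print(f"top-{k} accuracy: {top_k_accuracy(SCORES, TOPICS, TRUE, k):.3f}")
```

Under this reading, raising k can only keep or increase the measured accuracy, which is why a Top-K setup can suit short texts whose topic is ambiguous between a few candidates.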

Suggested Citation

  • Dan Claudiu Neagu & Andrei Bogdan Rus & Mihai Grec & Mihai Boroianu & Gheorghe Cosmin Silaghi, 2023. "Topic Classification for Short Texts," Lecture Notes in Information Systems and Organization, in: Gheorghe Cosmin Silaghi & Robert Andrei Buchmann & Virginia Niculescu & Gabriela Czibula & Chris Bar (ed.), Advances in Information Systems Development, pages 207-222, Springer.
  • Handle: RePEc:spr:lnichp:978-3-031-32418-5_12
    DOI: 10.1007/978-3-031-32418-5_12

