IDEAS home Printed from https://ideas.repec.org/a/zna/indecs/v21y2023i6p607-622.html

An Example of the Consistency Analysis of the Classification of Textual Materials by the Analyst and using the Naïve Bayesian Classifier

Author

Listed:
  • Josip Jezovita

    (Catholic University of Croatia, Zagreb, Croatia)

  • Mateja Plenkovic

    (Catholic University of Croatia, Zagreb, Croatia)

  • Nika Djuho

    (Catholic University of Croatia, Zagreb, Croatia)

Abstract

Sentiment analysis is a particular form of content analysis, and its application has become popular with the growth of Internet platforms where a wide range of content is generated. Today, various classifiers are used for sentiment analysis, and in this article, we show an example of using a Naïve Bayesian classifier. The aim is to examine the consistency of classifying textual materials into a positive, negative or neutral tone by analysts and the Bayesian algorithm. The hypotheses are that the agreement between the two ways of classifying textual materials increases as (1) the complexity of the formulations and (2) the size of the learning datasets increase. Based on the results, both hypotheses were accepted, but only for certain groups of messages. Increasing the size of the learning datasets and increasing the complexity of the formulations improved the classification accuracy for messages in a positive tone, while the classification accuracy for messages in other tones was high and equal regardless of how the parameters were varied. Correlation analysis showed a high positive correlation between the tones assigned by the Bayesian algorithm and the tones the analyst determined (r = 0.816).
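
As a rough illustration of the kind of comparison the abstract describes, the following is a minimal sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing that assigns one of three tones and is then compared with analyst-assigned labels. It is not the authors' implementation: the record does not state which software, features, or smoothing they used. The function names (`train`, `classify`, `agreement`), the whitespace tokenization, and the toy messages are all assumptions made for illustration. The sketch also reports a simple proportion of matching labels, not the correlation coefficient given in the abstract.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def classify(model, text):
    """Return the label with the highest log posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior for the tone.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, label) pairs above zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def agreement(model, labelled):
    """Share of messages where the classifier matches the analyst's label."""
    hits = sum(classify(model, text) == label for text, label in labelled)
    return hits / len(labelled)

# Hypothetical toy data; the labels stand in for the analyst's judgments.
TRAIN = [
    ("great service very happy", "positive"),
    ("love this wonderful product", "positive"),
    ("terrible service very angry", "negative"),
    ("hate this awful product", "negative"),
    ("the product arrived today", "neutral"),
    ("service opens at nine", "neutral"),
]
HELD_OUT = [
    ("wonderful happy", "positive"),
    ("awful angry", "negative"),
    ("arrived today", "neutral"),
]

model = train(TRAIN)
print(agreement(model, HELD_OUT))
```

Under these assumptions, the effect of the learning dataset size can be explored by calling `train` on progressively larger slices of the labelled messages and recomputing `agreement` for each tone separately.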

Suggested Citation

  • Josip Jezovita & Mateja Plenkovic & Nika Djuho, 2023. "An Example of the Consistency Analysis of the Classification of Textual Materials by the Analyst and using the Naïve Bayesian Classifier," Interdisciplinary Description of Complex Systems - scientific journal, Croatian Interdisciplinary Society Provider Homepage: http://indecs.eu, vol. 21(6), pages 607-622.
  • Handle: RePEc:zna:indecs:v:21:y:2023:i:6:p:607-622
    Download full text from publisher

    File URL: https://www.indecs.eu/2023/indecs2023-pp607-622.pdf
    Download Restriction: no

    More about this item

    Keywords

    content analysis; sentiment analysis; naïve Bayes classifier;

    JEL classification:

    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
