Printed from https://ideas.repec.org/a/gam/jftint/v11y2019i9p200-d268742.html

Incorporating Background Checks with Sentiment Analysis to Identify Violence Risky Chinese Microblogs

Author

Listed:
  • Yun-Fei Jia

    (School of Electronic Engineering and Automation, Civil Aviation University of China, Tianjin 300300, China)

  • Shan Li

    (Honeywell Technology Solutions China, Beijing 100015, China)

  • Renbiao Wu

    (School of Electronic Engineering and Automation, Civil Aviation University of China, Tianjin 300300, China)

Abstract

With Web 2.0 technology, more and more people express their attitudes and opinions on the Internet. Radical ideas, rumors, terrorism, and violent content also propagate online, causing several incidents of social panic every year in China. In fact, most of this content is joking or emotional catharsis, so detecting it with conventional techniques usually incurs a high false alarm rate. To address this problem, this paper introduces a technique that combines sentiment analysis with background checks. State-of-the-art sentiment analysis usually depends on training datasets in a specific topic area. Unfortunately, for some domains, such as violence risk speech detection, there is no definitive training data. In particular, topic-independent sentiment analysis of short Chinese text has rarely been reported in the literature. In this paper, the violence risk of Chinese microblogs is calculated from multiple perspectives. First, a lexicon-based method is used to retrieve violence-related microblogs, and a similarity-based method is then used to extract sentiment words. Semantic rules and emoticons are employed to obtain the sentiment polarity and sentiment strength of short texts. Second, the activity risk is calculated from the characteristics of the part-of-speech (PoS) sequence and from semantic rules, and a threshold is then set to capture the key users. Finally, the risk is confirmed by the key users' historical posts and the opinions of their friend circles. The experimental results show that the proposed approach outperforms the support vector machine (SVM) method on a topic-independent corpus and can effectively reduce the false alarm rate.
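
The staged screening described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the lexicons, emoticon scores, negation rule, thresholds, and all function names below are hypothetical, English tokens stand in for segmented Chinese text, and the similarity-based word extraction and PoS-sequence activity risk are omitted. It only shows the general idea of filtering by lexicon, scoring sentiment, thresholding, and then checking the user's history and friend circle.

```python
from dataclasses import dataclass, field

# Hypothetical toy resources; the abstract does not give the real lexicons or weights.
VIOLENCE_LEXICON = {"bomb", "attack", "kill"}
SENTIMENT_LEXICON = {"hate": -2.0, "angry": -1.5, "love": 1.5, "happy": 1.0}
EMOTICON_SCORES = {":)": 1.0, ":(": -1.0, "lol": 1.0}
NEGATIONS = {"not", "never"}
RISK_THRESHOLD = 1.0        # assumed cut-off for flagging a post
BACKGROUND_THRESHOLD = 0.5  # assumed cut-off for the background check


@dataclass
class User:
    name: str
    history: list = field(default_factory=list)          # earlier posts
    friend_comments: list = field(default_factory=list)  # friend-circle posts


def is_violence_related(text):
    """Lexicon-based retrieval: keep posts containing a violence keyword."""
    return any(tok in VIOLENCE_LEXICON for tok in text.lower().split())


def sentiment_score(text):
    """Sum word and emoticon scores; a negation flips only the next token."""
    score, negate = 0.0, False
    for tok in text.lower().split():
        if tok in NEGATIONS:
            negate = True
            continue
        value = SENTIMENT_LEXICON.get(tok, EMOTICON_SCORES.get(tok))
        if value is not None:
            score += -value if negate else value
        negate = False
    return score


def message_risk(text):
    """Risk of one post: negative sentiment strength, if violence-related."""
    if not is_violence_related(text):
        return 0.0
    return max(0.0, -sentiment_score(text))


def background_risk(user):
    """Mean negative sentiment over the user's history and friend circle."""
    texts = user.history + user.friend_comments
    if not texts:
        return 0.0
    return sum(max(0.0, -sentiment_score(t)) for t in texts) / len(texts)


def assess(user, post):
    """Flag a post only when both the post and the background look risky."""
    if message_risk(post) < RISK_THRESHOLD:
        return "low"
    if background_risk(user) >= BACKGROUND_THRESHOLD:
        return "confirmed"
    return "likely venting"


joker = User("a", history=["love this happy day :)"], friend_comments=["lol"])
risky = User("b", history=["i hate them all", "so angry :("],
             friend_comments=["he is angry"])
print(assess(joker, "i hate mondays i will bomb the exam :("))
print(assess(risky, "i hate them i will attack"))
```

In this sketch the background check is what separates a one-off angry post from a pattern, which is the mechanism the abstract credits for lowering the false alarm rate.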

Suggested Citation

  • Yun-Fei Jia & Shan Li & Renbiao Wu, 2019. "Incorporating Background Checks with Sentiment Analysis to Identify Violence Risky Chinese Microblogs," Future Internet, MDPI, vol. 11(9), pages 1-13, September.
  • Handle: RePEc:gam:jftint:v:11:y:2019:i:9:p:200-:d:268742

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/11/9/200/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/11/9/200/
    Download Restriction: no

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.