
Mining Media Topics Perceived as Social Problems by Online Audiences: Use of a Data Mining Approach in Sociology


  • Oleg S. Nagornyy

    (National Research University Higher School of Economics)

  • Olessia Y. Koltsova

    (National Research University Higher School of Economics)


Media audiences that represent a significant part of a country's public may hold opinions on media-generated definitions of social problems that differ from those of media professionals. The proliferation of user-generated content makes such opinions available, but it also demands new automatic methods of analysis that media scholars have yet to master. In this paper, we show how social scientists can reveal and analyze topics that media consumers regard as problematic by combining data mining methods. Our dataset consists of 33,877 news items and 258,121 comments from a sample of regional newspapers. Using a number of new but simple indices, we find that an issue's salience in media texts and its popularity with the audience diverge. We conclude that our approach can help communication scholars effectively detect both popular and negatively perceived topics as good proxies of social problems.
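The abstract does not define its indices, so the following is only an illustrative sketch of how salience and audience popularity could be compared per topic. It assumes each news item already carries a topic label (for example from a topic model), a comment count, and a mean comment sentiment score. The function name `topic_indices`, the three index names, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def topic_indices(items):
    """Compute simple per-topic indices (hypothetical, not the paper's own).

    items: iterable of (topic, n_comments, mean_comment_sentiment),
           one tuple per news item.
    Returns {topic: {"salience", "popularity", "mean_sentiment"}} where
      salience       = share of all news items assigned to the topic
      popularity     = share of all comments attracted by the topic
      mean_sentiment = comment-weighted average sentiment for the topic
    """
    items = list(items)
    total_items = len(items)
    total_comments = sum(n for _, n, _ in items)

    n_items = defaultdict(int)
    n_comments = defaultdict(int)
    sentiment_sum = defaultdict(float)
    for topic, n, sentiment in items:
        n_items[topic] += 1
        n_comments[topic] += n
        sentiment_sum[topic] += sentiment * n  # weight by comment count

    result = {}
    for topic in n_items:
        c = n_comments[topic]
        result[topic] = {
            "salience": n_items[topic] / total_items,
            "popularity": c / total_comments if total_comments else 0.0,
            "mean_sentiment": sentiment_sum[topic] / c if c else 0.0,
        }
    return result


# Toy data (invented for illustration): topic, comments, mean sentiment.
toy_items = [
    ("roads", 40, -0.5),
    ("roads", 20, -0.2),
    ("sports", 5, 0.3),
    ("sports", 5, 0.1),
    ("sports", 10, 0.2),
    ("culture", 20, 0.0),
]
indices = topic_indices(toy_items)

# Topics whose share of comments exceeds their share of news items and
# whose comments are negative on average: candidate "social problems".
candidates = sorted(
    t for t, v in indices.items()
    if v["popularity"] > v["salience"] and v["mean_sentiment"] < 0
)
```

In this toy data, "roads" accounts for a third of the news items but most of the comments, and those comments are negative on average, so it is the only topic flagged. This is the kind of divergence between media salience and audience attention that the abstract describes.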

Suggested Citation

  • Oleg S. Nagornyy & Olessia Y. Koltsova, 2017. "Mining Media Topics Perceived as Social Problems by Online Audiences: Use of a Data Mining Approach in Sociology," HSE Working papers WP BRP 74/SOC/2017, National Research University Higher School of Economics.
  • Handle: RePEc:hig:wpaper:74/soc/2017



    More about this item

    Keywords:

    social problem; online media; topic modeling; sentiment analysis; Russia

    JEL classification:

    • Z - Other Special Topics
