Monitoring rare categories in sentiment and opinion analysis: a Milan mega event on Twitter platform

Authors

Anna Calissano (Politecnico di Milano)
Simone Vantini (Politecnico di Milano)
Marika Arena (Politecnico di Milano)

Abstract

This paper proposes a new aggregated classification scheme to support the implementation of semantic text analysis methods in contexts characterized by rare text categories. The proposed approach starts from the aggregate supervised text classifier developed by Hopkins and King and extends it with rare event sampling methods. Specifically, it enables the analyst to enlarge the number of estimated sentiment categories while preserving estimation accuracy and reducing the working time that would otherwise be needed to unconditionally increase the size of the training set. The approach is applied to study the daily evolution of the web reputation of one of the latest mega-events held in Europe: Expo Milano. The corpus consists of more than one million tweets, in both Italian and English, discussing the event. The analysis portrays the evolution of the Expo stakeholders’ opinions over time and allows the identification of the main drivers of the Expo reputation. The algorithm will be implemented as a running option in the next release of the R package ReadMe.
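As background on the aggregate approach the abstract builds on: Hopkins and King (2010) estimate category proportions directly, without classifying individual documents, by exploiting the identity P(S) = Σ_D P(S | D) P(D), where P(S | D) is measured on a hand-coded set and P(S) on the unlabeled corpus. The Python sketch below illustrates that idea only. It is a simplified assumption-laden version, not the authors' algorithm or the ReadMe implementation: it uses single-word presence rates instead of word-stem profiles, solves the system by projected gradient descent, and omits the rare-event sampling step this paper adds. All function names and numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:          # holds exactly for the leading entries
            theta = t
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(cond, marg, steps=2000):
    """Least-squares solve of cond @ p = marg with p constrained to the simplex.

    cond[w][d]: rate of word w among hand-coded documents of category d.
    marg[w]:    rate of word w in the unlabeled target corpus.
    """
    n_cat = len(cond[0])
    # Step size 1 / ||cond||_F^2 is a safe bound for gradient descent.
    step = 1.0 / sum(x * x for row in cond for x in row)
    p = [1.0 / n_cat] * n_cat
    for _ in range(steps):
        resid = [sum(a * q for a, q in zip(row, p)) - m
                 for row, m in zip(cond, marg)]
        grad = [sum(row[d] * r for row, r in zip(cond, resid))
                for d in range(n_cat)]
        p = project_to_simplex([q - step * g for q, g in zip(p, grad)])
    return p


# Made-up word rates for 4 words and 3 categories (illustration only).
cond = [
    [0.9, 0.2, 0.1],
    [0.1, 0.8, 0.3],
    [0.2, 0.1, 0.9],
    [0.5, 0.5, 0.5],
]
true_p = [0.70, 0.25, 0.05]   # the third category is the rare one
marg = [sum(a * q for a, q in zip(row, true_p)) for row in cond]

est = estimate_proportions(cond, marg)
print([round(x, 3) for x in est])
```

In this noise-free toy case the estimate should recover the proportions used to build `marg`, including the rare third category. With real hand-coded data, the word rates of a rare category are estimated from very few documents, which is the difficulty the abstract says the proposed sampling scheme addresses.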

Suggested Citation

Anna Calissano & Simone Vantini & Marika Arena, 2020. "Monitoring rare categories in sentiment and opinion analysis: a Milan mega event on Twitter platform," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(4), pages 787-812, December.

Handle: RePEc:spr:stmapp:v:29:y:2020:i:4:d:10.1007_s10260-019-00504-7
DOI: 10.1007/s10260-019-00504-7

References listed on IDEAS

Grimmer, Justin & Stewart, Brandon M., 2013. "Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts," Political Analysis, Cambridge University Press, vol. 21(3), pages 267-297, July.
Sanjiv R. Das & Mike Y. Chen, 2007. "Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web," Management Science, INFORMS, vol. 53(9), pages 1375-1388, September.
Margaret E. Roberts & Brandon M. Stewart & Edoardo M. Airoldi, 2016. "A Model of Text for Experimentation in the Social Sciences," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(515), pages 988-1003, July.
Martin, Lanny W. & Vanberg, Georg, 2008. "A Robust Transformation Procedure for Interpreting Political Text," Political Analysis, Cambridge University Press, vol. 16(1), pages 93-100, January.
Jonathan B. Slapin & Sven‐Oliver Proksch, 2008. "A Scaling Model for Estimating Time‐Series Party Positions from Texts," American Journal of Political Science, John Wiley & Sons, vol. 52(3), pages 705-722, July.
King, Gary & Zeng, Langche, 2001. "Logistic Regression in Rare Events Data," Political Analysis, Cambridge University Press, vol. 9(2), pages 137-163, January.
Laver, Michael & Benoit, Kenneth & Garry, John, 2003. "Extracting Policy Positions from Political Texts Using Words as Data," American Political Science Review, Cambridge University Press, vol. 97(2), pages 311-331, May.
Daniel J. Hopkins & Gary King, 2010. "A Method of Automated Nonparametric Content Analysis for Social Science," American Journal of Political Science, John Wiley & Sons, vol. 54(1), pages 229-247, January.
Lowe, Will, 2008. "Understanding Wordscores," Political Analysis, Cambridge University Press, vol. 16(4), pages 356-371.
Scott Deerwester & Susan T. Dumais & George W. Furnas & Thomas K. Landauer & Richard Harshman, 1990. "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 41(6), pages 391-407, September.
Michael Salter-Townshend & Thomas Murphy, 2014. "Mixtures of biased sentiment analysers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 85-103, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Pierre-Marc Daigneault & Dominic Duval & Louis M. Imbeau, 2018. "Supervised scaling of semi-structured interview transcripts to characterize the ideology of a social policy reform," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(5), pages 2151-2162, September.
Martin Haselmayer & Marcelo Jenny, 2017. "Sentiment analysis of political communication: combining a dictionary approach with crowdcoding," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(6), pages 2623-2646, November.
van Loon, Austin, 2022. "Three Families of Automated Text Analysis," SocArXiv htnej, Center for Open Science.
Alastair Langtry & Niklas Potrafke & Marcel Schlepper & Timo Wochner, 2024. "Gambling for Re-election," CESifo Working Paper Series 11125, CESifo.
Pongsak Luangaram & Yuthana Sethapramote, 2016. "Central Bank Communication and Monetary Policy Effectiveness: Evidence from Thailand," PIER Discussion Papers 20, Puey Ungphakorn Institute for Economic Research.
Hanna Bäck & Marc Debus & Wolfgang C. Müller, 2016. "Intra-party diversity and ministerial selection in coalition governments," Public Choice, Springer, vol. 166(3), pages 355-378, March.
Diaf, Sami & Döpke, Jörg & Fritsche, Ulrich & Rockenbach, Ida, 2022. "Sharks and minnows in a shoal of words: Measuring latent ideological positions based on text mining techniques," European Journal of Political Economy, Elsevier, vol. 75(C).
Adriana Bunea & Raimondas Ibenskas, 2015. "Quantitative text analysis and the study of EU lobbying and interest groups," European Union Politics, vol. 16(3), pages 429-455, September.
Rebecca Cordell & Kristian Skrede Gleditsch & Florian G Kern & Laura Saavedra-Lux, 2020. "Measuring institutional variation across American Indian constitutions using automated content analysis," Journal of Peace Research, Peace Research Institute Oslo, vol. 57(6), pages 777-788, November.
Feldkircher, Martin & Hofmarcher, Paul & Siklos, Pierre L., 2024. "One money, one voice? Evaluating ideological positions of euro area central banks," European Journal of Political Economy, Elsevier, vol. 85(C).
Caroline Le Pennec, 2024. "Strategic Campaign Communication: Evidence from 30,000 Candidate Manifestos," The Economic Journal, Royal Economic Society, vol. 134(658), pages 785-810.
- Caroline Le Pennec, 2020. "Strategic Campaign Communication: Evidence from 30,000 Candidate Manifestos," SoDa Laboratories Working Paper Series 2020-05, Monash University, SoDa Laboratories.
Auffenberg, Jennie & Marcinkiewicz, Kamil, 2013. "Wer gestaltet, wer verwaltet Reformen im öffentlichen Dienst? Ein Methodenvergleich zur Analyse von Arbeitsbeziehungen in Reformprozessen anhand der Polizei Brandenburg," TranState Working Papers 170, University of Bremen, Collaborative Research Center 597: Transformations of the State.
Greene, Zac & Ceron, Andrea & Schumacher, Gijs & Fazekas, Zoltan, 2016. "The Nuts and Bolts of Automated Text Analysis. Comparing Different Document Pre-Processing Techniques in Four Countries," OSF Preprints ghxj8, Center for Open Science.
Born, Andreas & Janssen, Aljoscha, 2020. "Does a District-Vote Matter for the Behavior of Politicians? A Textual Analysis of Parliamentary Speeches," Working Paper Series 1320, Research Institute of Industrial Economics.
Weifeng Zhong, 2016. "The candidates in their own words: A textual analysis of 2016 president primary debates," AEI Economic Perspectives, American Enterprise Institute, April.
Emile du Plessis, 2025. "Can Text-Based Statistical Models Reveal Impending Banking Crises?," Computational Economics, Springer;Society for Computational Economics, vol. 65(3), pages 1265-1298, March.
Marc Debus, 2009. "Pre-electoral commitments and government formation," Public Choice, Springer, vol. 138(1), pages 45-64, January.
Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
