
Words ranking and Hirsch index for identifying the core of the hapaxes in political texts

Author

Listed:
  • Ficcadenti, Valerio
  • Cerqueti, Roy
  • Ausloos, Marcel
  • Dhesi, Gurjeet

Abstract

This paper presents a quantitative analysis of the content of official political speeches. We study a set of about one thousand speeches delivered by US Presidents, from Washington to Trump. In particular, we investigate the relevance of rare words, i.e. those said only once in each speech, the so-called hapaxes. We implement a rank-size procedure of Zipf–Mandelbrot type to discuss the regularity of the hapaxes’ frequencies over the whole set of speeches. Starting from the obtained rank-size law, we define and detect the core of the hapax set by means of a procedure based on a variant of the Hirsch index. We discuss the resulting list of words in the light of the US Presidents’ speeches as a whole. We further show that this core of hapaxes can itself be well fitted by a Zipf–Mandelbrot law and that it contains elements producing deviations between the scatter plot and the fitted curve at the low ranks, the so-called king and vice-roy effect. Some socio-political insights into the US Presidents’ messages are derived from these findings.
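
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the function names, the classical h-index rule (largest rank r whose frequency is at least r) and the three-parameter form a / (r + b)^c of the Zipf–Mandelbrot law are all assumptions made for illustration, since the abstract does not specify the exact Hirsch index variant or the parametrization used in the paper.

```python
from collections import Counter
import re


def hapaxes(text):
    """Return the set of words occurring exactly once in a single speech.

    The regex tokenizer is an assumption; the paper's preprocessing may differ.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {word for word, count in counts.items() if count == 1}


def hapax_frequencies(speeches):
    """Count, for each word, the number of speeches in which it is a hapax."""
    freq = Counter()
    for speech in speeches:
        freq.update(hapaxes(speech))
    return freq


def h_core(freq):
    """Apply the classical h-index rule to the ranked hapax frequencies.

    Returns (h, core), where h is the largest rank r such that the word at
    rank r has frequency >= r, and core is the list of the top h words.
    The paper uses a variant of this rule that is not reproduced here.
    """
    ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
    h = 0
    for rank, (_, frequency) in enumerate(ranked, start=1):
        if frequency >= rank:
            h = rank
        else:
            break
    return h, ranked[:h]


def zipf_mandelbrot(rank, a, b, c):
    """Zipf-Mandelbrot rank-size law, f(r) = a / (r + b)**c (assumed form)."""
    return a / (rank + b) ** c


# Toy corpus, invented for illustration only.
speeches = [
    "liberty and union and peace",
    "liberty or peace or war",
    "liberty for all",
]
freq = hapax_frequencies(speeches)
h, core = h_core(freq)
```

On this toy corpus, "liberty" is a hapax in three speeches and "peace" in two, so the core has size 2. In practice the parameters a, b and c would be estimated by a nonlinear least-squares fit of zipf_mandelbrot to the ranked frequencies.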

Suggested Citation

  • Ficcadenti, Valerio & Cerqueti, Roy & Ausloos, Marcel & Dhesi, Gurjeet, 2020. "Words ranking and Hirsch index for identifying the core of the hapaxes in political texts," Journal of Informetrics, Elsevier, vol. 14(3).
  • Handle: RePEc:eee:infome:v:14:y:2020:i:3:s1751157719303037
    DOI: 10.1016/j.joi.2020.101054

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157719303037
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2020.101054?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Papadimitriou, C. & Karamanos, K. & Diakonos, F.K. & Constantoudis, V. & Papageorgiou, H., 2010. "Entropy analysis of natural language written texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(16), pages 3260-3266.
    2. J. E. Hirsch, 2010. "An index to quantify an individual’s scientific research output that takes into account the effect of multiple coauthorship," Scientometrics, Springer;Akadémiai Kiadó, vol. 85(3), pages 741-754, December.
    3. Guns, Raf & Rousseau, Ronald, 2009. "Real and rational variants of the h-index and the g-index," Journal of Informetrics, Elsevier, vol. 3(1), pages 64-71.
    4. Lambiotte, R. & Ausloos, M. & Thelwall, M., 2007. "Word statistics in Blogs and RSS feeds: Towards empirical universal evidence," Journal of Informetrics, Elsevier, vol. 1(4), pages 277-286.
    5. Ausloos, M., 2012. "Measuring complexity with multifractals in texts. Translation effects," Chaos, Solitons & Fractals, Elsevier, vol. 45(11), pages 1349-1357.
    6. Schreiber, Michael, 2010. "A new family of old Hirsch index variants," Journal of Informetrics, Elsevier, vol. 4(4), pages 647-651.
    7. Marcel Ausloos & Roy Cerqueti, 2016. "A Universal Rank-Size Law," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-15, November.
    8. Ausloos, M., 2010. "Punctuation effects in english and esperanto texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(14), pages 2835-2840.
    9. M. Ausloos, 2013. "A scientometrics law about co-authors and their ranking: the co-author core," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(3), pages 895-909, June.
    10. Cuiqing Jiang & Zhao Wang & Ruiya Wang & Yong Ding, 2018. "Loan default prediction by combining soft information extracted from descriptive text in online peer-to-peer lending," Annals of Operations Research, Springer, vol. 266(1), pages 511-529, July.
    11. Ausloos, Marcel, 2015. "Coherent measures of the impact of co-authors in peer review journals and in proceedings publications," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 438(C), pages 568-578.
    12. Cerqueti, Roy & Ausloos, Marcel, 2015. "Evidence of economic regularities and disparities of Italian regions from aggregated tax income size data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 421(C), pages 187-207.
    13. Ausloos, M., 2008. "Equilibrium and dynamic methods when comparing an English text and its Esperanto translation," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 387(25), pages 6411-6420.
    14. James Cochran & David Curry & Rajesh Radhakrishnan & Jon Pinnell, 2014. "Political engineering: optimizing a U.S. Presidential candidate’s platform," Annals of Operations Research, Springer, vol. 215(1), pages 63-87, April.
    15. Yoon, Hyui Geon & Kim, Hyungjun & Kim, Chang Ouk & Song, Min, 2016. "Opinion polarity detection in Twitter data combining shrinkage regression and topic modeling," Journal of Informetrics, Elsevier, vol. 10(2), pages 634-644.
    16. Deng, Weibing & Pato, Mauricio Porto, 2017. "Approaching word length distribution via level spectra," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 481(C), pages 167-175.

    Citations

    Cited by:

    1. Cerqueti, Roy & Lupi, Claudio & Pietrovito, Filomena & Pozzolo, Alberto Franco, 2022. "Rank–size distributions for banks: A cross-country analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 585(C).
    2. Valerio Ficcadenti & Roy Cerqueti & Ciro Hosseini Varde’i, 2023. "A rank-size approach to analyse soccer competitions and teams: the case of the Italian football league “Serie A”," Annals of Operations Research, Springer, vol. 325(1), pages 85-113, June.
    3. Chacoma, Andrés & Zanette, Damián H., 2021. "Word frequency–rank relationship in tagged texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 574(C).
    4. Cerqueti, Roy & Ficcadenti, Valerio, 2022. "Combining rank-size and k-means for clustering countries over the COVID-19 new deaths per million," Chaos, Solitons & Fractals, Elsevier, vol. 158(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cerqueti, Roy & Lupi, Claudio & Pietrovito, Filomena & Pozzolo, Alberto Franco, 2022. "Rank–size distributions for banks: A cross-country analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 585(C).
    2. Vieira, Denner S. & Picoli, Sergio & Mendes, Renio S., 2018. "Robustness of sentence length measures in written texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 506(C), pages 749-754.
    3. Ausloos, M., 2012. "Measuring complexity with multifractals in texts. Translation effects," Chaos, Solitons & Fractals, Elsevier, vol. 45(11), pages 1349-1357.
    4. Claudiu Herteliu & Marcel Ausloos & Bogdan Vasile Ileanu & Giulia Rotundo & Tudorel Andrei, 2017. "Quantitative and Qualitative Analysis of Editor Behavior through Potentially Coercive Citations," Publications, MDPI, vol. 5(2), pages 1-16, June.
    5. Serge Galam, 2011. "Tailor based allocations for multiple authorship: a fractional gh-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(1), pages 365-379, October.
    6. Ausloos, Marcel, 2015. "Coherent measures of the impact of co-authors in peer review journals and in proceedings publications," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 438(C), pages 568-578.
    7. Rotundo, Giulia, 2014. "Black–Scholes–Schrödinger–Zipf–Mandelbrot model framework for improving a study of the coauthor core score," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 404(C), pages 296-301.
    8. Stanisz, Tomasz & Drożdż, Stanisław & Kwapień, Jarosław, 2023. "Universal versus system-specific features of punctuation usage patterns in major Western languages," Chaos, Solitons & Fractals, Elsevier, vol. 168(C).
    9. Hassan Bougrine, 2014. "Subfield effects on the core of coauthors," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 1047-1064, February.
    10. Tingcan Ma & Gui-Fang Wang & Ke Dong & Mukun Cao, 2012. "The Journal’s Integrated Impact Index: a new indicator for journal evaluation," Scientometrics, Springer;Akadémiai Kiadó, vol. 90(2), pages 649-658, February.
    11. Valerio Ficcadenti & Roy Cerqueti & Ciro Hosseini Varde’i, 2023. "A rank-size approach to analyse soccer competitions and teams: the case of the Italian football league “Serie A”," Annals of Operations Research, Springer, vol. 325(1), pages 85-113, June.
    12. Ausloos, Marcel, 2020. "Rank–size law, financial inequality indices and gain concentrations by cyclist teams. The case of a multiple stage bicycle race, like Tour de France," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 540(C).
    13. Marcel Ausloos, 2014. "Binary scientific star coauthors core size," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(2), pages 331-351, May.
    14. Marcel Ausloos & Roy Cerqueti, 2016. "Studies on Regional Wealth Inequalities: the case of Italy," Papers 1602.05356, arXiv.org.
    15. J. E. Hirsch, 2019. "hα: An index to quantify an individual’s scientific leadership," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(2), pages 673-686, February.
    16. Miśkiewicz, Janusz, 2013. "Effects of publications in proceedings on the measure of the core size of coauthors," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(20), pages 5119-5131.
    17. Dašić Predrag, 2015. "State and Analysis of Scientific Journals in the Field of “Economic Sciences” for the Period 1995-2014," Economic Themes, Sciendo, vol. 53(4), pages 547-581, December.
    18. Cerqueti, Roy & Ficcadenti, Valerio, 2022. "Combining rank-size and k-means for clustering countries over the COVID-19 new deaths per million," Chaos, Solitons & Fractals, Elsevier, vol. 158(C).
    19. Liu, Xuan Zhen & Fang, Hui, 2012. "Modifying h-index by allocating credit of multi-authored papers whose author names rank based on contribution," Journal of Informetrics, Elsevier, vol. 6(4), pages 557-565.
    20. Nicola Giuseppe Castellano & Roy Cerqueti & Bruno Maria Franceschetti, 2021. "Evaluating risks-based communities of Mafia companies: a complex networks perspective," Review of Quantitative Finance and Accounting, Springer, vol. 57(4), pages 1463-1486, November.
