
Word statistics in Blogs and RSS feeds: Towards empirical universal evidence

Authors
  • Lambiotte, R.
  • Ausloos, M.
  • Thelwall, M.

Abstract

We focus on the statistics of word occurrences and of the waiting times between such occurrences in Blogs. Due to the heterogeneity of words’ frequencies, the empirical analysis is performed by studying classes of “frequently-equivalent” words, i.e. by grouping words depending on their frequencies. Two limiting cases are considered: the dilute limit, i.e. for those words that are used less than once a day, and the dense limit for frequent words. In both cases, extreme events occur more frequently than expected from the Poisson hypothesis. These deviations from Poisson statistics reveal non-trivial time correlations between events that are associated with bursts of activities. The distribution of waiting times is shown to behave like a stretched exponential and to have the same shape for different sets of words sharing a common frequency, thereby revealing universal features.
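
The comparison with the Poisson hypothesis described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the power-of-two frequency classes, and the rescaling of each word's waiting times by its own mean are assumptions made only for illustration. For a Poisson process the waiting times are exponentially distributed, so their coefficient of variation is 1 and the fraction of waiting times longer than t is exp(-t / mean).

```python
import math
from collections import defaultdict


def waiting_times(timestamps):
    """Return the gaps between consecutive occurrences of one word."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def pool_by_frequency(occurrences):
    """Pool rescaled waiting times of words with comparable frequencies.

    `occurrences` maps each word to its list of occurrence times.
    Assumption: class k holds words with 2**k <= count < 2**(k + 1).
    """
    pooled = defaultdict(list)
    for timestamps in occurrences.values():
        gaps = waiting_times(timestamps)
        if not gaps:
            continue  # a word seen once has no waiting time
        mean_gap = sum(gaps) / len(gaps)
        if mean_gap == 0:
            continue  # all occurrences share one timestamp
        k = len(timestamps).bit_length() - 1  # floor(log2(count))
        pooled[k].extend(gap / mean_gap for gap in gaps)
    return dict(pooled)


def coefficient_of_variation(gaps):
    """Std/mean of the gaps; close to 1 for exponential (Poisson) waiting times."""
    mean = sum(gaps) / len(gaps)
    variance = sum((gap - mean) ** 2 for gap in gaps) / len(gaps)
    return math.sqrt(variance) / mean


def tail_excess(gaps, t):
    """Empirical P(gap > t) minus the Poisson prediction exp(-t / mean)."""
    mean = sum(gaps) / len(gaps)
    empirical = sum(1 for gap in gaps if gap > t) / len(gaps)
    return empirical - math.exp(-t / mean)


# Toy timestamps (hypothetical data, not taken from the paper).
regular = waiting_times([0, 1, 2, 3])
bursty = waiting_times([0, 1, 2, 100, 101, 102])
print(coefficient_of_variation(regular))
print(coefficient_of_variation(bursty))
print(tail_excess(bursty, 90))
```

A coefficient of variation above 1 together with a positive tail excess indicates that long waiting times occur more often than a Poisson process with the same mean would produce, which is the kind of deviation the abstract attributes to bursts of activity. The sketch does not attempt the stretched-exponential fit reported in the paper.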

Suggested Citation

  • Lambiotte, R. & Ausloos, M. & Thelwall, M., 2007. "Word statistics in Blogs and RSS feeds: Towards empirical universal evidence," Journal of Informetrics, Elsevier, vol. 1(4), pages 277-286.
  • Handle: RePEc:eee:infome:v:1:y:2007:i:4:p:277-286
    DOI: 10.1016/j.joi.2007.07.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157707000582
    Download Restriction: Full text for ScienceDirect subscribers only

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Chen, Long-Sheng & Liu, Cheng-Hsiang & Chiu, Hui-Ju, 2011. "A neural network based approach for sentiment classification in the blogosphere," Journal of Informetrics, Elsevier, vol. 5(2), pages 313-322.
    2. Yukie Sano & Misako Takayasu, 2010. "Macroscopic and microscopic statistical properties observed in blog entries," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 5(2), pages 221-230, December.
    3. Ficcadenti, Valerio & Cerqueti, Roy & Ausloos, Marcel & Dhesi, Gurjeet, 2020. "Words ranking and Hirsch index for identifying the core of the hapaxes in political texts," Journal of Informetrics, Elsevier, vol. 14(3).
    4. Ausloos, M., 2012. "Measuring complexity with multifractals in texts. Translation effects," Chaos, Solitons & Fractals, Elsevier, vol. 45(11), pages 1349-1357.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kononovicius, A. & Ruseckas, J., 2015. "Nonlinear GARCH model and 1/f noise," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 427(C), pages 74-81.
    2. Chen, Zhimin & Ibragimov, Rustam, 2019. "One country, two systems? The heavy-tailedness of Chinese A- and H- share markets," Emerging Markets Review, Elsevier, vol. 38(C), pages 115-141.
    3. Andria, Joseph & di Tollo, Giacomo & Kalda, Jaan, 2022. "The predictive power of power-laws: An empirical time-arrow based investigation," Chaos, Solitons & Fractals, Elsevier, vol. 162(C).
    4. Wei-Xing Zhou, 2012. "Universal price impact functions of individual trades in an order-driven market," Quantitative Finance, Taylor & Francis Journals, vol. 12(8), pages 1253-1263, June.
    5. Gabaix, Xavier & Gopikrishnan, Parameswaran & Plerou, Vasiliki & Eugene Stanley, H., 2008. "Quantifying and understanding the economics of large financial movements," Journal of Economic Dynamics and Control, Elsevier, vol. 32(1), pages 303-319, January.
    6. Zheng, Zeyu & Gui, Jun & Qiao, Zhi & Fu, Yang & Stanley, H.Eugene & Li, Baowen, 2019. "New dynamics between volume and volatility," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 525(C), pages 1343-1350.
    7. Aslam, Faheem & Zil-e-huma, & Bibi, Rashida & Ferreira, Paulo, 2022. "Cross-correlations between economic policy uncertainty and precious and industrial metals: A multifractal cross-correlation analysis," Resources Policy, Elsevier, vol. 75(C).
    8. Aslam, Faheem & Aziz, Saqib & Nguyen, Duc Khuong & Mughal, Khurrum S. & Khan, Maaz, 2020. "On the efficiency of foreign exchange markets in times of the COVID-19 pandemic," Technological Forecasting and Social Change, Elsevier, vol. 161(C).
    9. Restocchi, Valerio & McGroarty, Frank & Gerding, Enrico, 2019. "Statistical properties of volume and calendar effects in prediction markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 1150-1160.
    10. Taisei Kaizoji, 2013. "Modeling of Stock Returns and Trading Volume," Papers 1309.2416, arXiv.org.
    11. Chen, Shu-Peng & He, Ling-Yun, 2010. "Multifractal spectrum analysis of nonlinear dynamical mechanisms in China’s agricultural futures markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(7), pages 1434-1444.
    12. Aleksejus Kononovicius & Julius Ruseckas, 2014. "Nonlinear GARCH model and 1/f noise," Papers 1412.6244, arXiv.org, revised Feb 2015.
    13. Lux, Thomas, 2006. "Financial power laws: Empirical evidence, models, and mechanism," Economics Working Papers 2006-12, Christian-Albrechts-University of Kiel, Department of Economics.
    14. Gontis, V. & Kononovicius, A., 2017. "Burst and inter-burst duration statistics as empirical test of long-range memory in the financial markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 483(C), pages 266-272.
    15. Troy Tassier, 2013. "Handbook of Research on Complexity, by J. Barkley Rosser, Jr. and Edward Elgar," Eastern Economic Journal, Palgrave Macmillan;Eastern Economic Association, vol. 39(1), pages 132-133.
    16. Yonatan Berman & Yoash Shapira & Eshel Ben-Jacob, 2014. "Unraveling Hidden Order in the Dynamics of Developed and Emerging Markets," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-10, November.
    17. Taisei Kaizoji, 2013. "Modelling of Stock Returns and Trading Volume," IIM Kozhikode Society & Management Review, vol. 2(2), pages 147-155, July.
    18. Araújo, Tanya & Dias, João & Eleutério, Samuel & Louçã, Francisco, 2013. "A measure of multivariate kurtosis for the identification of the dynamics of a N-dimensional market," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(17), pages 3708-3714.
    19. Victor Olkhov, 2020. "Volatility Depends on Market Trades and Macro Theory," Papers 2008.07907, arXiv.org, revised Jun 2024.
