IDEAS home Printed from https://ideas.repec.org/a/eee/infome/v1y2007i4p277-286.html

Word statistics in Blogs and RSS feeds: Towards empirical universal evidence

Author

Listed:
  • Lambiotte, R.
  • Ausloos, M.
  • Thelwall, M.

Abstract

We focus on the statistics of word occurrences and of the waiting times between such occurrences in Blogs. Due to the heterogeneity of words’ frequencies, the empirical analysis is performed by studying classes of “frequently-equivalent” words, i.e. by grouping words depending on their frequencies. Two limiting cases are considered: the dilute limit, i.e. for those words that are used less than once a day, and the dense limit for frequent words. In both cases, extreme events occur more frequently than expected from the Poisson hypothesis. These deviations from Poisson statistics reveal non-trivial time correlations between events that are associated with bursts of activities. The distribution of waiting times is shown to behave like a stretched exponential and to have the same shape for different sets of words sharing a common frequency, thereby revealing universal features.
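The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the waiting times below are synthetic, drawn from a stretched exponential whose parameters (`beta = 0.5`, `tau = 1.0`) are arbitrary assumptions, and all function names are hypothetical. It only shows how the tail of a waiting-time distribution can be checked against the exponential form expected under the Poisson hypothesis.

```python
import math
import random


def waiting_times(timestamps):
    """Gaps between consecutive occurrences of a word (timestamps in any order)."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def survival(waits, t):
    """Empirical fraction of waiting times strictly greater than t."""
    return sum(w > t for w in waits) / len(waits)


def poisson_survival(t, mean_wait):
    """Tail expected for a Poisson process with the same mean waiting time."""
    return math.exp(-t / mean_wait)


def stretched_exp_survival(t, tau, beta):
    """Stretched-exponential tail exp(-(t/tau)^beta); beta < 1 gives a heavier tail."""
    return math.exp(-((t / tau) ** beta))


# Synthetic data (an assumption for illustration, not the Blog/RSS corpus).
rng = random.Random(0)
beta, tau = 0.5, 1.0
# Inverse-transform sampling: S(t) = exp(-(t/tau)^beta)  =>  t = tau * (-ln U)^(1/beta)
gaps = [tau * (-math.log(1.0 - rng.random())) ** (1.0 / beta) for _ in range(20000)]

# Build occurrence times from the gaps, then recover the waiting times.
timestamps, now = [], 0.0
for g in gaps:
    now += g
    timestamps.append(now)
waits = waiting_times(timestamps)

mean_wait = sum(waits) / len(waits)
t_long = 5 * mean_wait  # a "long" wait, five times the average
empirical_tail = survival(waits, t_long)
poisson_tail = poisson_survival(t_long, mean_wait)
stretched_tail = stretched_exp_survival(t_long, tau, beta)

print(f"P(wait > {t_long:.2f}) empirical : {empirical_tail:.4f}")
print(f"P(wait > {t_long:.2f}) Poisson   : {poisson_tail:.4f}")
print(f"P(wait > {t_long:.2f}) stretched : {stretched_tail:.4f}")
```

On such data the empirical tail lies well above the Poisson expectation, which is the kind of excess of extreme events the abstract refers to. With real data, `tau` and `beta` would have to be fitted per frequency class rather than assumed.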

Suggested Citation

  • Lambiotte, R. & Ausloos, M. & Thelwall, M., 2007. "Word statistics in Blogs and RSS feeds: Towards empirical universal evidence," Journal of Informetrics, Elsevier, vol. 1(4), pages 277-286.
  • Handle: RePEc:eee:infome:v:1:y:2007:i:4:p:277-286
    DOI: 10.1016/j.joi.2007.07.001

    Citations

    Cited by:

    1. Chen, Long-Sheng & Liu, Cheng-Hsiang & Chiu, Hui-Ju, 2011. "A neural network based approach for sentiment classification in the blogosphere," Journal of Informetrics, Elsevier, vol. 5(2), pages 313-322.
    2. Yukie Sano & Misako Takayasu, 2010. "Macroscopic and microscopic statistical properties observed in blog entries," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 5(2), pages 221-230, December.
    3. Ficcadenti, Valerio & Cerqueti, Roy & Ausloos, Marcel & Dhesi, Gurjeet, 2020. "Words ranking and Hirsch index for identifying the core of the hapaxes in political texts," Journal of Informetrics, Elsevier, vol. 14(3).
    4. Ausloos, M., 2012. "Measuring complexity with multifractals in texts. Translation effects," Chaos, Solitons & Fractals, Elsevier, vol. 45(11), pages 1349-1357.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chen, Zhimin & Ibragimov, Rustam, 2019. "One country, two systems? The heavy-tailedness of Chinese A- and H- share markets," Emerging Markets Review, Elsevier, vol. 38(C), pages 115-141.
    2. Aslam, Faheem & Zil-e-huma & Bibi, Rashida & Ferreira, Paulo, 2022. "Cross-correlations between economic policy uncertainty and precious and industrial metals: A multifractal cross-correlation analysis," Resources Policy, Elsevier, vol. 75(C).
    3. Aslam, Faheem & Aziz, Saqib & Nguyen, Duc Khuong & Mughal, Khurrum S. & Khan, Maaz, 2020. "On the efficiency of foreign exchange markets in times of the COVID-19 pandemic," Technological Forecasting and Social Change, Elsevier, vol. 161(C).
    4. Chen, Shu-Peng & He, Ling-Yun, 2010. "Multifractal spectrum analysis of nonlinear dynamical mechanisms in China’s agricultural futures markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(7), pages 1434-1444.
    5. Aleksejus Kononovicius & Julius Ruseckas, 2014. "Nonlinear GARCH model and 1/f noise," Papers 1412.6244, arXiv.org, revised Feb 2015.
    6. Gontis, V. & Kononovicius, A., 2017. "Burst and inter-burst duration statistics as empirical test of long-range memory in the financial markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 483(C), pages 266-272.
    7. Wei-Xing Zhou, 2012. "Universal price impact functions of individual trades in an order-driven market," Quantitative Finance, Taylor & Francis Journals, vol. 12(8), pages 1253-1263, June.
    8. Yonatan Berman & Yoash Shapira & Eshel Ben-Jacob, 2014. "Unraveling Hidden Order in the Dynamics of Developed and Emerging Markets," PLOS ONE, Public Library of Science, vol. 9(11), pages 1-10, November.
    9. Taisei Kaizoji, 2013. "Modelling of Stock Returns and Trading Volume," IIM Kozhikode Society & Management Review, vol. 2(2), pages 147-155, July.
    10. Araújo, Tanya & Dias, João & Eleutério, Samuel & Louçã, Francisco, 2013. "A measure of multivariate kurtosis for the identification of the dynamics of a N-dimensional market," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(17), pages 3708-3714.
    11. Victor Olkhov, 2020. "Volatility Depend on Market Trades and Macro Theory," Papers 2008.07907, arXiv.org.
    12. Paulo L. dos Santos, 2017. "The Principle of Social Scaling," Complexity, Hindawi, vol. 2017, pages 1-9, December.
    13. Olkhov, Victor, 2019. "Econophysics of Asset Price, Return and Multiple Expectations," MPRA Paper 91587, University Library of Munich, Germany.
    14. Lux, Thomas & Alfarano, Simone, 2016. "Financial power laws: Empirical evidence, models, and mechanisms," Chaos, Solitons & Fractals, Elsevier, vol. 88(C), pages 3-18.
    15. Gabaix, Xavier & Gopikrishnan, Parameswaran & Plerou, Vasiliki & Stanley, H.Eugene, 2003. "Understanding the cubic and half-cubic laws of financial fluctuations," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 324(1), pages 1-5.
    16. Vygintas Gontis, 2021. "Order flow in the financial markets from the perspective of the Fractional Lévy stable motion," Papers 2105.02057, arXiv.org, revised Nov 2021.
    17. Ni, Xiao-Hui & Jiang, Zhi-Qiang & Gu, Gao-Feng & Ren, Fei & Chen, Wei & Zhou, Wei-Xing, 2010. "Scaling and memory in the non-Poisson process of limit order cancelation," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(14), pages 2751-2761.
    18. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
    19. Jiang, Zhi-Qiang & Zhou, Wei-Xing, 2010. "Complex stock trading network among investors," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(21), pages 4929-4941.
    20. Vygintas Gontis, 2023. "Discrete $q$-exponential limit order cancellation time distribution," Papers 2306.00093, arXiv.org, revised Oct 2023.
