
Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words

Author

Listed:
  • Eduardo G Altmann
  • Janet B Pierrehumbert
  • Adilson E Motter

Abstract

Background: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well.

Methodology/Principal Findings: By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type – a measure of the logicality of each word – and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage.

Conclusions/Significance: Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics.
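
Illustration (not part of the published abstract): the quantity the abstract studies, the distance between successive occurrences of the same word, can be sketched in a few lines of Python. The function names, the plotting-position formula, and the synthetic samples below are assumptions made for illustration, and the least-squares fit on a Weibull plot is a generic textbook technique, not necessarily the authors' fitting procedure. It assumes the usual stretched exponential survival form S(tau) = exp(-(tau / a) ** beta), where beta = 1 corresponds to a Poisson process and beta < 1 indicates bursty recurrence.

```python
import math
import random


def recurrence_distances(tokens, word):
    """Distances, in tokens, between successive occurrences of `word`."""
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def stretch_exponent(distances):
    """Least-squares slope of the Weibull plot (hypothetical helper).

    Assumes the survival function S(tau) = exp(-(tau / a) ** beta), so that
    ln(-ln S(tau)) = beta * ln(tau) - beta * ln(a). The slope estimates beta.
    """
    taus = sorted(distances)
    n = len(taus)
    if n < 2 or taus[0] <= 0:
        raise ValueError("need at least two positive distances")
    xs, ys = [], []
    for rank, tau in enumerate(taus, start=1):
        # Empirical survival at a plotting position strictly inside (0, 1).
        survival = 1.0 - (rank - 0.5) / n
        xs.append(math.log(tau))
        ys.append(math.log(-math.log(survival)))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all distances are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Tiny hand-checkable example: "a" occurs at positions 0, 2 and 5.
print(recurrence_distances("a b a c c a".split(), "a"))  # → [2, 3]

# Synthetic recurrence times (assumed for illustration, not data from the paper).
rng = random.Random(0)
poisson_like = [rng.expovariate(1 / 50) for _ in range(5000)]  # exponential gaps
bursty = [rng.weibullvariate(50, 0.5) for _ in range(5000)]    # shape beta = 0.5
print(stretch_exponent(poisson_like))  # expected to be close to 1
print(stretch_exponent(bursty))        # expected to be close to 0.5
```

On real text, one would pass the output of recurrence_distances for a frequent word to stretch_exponent. A fitted exponent well below 1 would be consistent with the bursts and lulls the abstract describes, though a regression on a Weibull plot is only a rough diagnostic.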

Suggested Citation

  • Eduardo G Altmann & Janet B Pierrehumbert & Adilson E Motter, 2009. "Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words," PLOS ONE, Public Library of Science, vol. 4(11), pages 1-7, November.
  • Handle: RePEc:plo:pone00:0007678
    DOI: 10.1371/journal.pone.0007678

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0007678
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0007678&type=printable
    Download Restriction: no


    Citations



    Cited by:

    1. Kumiko Tanaka-Ishii & Armin Bunde, 2016. "Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-14, November.
    2. Heng Chen & Haitao Liu, 2018. "Quantifying Evolution of Short and Long-Range Correlations in Chinese Narrative Texts across 2000 Years," Complexity, Hindawi, vol. 2018, pages 1-12, February.
    3. Chen, Yanguang, 2012. "Zipf’s law, 1/f noise, and fractal hierarchy," Chaos, Solitons & Fractals, Elsevier, vol. 45(1), pages 63-73.
    4. Yuan, Qianshun & Semba, Sherehe & Zhang, Jing & Weng, Tongfeng & Gu, Changgui & Yang, Huijie, 2021. "Multi-scale transition matrix approach to time series," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 578(C).
    5. Yue Yang & Changgui Gu & Qin Xiao & Huijie Yang, 2017. "Evolution of scaling behaviors embedded in sentence series from A Story of the Stone," PLOS ONE, Public Library of Science, vol. 12(2), pages 1-14, February.
    6. Rashidisabet, Homa & Ajilore, Olusola & Leow, Alex & Demos, Alexander P., 2022. "Revisiting power-law estimation with applications to real-world human typing dynamics," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 599(C).
    7. Karain, Wael I., 2019. "Investigating large-amplitude protein loop motions as extreme events using recurrence interval analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 520(C), pages 1-10.
    8. Criado-Alonso, Ángeles & Aleja, David & Romance, Miguel & Criado, Regino, 2022. "Derivative of a hypergraph as a tool for linguistic pattern analysis," Chaos, Solitons & Fractals, Elsevier, vol. 163(C).
    9. Serguei Saavedra & Jordi Duch & Brian Uzzi, 2011. "Tracking Traders' Understanding of the Market Using e-Communication Data," PLOS ONE, Public Library of Science, vol. 6(10), pages 1-7, October.
    10. Vieira, Denner S. & Picoli, Sergio & Mendes, Renio S., 2018. "Robustness of sentence length measures in written texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 506(C), pages 749-754.
    11. Cui, Xue-Mei & Yoon, Chang No & Youn, Hyejin & Lee, Sang Hoon & Jung, Jean S. & Han, Seung Kee, 2017. "Dynamic burstiness of word-occurrence and network modularity in textbook systems," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 487(C), pages 103-110.
    12. Zörnig, Peter, 2010. "Statistical simulation and the distribution of distances between identical elements in a random sequence," Computational Statistics & Data Analysis, Elsevier, vol. 54(10), pages 2317-2327, October.
    13. Chen, Yanguang & Wang, Jiejing, 2014. "Recursive subdivision of urban space and Zipf’s law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 395(C), pages 392-404.
    14. Chen, Yanguang, 2012. "The mathematical relationship between Zipf’s law and the hierarchical scaling law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 391(11), pages 3285-3299.
    15. Ghosh, Dipak & Chakraborty, Sayantan & Samanta, Shukla, 2019. "Study of translational effect in Tagore’s Gitanjali using Chaos based Multifractal analysis technique," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 1343-1354.
    16. Shuntaro Takahashi & Kumiko Tanaka-Ishii, 2017. "Do neural nets learn statistical laws behind natural language?," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-17, December.
    17. Eduardo G Altmann & Janet B Pierrehumbert & Adilson E Motter, 2011. "Niche as a Determinant of Word Fate in Online Groups," PLOS ONE, Public Library of Science, vol. 6(5), pages 1-12, May.
