
Zipf’s Law Arises Naturally When There Are Underlying, Unobserved Variables

Author

Listed:
  • Laurence Aitchison
  • Nicola Corradi
  • Peter E Latham

Abstract

Zipf’s law, which states that the probability of an observation is inversely proportional to its rank, has been observed in many domains. While there are models that explain Zipf’s law in each of them, those explanations are typically domain specific. Recently, methods from statistical physics were used to show that a fairly broad class of models does provide a general explanation of Zipf’s law. This explanation rests on the observation that real-world data is often generated from underlying causes, known as latent variables. Those latent variables mix together multiple models that do not obey Zipf’s law, giving a model that does. Here we extend that work both theoretically and empirically. Theoretically, we provide a far simpler and more intuitive explanation of Zipf’s law, which at the same time considerably extends the class of models to which this explanation can apply. Furthermore, we also give methods for verifying whether this explanation applies to a particular dataset. Empirically, these advances allowed us to extend this explanation to important classes of data, including word frequencies (the first domain in which Zipf’s law was discovered), data with variable sequence length, and multi-neuron spiking activity.

Author Summary: Datasets ranging from word frequencies to neural activity all have a seemingly unusual property, known as Zipf’s law: when observations (e.g., words) are ranked from most to least frequent, the frequency of an observation is inversely proportional to its rank. Here we demonstrate that a single, general principle underlies Zipf’s law in a wide variety of domains, by showing that models in which there is a latent, or hidden, variable controlling the observations can, and sometimes must, give rise to Zipf’s law. We illustrate this mechanism in three domains: word frequency, data with variable sequence length, and neural data.
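
The mixing mechanism described in the abstract can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes binary strings whose length is a latent variable drawn uniformly from 1 to `MAX_LEN`, with uniformly random bits given the length. The names `mixture_probs`, `zipf_spread`, and `MAX_LEN` are hypothetical. Under an exact Zipf law the product rank × probability is constant, so the code reports how much that product varies across all ranks.

```python
def mixture_probs(max_len):
    """Exact probability of every binary string when the length n is a latent
    variable drawn uniformly from 1..max_len (an assumption of this sketch)
    and the bits are uniformly random given n."""
    probs = []
    for n in range(1, max_len + 1):
        p = (1.0 / max_len) * 2.0 ** (-n)  # P(n) * P(string | n)
        probs.extend([p] * 2 ** n)         # 2**n distinct strings of length n
    return probs


def zipf_spread(probs):
    """Ratio of the largest to the smallest value of rank * probability.
    Under an exact Zipf law the product is constant and the ratio is 1."""
    ranked = sorted(probs, reverse=True)
    products = [rank * p for rank, p in enumerate(ranked, start=1)]
    return max(products) / min(products)


MAX_LEN = 14
mixed = mixture_probs(MAX_LEN)              # latent, variable length
fixed = [2.0 ** (-MAX_LEN)] * 2 ** MAX_LEN  # one fixed length, no latent variable

print(f"total probability (mixture): {sum(mixed):.6f}")
print(f"spread with latent length:   {zipf_spread(mixed):.2f}")
print(f"spread with fixed length:    {zipf_spread(fixed):.2f}")
```

In this toy model, each fixed-length component is uniform and therefore far from Zipfian: rank × probability varies by a factor of 2^14 across ranks. Once the length is mixed over, the same product stays within a factor of 4 over more than four orders of magnitude in rank, which is the approximate inverse-rank behaviour the abstract attributes to latent variables.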

Suggested Citation

  • Laurence Aitchison & Nicola Corradi & Peter E Latham, 2016. "Zipf’s Law Arises Naturally When There Are Underlying, Unobserved Variables," PLOS Computational Biology, Public Library of Science, vol. 12(12), pages 1-32, December.
  • Handle: RePEc:plo:pcbi00:1005110
    DOI: 10.1371/journal.pcbi.1005110

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005110
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005110&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1005110?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Xavier Gabaix & Parameswaran Gopikrishnan & Vasiliki Plerou & H. Eugene Stanley, 2003. "A theory of power-law distributions in financial market fluctuations," Nature, Nature, vol. 423(6937), pages 267-270, May.

    Citations

    Cited by:

    1. Jan Humplik & Gašper Tkačik, 2017. "Probabilistic models for neural populations that naturally capture global coupling and criticality," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-26, September.
    2. Patrick Erik Bradley & Martin Behnisch, 2019. "Heavy-tailed distributions for building stock data," Environment and Planning B, vol. 46(7), pages 1281-1296, September.
    3. Mark L Ioffe & Michael J Berry II, 2017. "The structured ‘low temperature’ phase of the retinal population code," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-31, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Abduraimova, Kumushoy, 2022. "Contagion and tail risk in complex financial networks," Journal of Banking & Finance, Elsevier, vol. 143(C).
    2. Jean-Philippe Bouchaud & Julien Kockelkoren & Marc Potters, 2006. "Random walks, liquidity molasses and critical response in financial markets," Quantitative Finance, Taylor & Francis Journals, vol. 6(2), pages 115-123.
    3. Juan C. Henao-Londono & Sebastian M. Krause & Thomas Guhr, 2021. "Price response functions and spread impact in correlated financial markets," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 94(4), pages 1-20, April.
    4. Igor Fedotenkov, 2020. "A Review of More than One Hundred Pareto-Tail Index Estimators," Statistica, Department of Statistics, University of Bologna, vol. 80(3), pages 245-299.
    5. Grzesiek, Aleksandra & Połoczański, Rafał & Kumar, Arun & Wyłomańska, Agnieszka, 2021. "Moment-based estimation for parameters of general inverse subordinator," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 575(C).
    6. Giorgio Fagiolo & Mauro Napoletano & Andrea Roventini, 2008. "Are output growth-rate distributions fat-tailed? some evidence from OECD countries," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 23(5), pages 639-669.
    7. Xiao, Di & Wang, Jun, 2021. "Attitude interaction for financial price behaviours by contact system with small-world network topology," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 572(C).
    8. Peng Yue & Qing Cai & Wanfeng Yan & Wei-Xing Zhou, 2020. "Information flow networks of Chinese stock market sectors," Papers 2004.08759, arXiv.org.
    9. Javier Morales & Víctor Tercero & Fernando Camacho & Eduardo Cordero & Luis López & F-Javier Almaguer, 2014. "Trend and Fractality Assessment of Mexico's Stock Exchange," Papers 1411.3399, arXiv.org.
    10. Jørgen Vitting Andersen & Ioannis Vrontos & Petros Dellaportas & Serge Galam, 2014. "Communication impacting financial markets," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-00982959, HAL.
    11. Jovanovic, Franck & Schinckus, Christophe, 2016. "Breaking down the barriers between econophysics and financial economics," International Review of Financial Analysis, Elsevier, vol. 47(C), pages 256-266.
    12. Claudia Canals & Xavier Gabaix & Josep M. Vilarrubia & David Weinstein, 2007. "Trade patterns, trade balances and idiosyncratic shocks," Working Papers 0721, Banco de España.
    13. Nobi, Ashadun & Maeng, Seong Eun & Ha, Gyeong Gyun & Lee, Jae Woo, 2014. "Effects of global financial crisis on network structure in a local stock market," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 407(C), pages 135-143.
    14. Eisenberg, Larry, 2011. "Destabilizing properties of a VaR or probability-of-ruin constraint when variances may be infinite," Journal of Financial Stability, Elsevier, vol. 7(1), pages 10-18, January.
    15. Erindi Allaj, 2014. "Risk measuring under liquidity risk," Papers 1412.6745, arXiv.org.
    16. J. Doyne Farmer & Austin Gerig & Fabrizio Lillo & Henri Waelbroeck, 2013. "How efficiency shapes market impact," Quantitative Finance, Taylor & Francis Journals, vol. 13(11), pages 1743-1758, November.
    17. Philipp Weber & Bernd Rosenow, 2006. "Large stock price changes: volume or liquidity?," Quantitative Finance, Taylor & Francis Journals, vol. 6(1), pages 7-14.
    18. Xavier Gabaix & Augustin Landier, 2008. "Why has CEO Pay Increased So Much?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 123(1), pages 49-100.
    19. Changtai Li & Weihong Huang & Wei-Siang Wang & Wai-Mun Chia, 2023. "Price Change and Trading Volume: Behavioral Heterogeneity in Stock Market," Computational Economics, Springer;Society for Computational Economics, vol. 61(2), pages 677-713, February.
    20. Andrew Balthrop, 2016. "Power laws in oil and natural gas production," Empirical Economics, Springer, vol. 51(4), pages 1521-1539, December.
