
Benford’s law and first letter of words

Author

Listed:
  • Yan, Xiaoyong
  • Yang, Seong-Gyu
  • Kim, Beom Jun
  • Minnhagen, Petter

Abstract

A universal First-Letter Law (FLL) is derived and described. It predicts the percentages of first letters for words in novels. The FLL is akin to Benford’s law (BL) of first digits, which predicts the percentages of first digits in a data collection of numbers. Both are universal in the sense that FLL depends only on the number of letters in the alphabet, whereas BL depends only on the number of digits in the base of the number system. The existence of these types of universal laws appears counter-intuitive. Nonetheless, both describe data very well. Relations to some earlier works are given. FLL predicts that an English author on average starts about 16 out of 100 words with the English letter ‘t’. This is corroborated by data, even though an author can freely write anything. Fuller implications and the applicability of FLL remain for the future.
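
To make the two quantities in the abstract concrete, here is a minimal Python sketch. It assumes the standard statement of Benford's law, P(d) = log_b(1 + 1/d) for first digit d in base b, and pairs it with a simple counter for measuring first-letter percentages in a text. The abstract does not state the FLL formula, so the sketch does not attempt to reproduce it; the function names `benford_first_digit_probs` and `first_letter_shares` are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def benford_first_digit_probs(base: int = 10) -> dict[int, float]:
    """Benford's law: P(d) = log_base(1 + 1/d) for d = 1 .. base - 1.

    The only input is the base, which is the sense in which the law
    depends only on the number of digits in the number system.
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}


def first_letter_shares(text: str) -> dict[str, float]:
    """Empirical share of words that start with each letter.

    Assumption for illustration: a word is a run of ASCII letters a-z,
    optionally joined by apostrophes (so "don't" counts as one word).
    """
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    if not words:
        return {}
    counts = Counter(word[0] for word in words)
    return {letter: n / len(words) for letter, n in counts.items()}


if __name__ == "__main__":
    # Predicted first-digit percentages in base 10.
    for digit, p in benford_first_digit_probs(10).items():
        print(f"digit {digit}: {100 * p:.1f}%")

    # Measured first-letter percentages for a toy sentence.
    sample = "The cat took the tea; don't tell."
    for letter, share in sorted(first_letter_shares(sample).items()):
        print(f"letter {letter}: {100 * share:.1f}%")
```

In base 10 the predicted share falls from about 30.1% for the digit 1 to about 4.6% for the digit 9. Running `first_letter_shares` on the full text of a novel would give the measured value to compare with the roughly 16% that the abstract reports for ‘t’.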

Suggested Citation

  • Yan, Xiaoyong & Yang, Seong-Gyu & Kim, Beom Jun & Minnhagen, Petter, 2018. "Benford’s law and first letter of words," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 512(C), pages 305-315.
  • Handle: RePEc:eee:phsmap:v:512:y:2018:i:c:p:305-315
    DOI: 10.1016/j.physa.2018.08.133

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437118310768
    Download Restriction: Full text for ScienceDirect subscribers only. The journal offers the option of making the article available online on ScienceDirect for a fee of $3,000.


    References listed on IDEAS

    1. Yan, Xiaoyong & Minnhagen, Petter, 2016. "Randomness versus specifics for word-frequency distributions," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 444(C), pages 828-837.
    2. Yan, Xiaoyong & Minnhagen, Petter, 2018. "The dependence of frequency distributions on multiple meanings of words, codes and signs," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 554-564.
    3. Fewster, R. M., 2009. "A Simple Explanation of Benford's Law," The American Statistician, American Statistical Association, vol. 63(1), pages 26-32.
    4. Pietronero, L. & Tosatti, E. & Tosatti, V. & Vespignani, A., 2001. "Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 293(1), pages 297-304.
    5. Yan, Xiaoyong & Minnhagen, Petter & Jensen, Henrik Jeldtoft, 2016. "The likely determines the unlikely," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 456(C), pages 112-119.
    6. Xiaoyong Yan & Petter Minnhagen, 2015. "Maximum Entropy, Word-Frequency, Chinese Characters, and Multiple Meanings," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-19, May.

    Citations


    Cited by:

    1. Lee, Kang-Bok & Han, Sumin & Jeong, Yeasung, 2020. "COVID-19, flattening the curve, and Benford’s law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 559(C).
    2. Adriano Silva & Sergio Floquet & Ricardo Lima, 2023. "Newcomb–Benford’s Law in Neuromuscular Transmission: Validation in Hyperkalemic Conditions," Stats, MDPI, vol. 6(4), pages 1-19, October.
    3. Jaroslav Petráš & Marek Pavlík & Ján Zbojovský & Ardian Hyseni & Jozef Dudiak, 2023. "Benford’s Law in Electric Distribution Network," Mathematics, MDPI, vol. 11(18), pages 1-27, September.
    4. da Silva, A.J. & Floquet, S. & Santos, D.O.C. & Lima, R.F., 2020. "On the validation of the Newcomb−Benford Law and the Weibull distribution in neuromuscular transmission," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 553(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yan, Xiaoyong & Minnhagen, Petter, 2018. "The dependence of frequency distributions on multiple meanings of words, codes and signs," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 554-564.
    2. Biau, Damien, 2015. "The first-digit frequencies in data of turbulent flows," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 440(C), pages 147-154.
    3. Yan, Xiaoyong & Minnhagen, Petter & Jensen, Henrik Jeldtoft, 2016. "The likely determines the unlikely," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 456(C), pages 112-119.
    4. Matthew A. Cole & David J. Maddison & Liyun Zhang, 2020. "Testing the emission reduction claims of CDM projects using the Benford’s Law," Climatic Change, Springer, vol. 160(3), pages 407-426, June.
    5. Villas-Boas, Sofia B. & Fu, Qiuzi & Judge, George, 2017. "Benford’s law and the FSD distribution of economic behavioral micro data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 486(C), pages 711-719.
    6. Louie Rivers & Tamara Dempsey & Jade Mitchell & Carole Gibbs, 2015. "Environmental Regulation and Enforcement: Structures, Processes and the Use of Data for Fraud Detection," Journal of Environmental Assessment Policy and Management (JEAPM), World Scientific Publishing Co. Pte. Ltd., vol. 17(04), pages 1-29, December.
    7. Hürlimann, Werner, 2015. "On the uniform random upper bound family of first significant digit distributions," Journal of Informetrics, Elsevier, vol. 9(2), pages 349-358.
    8. Sitsofe Tsagbey & Miguel de Carvalho & Garritt L. Page, 2017. "All Data are Wrong, but Some are Useful? Advocating the Need for Data Auditing," The American Statistician, Taylor & Francis Journals, vol. 71(3), pages 231-235, July.
    9. Carlos Velarde & Alberto Robledo, 2017. "Rank distributions: Frequency vs. magnitude," PLOS ONE, Public Library of Science, vol. 12(10), pages 1-13, October.
    10. Gottwald, Georg A. & Nicol, Matthew, 2002. "On the nature of Benford's Law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 303(3), pages 387-396.
    11. Tariq Ahmad Mir, 2012. "The leading digit distribution of the worldwide Illicit Financial Flows," Papers 1201.3432, arXiv.org, revised Nov 2012.
    12. Clippe, Paulette & Ausloos, Marcel, 2012. "Benford’s law and Theil transform of financial data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 391(24), pages 6556-6567.
    13. Baumgartner, Tim & Güttler, André, 2022. "Bitcoin flash crash on May 19, 2021: What did really happen on Binance?," IWH Discussion Papers 25/2022, Halle Institute for Economic Research (IWH).
    14. Lee, Joanne & Cho, Wendy K. Tam & Judge, George G., 2010. "Stigler's approach to recovering the distribution of first significant digits in natural data sets," Statistics & Probability Letters, Elsevier, vol. 80(2), pages 82-88, January.
    15. Montag, Josef, 2017. "Identifying odometer fraud in used car market data," Transport Policy, Elsevier, vol. 60(C), pages 10-23.
    16. Adriano Silva & Sergio Floquet & Ricardo Lima, 2023. "Newcomb–Benford’s Law in Neuromuscular Transmission: Validation in Hyperkalemic Conditions," Stats, MDPI, vol. 6(4), pages 1-19, October.
    17. David Giles, 2007. "Benford's law and naturally occurring prices in certain ebaY auctions," Applied Economics Letters, Taylor & Francis Journals, vol. 14(3), pages 157-161.
    18. Diego Jara & Felipe Parra & Alvaro Riascos & Mauricio Romero, 2011. "Análisis digital y detección de elecciones atípicas," Documentos CEDE 9064, Universidad de los Andes, Facultad de Economía, CEDE.
    19. Holz, Carsten A., 2014. "The quality of China's GDP statistics," China Economic Review, Elsevier, vol. 30(C), pages 309-338.
    20. Mir, T.A., 2014. "The Benford law behavior of the religious activity data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 408(C), pages 1-9.
