Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf's Law Revisited
Abstract
Zipf's law states that the frequency of a word is a power function of its rank, with an exponent usually accepted to be close to (-)1. Three observations show that word frequency as a function of rank instead follows two different exponents, approximately (-)1 in the first regime and approximately (-)2 in the second: large deviations between the predicted and actual number of different words in a text, disagreements between the predicted and actual exponent of the probability density function, and statistics from a large corpus. The implications of this change in exponents for the metrics of texts and for the origins of complex lexicons are analyzed.
Bibliographic Info
Paper provided by Santa Fe Institute in its series Working Papers with number 00-12-068.
Date of creation: Dec 2000
Contact details of provider:
Postal: 1399 Hyde Park Road, Santa Fe, New Mexico 87501
Web page: http://www.santafe.edu/sfi/publications/working-papers.html