Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf's Law Revisited
Zipf's law states that the frequency of a word is a power function of its rank. The exponent of the power law is usually accepted to be close to (-)1. Large deviations between the predicted and actual number of different words in a text, disagreement between the predicted and actual exponent of the probability density function, and statistics on a large corpus make evident that word frequency as a function of rank follows two different exponents: ≈ (-)1 in the first regime and ≈ (-)2 in the second. The implications of this change in exponents for the metrics of texts and for the origins of complex lexicons are analyzed.
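The two-regime claim in the abstract can be illustrated with a minimal sketch. It assumes a piecewise power law that is continuous at a crossover rank. The function names, the `breakpoint` and `scale` parameters, and the choice of continuity at the crossover are assumptions made for illustration. The abstract gives only the approximate exponents and no crossover rank.

```python
import math


def two_regime_frequency(rank, breakpoint, alpha1=1.0, alpha2=2.0, scale=1.0):
    """Unnormalized word frequency at a given rank under two power-law regimes.

    `breakpoint` (the crossover rank) and `scale` are hypothetical parameters;
    only the approximate exponents (1 and 2) come from the abstract.
    """
    if rank < 1:
        raise ValueError("rank must be >= 1")
    if rank <= breakpoint:
        # First regime: frequency falls off as rank ** (-alpha1).
        return scale * rank ** (-alpha1)
    # Second regime: frequency falls off as rank ** (-alpha2). The prefactor
    # makes both branches agree at the breakpoint, so the curve is continuous.
    return scale * breakpoint ** (alpha2 - alpha1) * rank ** (-alpha2)


def log_log_slope(rank_a, rank_b, breakpoint):
    """Slope of log(frequency) against log(rank) between two ranks."""
    freq_a = two_regime_frequency(rank_a, breakpoint)
    freq_b = two_regime_frequency(rank_b, breakpoint)
    return (math.log(freq_b) - math.log(freq_a)) / (
        math.log(rank_b) - math.log(rank_a)
    )
```

With an arbitrary crossover rank of 1000, `log_log_slope(10, 100, 1000)` recovers the first exponent and `log_log_slope(10_000, 100_000, 1000)` recovers the second, up to floating-point error. On a log-log plot this corresponds to a straight line that bends to a steeper slope past the crossover.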
Date of creation: Dec 2000
Contact details of provider: Postal: 1399 Hyde Park Road, Santa Fe, New Mexico 87501
Web page: http://www.santafe.edu/sfi/publications/working-papers.html
Handle: RePEc:wop:safiwp:00-12-068