Zipf's law and the distribution of cities
Zipf’s law, rather remarkably, applies in many other situations, not just to words in a language. Whenever objects of some kind have a natural ranking by size or frequency, we can ask how those sizes or frequencies are distributed across the ranks.
In this step we look at ranks of cities in a given country, and find that Zipf’s law sometimes applies quite well.
Zipf’s law and populations of cities
The same kind of inverse relation between rank and size that Zipf’s law describes for the frequencies of common words is found in widely different situations. It has been noticed that it also tends to hold when we rank the biggest cities in a country by population: the second largest city has roughly half the population of the largest, the third roughly a third, and so on. Accurate numbers are harder to give here, because the boundaries of cities are somewhat arbitrary. Nevertheless, if we take some standard listing of cities in various countries, what do we find?
Largest cities in the UK
Here is a ranking of the largest cities in the UK by population. One could argue that London is a special case here, as it is much bigger relative to the other cities than Zipf’s law would predict.
| Rank | City | Population |
|---|---|---|
| 1 | London, Capital Region | 8,445,066 |
| 2 | Birmingham, West Midlands | 1,224,136 |
| 6 | Liverpool, Lancashire and Cheshire | 552,267 |
London is clearly an anomaly: it is almost seven times the size of Birmingham, where Zipf’s law would predict a factor of about two. After London there is a spread between the next four or five cities, and then a few are bunched together.
Largest cities in the USA
Here is a ranking of the largest cities in the USA.
| Rank | City | Population |
|---|---|---|
| 1 | New York, N.Y. | 8,491,079 |
| 2 | Los Angeles, Calif. | 3,928,864 |
| 7 | San Antonio, Tex. | 1,436,697 |
| 8 | San Diego, Calif. | 1,381,069 |
| 10 | San Jose, Calif. | 1,015,785 |
Here the distribution looks much more like an inverse relation between rank and population: Los Angeles, at rank 2, is a little under half the size of New York.
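To make the comparison concrete, here is a minimal Python sketch that checks the listed US cities against the simplest form of Zipf’s law, in which the city at rank r has population equal to the largest city’s population divided by r. The populations are copied from the table above; the names `us_cities`, `zipf_prediction` and `compare_to_zipf` are just illustrative choices for this sketch.

```python
# Populations copied from the USA table above (only the ranks listed there).
us_cities = {
    1: ("New York", 8_491_079),
    2: ("Los Angeles", 3_928_864),
    7: ("San Antonio", 1_436_697),
    8: ("San Diego", 1_381_069),
    10: ("San Jose", 1_015_785),
}


def zipf_prediction(largest_population, rank):
    """Population predicted for a rank if size is proportional to 1/rank."""
    return largest_population / rank


def compare_to_zipf(cities):
    """Return (rank, name, actual, predicted, actual/predicted) per city."""
    largest = cities[1][1]
    rows = []
    for rank in sorted(cities):
        name, actual = cities[rank]
        predicted = zipf_prediction(largest, rank)
        rows.append((rank, name, actual, predicted, actual / predicted))
    return rows


for rank, name, actual, predicted, ratio in compare_to_zipf(us_cities):
    print(f"{rank:>2}  {name:<12} actual {actual:>9,}  "
          f"predicted {predicted:>11,.0f}  ratio {ratio:.2f}")
```

A ratio of 1 means the city matches the prediction exactly. On these figures the ratios stay between roughly 0.9 and 1.3, so each listed city is within about 30% of what the 1/rank rule would give.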
Largest cities in Russia
Here is a ranking of the top cities in Russia:
So what conclusions can we draw?
It appears that Zipf’s law is alive and well when applied to American cities: Chicago, for example, is the third largest city at 2.7 million, roughly a third of the size of the biggest city, New York, at 8.5 million. For the UK, the dominance of London tends to obscure the pattern. In Russia, after Moscow and Saint Petersburg there is not much of a spread between the next eight cities. Clearly we need to look at more countries before we can come to any firm conclusions, but there is evidence to suggest that the law is reasonably accurate, at least in some countries.
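One common way to put a number on how well the law fits is to plot the logarithm of population against the logarithm of rank and measure the slope of the best-fit line: under Zipf’s law the points lie on a line of slope -1, so the exponent (the slope with its sign flipped) is 1. The sketch below does this with an ordinary least-squares fit, using only the rows shown in the UK and USA tables in this step. The function name `zipf_exponent` is an illustrative choice, and with so few data points the results are only a rough indication.

```python
import math

# (rank, population) pairs copied from the UK and USA tables in this step.
uk = [(1, 8_445_066), (2, 1_224_136), (6, 552_267)]
usa = [(1, 8_491_079), (2, 3_928_864), (7, 1_436_697),
       (8, 1_381_069), (10, 1_015_785)]


def zipf_exponent(ranked):
    """Least-squares slope of log(population) vs log(rank), sign flipped."""
    xs = [math.log(rank) for rank, _ in ranked]
    ys = [math.log(pop) for _, pop in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


print(f"UK exponent:  {zipf_exponent(uk):.2f}")
print(f"USA exponent: {zipf_exponent(usa):.2f}")
```

On these rows the USA value comes out somewhat below 1 and the UK value well above 1, which is the numerical counterpart of London’s dominance: populations fall away from the top city faster than 1/rank would suggest.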
Zipf’s law in other contexts
The appearance of this distribution in rankings of cities by population was first noticed by Felix Auerbach in 1913. Curiously, the same phenomenon appears to occur in many other rankings, such as TV channels ranked by number of viewers, income distributions, and so on.
Why does Zipf’s law work so well? The answer is not entirely clear. Paul Krugman, the noted economist, remarked:
“The usual complaint about economic theory is that our models are oversimplified — that they offer excessively neat views of complex, messy reality. With Zipf’s law the reverse is true: we have complex, messy models, yet reality is startlingly neat and simple.”
Nevertheless, our small samples of cities suggest that some skepticism is also warranted: the applicability of the law, while wide, may not be as all-embracing as is sometimes claimed.
Does Zipf’s law hold, at least approximately, in your country?
What if we look not at the largest cities in your country, but at its largest businesses or firms? Is there any data to indicate that something like Zipf’s law holds? How about other kinds of data where things, people or organisations are naturally ranked by some parameter?
Please let us know what you find out!
© UNSW Australia 2015