Benford's law
It turns out that in many different contexts, when we look at the first significant digit of numbers appearing in various kinds of data, we find that the digits \(\normalsize{1,2,3,4,5,6,7,8}\) and \(\normalsize{9}\) do not all occur with equal frequency. The smaller digits are much more frequent than the larger ones, and there appears to be a systematic relationship between the size of the digit and its likelihood of occurring as a first digit.
In this step we will

learn about this lopsided aspect to the first digits of numbers appearing in real-life data

see how this law, while not an inverse relationship, is closely connected to \(\normalsize{y=1/x}\)

get an idea how this law can be applied in criminal and economic investigations.
Simon Newcomb’s curious observation, and Frank Benford’s investigations
When we look at numbers that appear in real-life data, we find that the digit \(\normalsize{1}\) occurs much more often than the digit \(\normalsize{9}\) as the first digit: in fact it occurs more than \(\normalsize{6}\) times as often. This was first observed by Simon Newcomb, an American astronomer, in 1881.
The observation was put on the map in 1938 by the American physicist Frank Benford, who checked it against thousands of numbers drawn from a wide range of seemingly unrelated sources, including surface areas of rivers, population tables, physical constants, molecular weights, mathematical handbook entries, and even the numbers contained in an issue of Reader’s Digest.
Relative frequencies of first digits
Here are the relative frequencies with which each of the nine nonzero digits occurs as the first digit of a number:
\(\normalsize{d}\)  \(\normalsize{P(d)}\) 

1  30.1% 
2  17.6% 
3  12.5% 
4  9.7% 
5  7.9% 
6  6.7% 
7  5.8% 
8  5.1% 
9  4.6% 
Q1 (M): Verify that \(\normalsize{P(1)=P(2)+P(3)}\), that \(\normalsize{P(2)=P(4)+P(5)}\), and that \(\normalsize{P(3)=P(6)+P(7)}\). Can you find another such relation? Does this suggest to you anything that we have already looked at in this course?
Q2 (M): Can you find any other expression that allows one to write \(\normalsize{P(1)}\) as a sum of higher \(\normalsize{P(d)}\)’s?
Relation with the \(\normalsize{y=1/x}\) function
While the graph of \(\normalsize{P(d)}\) as a function of \(\normalsize{d}\) might look like an inverse proportionality, it is not one. But remarkably, it is not very far from it!
In the following figure we see the approximate areas of the nine regions of width \(\normalsize{1}\) under \(\normalsize{y=1/x}\) between \(\normalsize{x=1}\) and \(\normalsize{x=10}\). The total area is approximately \(\normalsize{2.303}\).
Now in the following diagram, we see the relative sizes of these areas, where we have normalized by dividing by the total area from \(\normalsize{x=1}\) to \(\normalsize{x=10}\), namely \(\normalsize{2.303}\). Remarkably, we see exactly the numbers appearing in Benford’s law!
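If you would like to check these numbers yourself, here is a minimal Python sketch. It approximates each area under \(\normalsize{y=1/x}\) numerically with the midpoint rule, then divides by the total. The function name and the number of steps are our own illustrative choices, not part of the course material.

```python
def area_under_reciprocal(a, b, steps=10_000):
    """Approximate the area under y = 1/x between a and b (midpoint rule)."""
    width = (b - a) / steps
    # Sum the areas of thin rectangles whose heights are taken at their midpoints
    return sum(width / (a + (i + 0.5) * width) for i in range(steps))

# Areas of the nine strips [1,2], [2,3], ..., [9,10]
strip_areas = [area_under_reciprocal(d, d + 1) for d in range(1, 10)]
total_area = sum(strip_areas)  # approximately 2.303

# Normalize so that the nine relative areas add up to 1
relative_areas = [area / total_area for area in strip_areas]

for d, share in zip(range(1, 10), relative_areas):
    print(d, f"{share:.1%}")
```

Rounded to one decimal place, the printed percentages should agree with the table of first-digit frequencies above.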
Another way of saying this, using the \(\normalsize{\ln\;x}\) function representing the area under \(\normalsize{y=1/x}\) from \(\normalsize{x=1}\) to \(\normalsize{x}\), is that the probability of the digit \(\normalsize{d=1,2,3,\ldots,9}\) is just
\[\Large{P(d)=\frac{\ln (d+1)-\ln\, d}{\ln 10}}.\]
Applications to detective work
Investigators can use Benford’s law to flag data that may have been manipulated or made up. If you cook up the figures in your tax return, chances are your distribution of first digits will be closer to uniform than Benford’s law would suggest. So a computer can sniff out suspect-looking data just by counting digits!
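To give an idea of what "counting digits" might look like, here is a rough Python sketch. It tallies the first digits of a list of numbers and prints them beside the Benford frequencies. The helper names and the sample data (the first 200 powers of 2) are our own assumptions for illustration. A real investigation would use actual records and a proper statistical test rather than comparing by eye.

```python
import math
from collections import Counter

def first_digit(value):
    """Return the first significant (nonzero) digit of a nonzero number."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no first significant digit")

def benford_expected(d):
    """Benford frequency of first digit d: (ln(d+1) - ln d) / ln 10."""
    return (math.log(d + 1) - math.log(d)) / math.log(10)

def digit_frequencies(values):
    """Relative frequency of each first digit 1..9 among the nonzero values."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Illustrative data only: the powers of 2 from 2^1 to 2^200
sample = [2 ** n for n in range(1, 201)]
observed = digit_frequencies(sample)

for d in range(1, 10):
    print(d, f"observed {observed[d]:.1%}", f"Benford {benford_expected(d):.1%}")
```

To try it on your own data, replace `sample` with any list of nonzero numbers, such as populations or invoice amounts.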
Applications to modern geopolitics
Here is the abstract from the paper Fact and Fiction in EU-Governmental Economic Data, published in the German Economic Review in 2011.
To detect manipulations or fraud in accounting data, auditors have successfully used Benford’s law as part of their fraud detection processes. Benford’s law proposes a distribution for first digits of numbers in naturally occurring data. Government accounting and statistics are similar in nature to financial accounting. In the European Union (EU), there is pressure to comply with the Stability and Growth Pact criteria. Therefore, like firms, governments might try to make their economic situation seem better. In this paper, we use a Benford test to investigate the quality of macroeconomic data relevant to the deficit criteria reported to Eurostat by the EU member states. We find that the data reported by Greece shows the greatest deviation from Benford’s law among all euro states.
Fact and Fiction in EU-Governmental Economic Data (German Economic Review 12(3): 243–255)
Discussion
What do you think about Benford’s law? You might like to have a look at some data from some books, or the internet, and make a little tally, and tell us if Benford’s law seems to hold.
Answers
A1. It is also true that \(\normalsize{P(4)=P(8)+P(9)}\). This reminds us of the dilation property of areas under the graph of \(\normalsize{y=1/x}\).
A2. How about \(\normalsize{P(1)=P(4)+P(5)+P(6)+P(7)}\)?
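Both answers can also be checked without the table, as a short worked derivation. We use only the formula \(\normalsize{P(d)=(\ln(d+1)-\ln d)/\ln 10}\) and the fact that a difference of logarithms depends only on the ratio of the two numbers. For \(\normalsize{d=1,2,3,4}\), the middle term \(\normalsize{\ln(2d+1)}\) cancels in the sum, so
\[\Large{P(2d)+P(2d+1)=\frac{\ln (2d+2)-\ln (2d)}{\ln 10}=\frac{\ln (d+1)-\ln\, d}{\ln 10}=P(d)}.\]
This gives the three relations in Q1 as well as the one in A1. In the same way, the intermediate terms cancel in the longer sum of A2:
\[\Large{P(4)+P(5)+P(6)+P(7)=\frac{\ln 8-\ln 4}{\ln 10}=\frac{\ln 2-\ln 1}{\ln 10}=P(1)}.\]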
© UNSW Australia 2015