Skip main navigation

Dimensional Measures

What is the best way to measure distance? In this article, we discuss several methods.

You might have seen Euclidean distance d(x,y)= √((∑(x-y )^2 )) in high school math. It is summing up the squares of two points, after then square rooting the summation. For example, in the XY plane, the distance between x=(1,2) and y=(3,4) is √(〖(1-3)〗^2 + 〖(2-4)〗^2 ) = √8 which is roughly 2.828. We call this kind of distance measure the Euclidean distance. Euclidean distance is probably the most commonly used distance. However, that does not mean it is the only way to measure distance.

There are other ways to measure distance: Manhattan distance, Minkowski distance, standardization distance are some of those. First of all, the Manhattan distance uses absolute values instead of the sum of squares. Let’s use our example to explain this. If we use Manhattan distance instead of Euclidean distance, we will have -(1-3)+(-(2-4)) = 4.

Do you know that the square root of x ( √x ) is, by definition, (x)^(1/2)? (x)^(1/2) is the same expression for √x. The square of (x)^(1/2) is x; it is the definition of the square root. Minkowski distance is something similar, but in Minkowski distance, the exponent is not necessarily 1/2. It can be any number. That is why the exponent is generalized to 1/m in Minkowski distance. m can be any natural number. It can be 1,2,3,4,5,.. and so on.

Standardization distance uses standard deviation. Standard deviation is a measure of how far away a point is from the average of all data. If the standard deviation is high, that means some of the data are far away from the mean. Standardization distance divides Euclidean distance by standard deviation (we denoted using si). It allows us to avoid skewness arising from the difference in variance and scale.

Mahalanobis distance is similar to standardization distance except that Mahalanobis distance also uses correlation along with standard deviation. Correlation measures how closely related the attributes are. You can obtain the correlation by dividing the covariance of two variables with two standard deviations from each variable. The higher the correlation, the more closely related the variables are. Mahalanobis, the renowned statistician, wanted to consider correlation when measuring distance.

This article is from the free online

Artificial Intelligence and Machine Learning for Business

Created by
FutureLearn - Learning For Life

Our purpose is to transform access to education.

We offer a diverse selection of courses from leading universities and cultural institutions from around the world. These are delivered one step at a time, and are accessible on mobile, tablet and desktop, so you can fit learning around your life.

We believe learning should be an enjoyable, social experience, so our courses offer the opportunity to discuss what you’re learning with others as you go, helping you make fresh discoveries and form new ideas.
You can unlock new opportunities with unlimited access to hundreds of online short courses for a year by subscribing to our Unlimited package. Build your knowledge with top universities and organisations.

Learn more about how FutureLearn is transforming access to education