Skip to 0 minutes and 6 secondsWelcome to this video lecture on disease mapping and spatial smoothing. In the coming section, I will explain a bit more about the concept of spatial autocorrelation and how it can be used to quantify disease clustering.

Skip to 0 minutes and 20 secondsThe concept of spatial autocorrelation is directly derived from the first law of geography, which states that everything is related to everything else, but near things are more related than distant things. This is the most important concept in geography and in spatial analysis.

Skip to 0 minutes and 37 secondsSo what is it that we actually mean when we say things are spatially autocorrelated? Well, spatial autocorrelation expresses the amount of spatial dependence between areas. It defines how much proximity matters in spatial data. And this is measured as a correlation in space. In other words, the value of a variable in one location is correlated with its values in nearby places. Hence the definition: spatial autocorrelation is a variable correlated with itself. It is autocorrelated in space.

Skip to 1 minute and 13 secondsTo show you exactly how spatial autocorrelation is measured, I want to use the following example. Here, you see a map of the Indonesian island of Java divided up into districts. For each district, the standardized tuberculosis suspect rate is shown. Dark red colors indicate high TB suspect rates, while blue colors indicate low TB suspect rates. This plot shows the same data, with on the x-axis the number of TB suspects within a certain district and on the y-axis the average number of TB suspects in its neighboring districts. The line indicates the relation between the number of TB suspects in a certain area and in its surroundings.

Skip to 2 minutes and 0 secondsThe strength of the association between the number of TB suspects in a certain area and the number in its surroundings quantifies the strength of autocorrelation. Here, areas with low TB suspect rates are surrounded by areas which also have low TB suspect rates. And equivalently, areas with high TB suspect rates are surrounded by other areas which have high TB suspect rates. This correlation is what we mean by spatial autocorrelation.
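To make the idea of a correlation in space concrete, here is a minimal Python sketch of one common way to quantify it, the global Moran's I. It is not taken from the lecture: the districts-in-a-line layout, the helper names `chain_weights` and `morans_i`, and the rates are all hypothetical, and real analyses would derive the neighbor weights from the district map.

```python
import numpy as np

def chain_weights(n):
    """Row-standardized neighbor weights for n districts lying in a line.

    Hypothetical layout: district i borders districts i-1 and i+1 only.
    Each row sums to 1, so weights @ x gives the neighbor average of x.
    """
    adjacency = np.zeros((n, n))
    for i in range(n - 1):
        adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
    return adjacency / adjacency.sum(axis=1, keepdims=True)

def morans_i(values, weights):
    """Global Moran's I, assuming the weights are row-standardized.

    This equals the slope of the line in a plot of the neighbor average
    (y-axis) against the district's own value (x-axis).
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()          # deviation of each district from the overall mean
    lag = weights @ z         # average deviation among each district's neighbors
    return float((z @ lag) / (z @ z))

# Made-up rates: three low districts next to three high districts.
clustered = [10, 12, 11, 30, 32, 31]
# Made-up rates: low and high districts alternate.
alternating = [10, 30, 10, 30, 10, 30]

w = chain_weights(6)
print(round(morans_i(clustered, w), 3))    # about 0.683: similar values sit together
print(round(morans_i(alternating, w), 3))  # -1.0: neighbors are always dissimilar
```

Values near +1 indicate that similar rates cluster together, values near 0 indicate no spatial pattern, and negative values indicate that neighbors tend to differ.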

Skip to 2 minutes and 31 secondsA plot similar to the one shown in the previous slide can be very informative to detect spatial autocorrelation. Here, the plot has been subdivided into four quadrants. In the upper right side of this plot fall those points which represent areas which have high rates and are surrounded by areas which also have high rates. In the lower left side of the plot fall those areas which have low rates and are surrounded by low rates. These two quadrants are indicative of spatial clustering of high and low values. In contrast, on the upper left side fall those points which have low rates but are surrounded by areas with high rates.

Skip to 3 minutes and 11 secondsThese areas are called spatial low outliers, as they are lower than we might expect based on what we find in their direct surroundings. Equivalently, on the lower right side, we find points which have high rates and are surrounded by areas with low rates. These are called spatial high outliers.

Skip to 3 minutes and 32 secondsSo now comes the following question. What would we expect to happen to the spatial pattern if we start aggregating data? So here we see again the map of Indonesia divided into districts, each district showing the standardized TB suspect rate. Now what if we aggregate these data up to province? Would we expect the pattern still to exist? Doing so shows that the spatial clustering is averaged out when the data are aggregated; it is not present anymore. Hence, aggregating data into larger areas might conceal spatial patterns which occur at lower levels.

Skip to 4 minutes and 16 secondsThe phenomenon shown on the previous slide is a basic problem when using areal data, which is a question of scaling. Here, the spatial definition of the frontiers, or outlines, of areas will impact the actual results. This means that different results will be obtained by just changing the outlines of these zones. This problem is known as the modifiable areal unit problem, and is well described in scientific literature. It goes beyond the scope of this course.
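As a rough illustration of how zone outlines alone can change the result, the Python sketch below aggregates the same made-up district rates under two hypothetical zonings. The helper name `aggregate`, the rates, and the zone assignments are assumptions for illustration only, and the unweighted mean stands in for what would normally be a population-weighted rate.

```python
import numpy as np

def aggregate(rates, zone_of_district):
    """Average district rates within each zone (unweighted, for illustration)."""
    rates = np.asarray(rates, dtype=float)
    zones = np.asarray(zone_of_district)
    return {int(zone): float(rates[zones == zone].mean()) for zone in np.unique(zones)}

# Made-up rates for eight districts in a row: low and high pairs alternate.
district_rates = [2, 2, 8, 8, 2, 2, 8, 8]

# Two hypothetical ways of drawing larger zones over the same districts.
two_large_zones = [0, 0, 0, 0, 1, 1, 1, 1]
four_small_zones = [0, 0, 1, 1, 2, 2, 3, 3]

print(aggregate(district_rates, two_large_zones))   # both zones average to 5.0
print(aggregate(district_rates, four_small_zones))  # zones keep the 2.0 / 8.0 contrast
```

With two large zones the contrast between low and high districts disappears completely, while the four smaller zones preserve it, even though the underlying district data are identical.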

Skip to 4 minutes and 49 secondsHere, we have come to the end of this video lecture. In the remainder of this course, you will practice some of the concepts and methods presented in this video lecture by using third-party software called GeoDa. I hope you've enjoyed this video lecture. Thank you and bye-bye.

# Moran's plot

The concept of spatial autocorrelation was introduced in week 3. Please refer back to section 3.14 if you want to refresh your memory. Dr. Ente Rood from KIT also explains the concept of spatial autocorrelation and shows how it can be used to quantify disease clustering.

The first law of geography states that everything is related to everything else, but near things are more related than distant things. This is the most important concept in geography and in spatial analysis. Spatial autocorrelation expresses the amount of spatial dependence between areas.

A Moran’s plot can be very informative to detect spatial autocorrelation and thus clustering. In a Moran’s plot the average rate of the neighbors is plotted against the local observed rate. The plot can be subdivided into four quadrants. In the upper right side of the plot fall those points which represent areas which have high rates and are surrounded by areas which also have high rates. In the lower left side of the plot fall those areas which have low rates and are surrounded by low rates. These two quadrants are indicative of spatial clustering of high and low values. In contrast, on the upper left side fall those points which have low rates but are surrounded by areas with high rates. These areas are called spatial low outliers, as they are lower than we might expect based on what we find in their direct surroundings. Accordingly, on the lower right side we find points which have high rates and are surrounded by areas with low rates. These are called spatial high outliers.
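The quadrant logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the districts are placed in a line so that each one borders its immediate left and right neighbor, the rates are made up, and the helper names `chain_weights` and `moran_quadrants` are hypothetical. "High" and "low" are taken relative to the overall mean rate, and a value exactly at the mean is counted as high.

```python
import numpy as np

def chain_weights(n):
    """Row-standardized neighbor weights for n districts lying in a line."""
    adjacency = np.zeros((n, n))
    for i in range(n - 1):
        adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
    return adjacency / adjacency.sum(axis=1, keepdims=True)

def moran_quadrants(rates, weights):
    """Label each district by its quadrant in a Moran's plot."""
    z = np.asarray(rates, dtype=float)
    z = z - z.mean()      # local rate relative to the overall mean (x-axis)
    lag = weights @ z     # neighbor average relative to the mean (y-axis)
    labels = []
    for own, neighbors in zip(z, lag):
        if own >= 0 and neighbors >= 0:
            labels.append("high-high cluster")      # upper right
        elif own < 0 and neighbors < 0:
            labels.append("low-low cluster")        # lower left
        elif own < 0:
            labels.append("spatial low outlier")    # upper left
        else:
            labels.append("spatial high outlier")   # lower right
    return labels

# Made-up rates: a low cluster containing one raised district (index 2)
# followed by a high cluster containing one lowered district (index 7).
rates = [10, 10, 25, 10, 10, 28, 32, 15, 30, 30]

for district, label in enumerate(moran_quadrants(rates, chain_weights(len(rates)))):
    print(district, rates[district], label)
```

In this toy example district 2 is flagged as a spatial high outlier and district 7 as a spatial low outlier, while the remaining districts fall into the low-low or high-high quadrants.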

© KIT (Royal Tropical Institute), Amsterdam, the Netherlands