Skip to 0 minutes and 9 seconds Now that you know about machine learning and data mining, we will zoom in on self-organizing maps, and I’m going to tell you more about how self-organizing maps work. Self-organizing maps were invented by Kohonen in the early 1980s. They are a type of artificial neural network, and they are normally applied in an unsupervised way. What I will try to do is apply self-organizing maps for clustering.
Skip to 0 minutes and 46 seconds The word map doesn’t actually refer to a real GIS type of map. The map in self-organizing map is a lattice, a lattice of neurons like you see here. This is the lattice and these are the different neurons.
Skip to 1 minute and 8 seconds Now when we start the process of using this self-organizing map, what we’ll do is fill every neuron with a vector of weights. Such a vector of weights can have any length and is simply a vector containing different values. Maybe they are different variables that you have for a certain geographic location. But it can also be a time series, say, the number of measles cases in each of your health centers.
Skip to 1 minute and 50 seconds The first thing you have to understand about self-organizing maps is that they are applied in two different modes: training mode and mapping mode. I’ll first tell you how training mode works. The first step in the training is that we fill each of these neurons with a random sample from our training data set. And you see the training data set here; perhaps these are my health centers, and this is the time series. I randomly take one and put it into each of these neurons. What I’m going to get is this: each neuron contains a certain time series.
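The initialization step described above could be sketched as follows. This is a minimal illustration, not the authors' implementation; the training set, lattice size, and vector length are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical training set: 20 "health centers", each a time series of length 25.
training_data = rng.random((20, 25))

# A 4x5 lattice of neurons; each neuron will hold one weight vector of length 25.
rows, cols = 4, 5
vec_len = training_data.shape[1]

# Fill every neuron with a random sample drawn from the training set.
idx = rng.integers(0, len(training_data), size=rows * cols)
lattice = training_data[idx].reshape(rows, cols, vec_len)
```

After this step every neuron holds one of the training-set time series, which is exactly the starting point for the iterative training described next.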
Skip to 2 minutes and 43 seconds Now the next step in the training is that I, again, present the vectors from my training set to this lattice of neurons in order to train it. That is done as follows. I present one of my vectors, and the self-organizing map will identify the best matching unit, the neuron whose vector most closely matches my sample.
Skip to 3 minutes and 19 seconds The next step is that that particular vector is adjusted. You see that this vector now has a tiny little bump. It doesn’t only adjust the vector in the best matching neuron, but it also applies this transformation to the vectors in the surrounding neurons. So what you will see is that initially it will, for example, adjust all the vectors in this area, but during the process that area will become smaller and smaller. That is because we do not perform this only once; the training is an iterative process.
Skip to 4 minutes and 11 seconds Every time I take a sample from my training set and present it to my lattice, it will identify the best matching unit and adjust it, but it will also adjust the surrounding area. And as my number of iterations grows, that area becomes smaller.
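The iterative training loop described above, with a best matching unit and a shrinking neighborhood, could be sketched like this. The data, lattice size, number of iterations, and the Gaussian neighborhood with linearly decaying radius and learning rate are all assumptions for illustration; the video does not specify these details.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
training_data = rng.random((20, 25))   # hypothetical time series, one per health center
rows, cols, vec_len = 4, 5, 25

# Initialize neurons with random training samples (the first training step).
lattice = training_data[rng.integers(0, 20, size=rows * cols)].reshape(rows, cols, vec_len)

# Grid coordinates of every neuron, used to measure lattice distance to the BMU.
coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

n_iter = 500
sigma0, lr0 = 2.0, 0.5                 # assumed initial neighborhood radius / learning rate
for t in range(n_iter):
    sample = training_data[rng.integers(0, 20)]
    # Best matching unit: the neuron whose weight vector is closest to the sample.
    dists = np.linalg.norm(lattice - sample, axis=2)
    bmu = np.unravel_index(np.argmin(dists), (rows, cols))
    # Neighborhood radius and learning rate shrink as the number of iterations grows.
    frac = t / n_iter
    sigma = sigma0 * (1.0 - frac) + 0.1
    lr = lr0 * (1.0 - frac) + 0.01
    grid_dist2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
    h = np.exp(-grid_dist2 / (2 * sigma**2))   # Gaussian neighborhood around the BMU
    # Nudge the BMU and its neighbors toward the presented sample.
    lattice += lr * h[:, :, None] * (sample - lattice)
```

Because the update pulls the whole neighborhood toward each sample, neurons that are close together on the lattice end up with similar weight vectors, which is the property discussed next.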
Skip to 4 minutes and 36 seconds Now after the training process I have a trained lattice, and that might look something like you see here. What you see is that every neuron contains such a vector of weights, and that neurons that are close together have very similar vectors. That is because the training process doesn’t only adjust the best matching unit, but also its surroundings.
Skip to 5 minutes and 16 seconds Now sometimes, or I must say quite often, after training the lattice we apply a secondary clustering, grouping similar neurons, or similar vectors, together. That greatly reduces the number of classes that we actually generate. There are different methods to do that; different clustering techniques can be applied, but that is not part of this presentation. OK. We have a trained lattice. We have presented our training data to that lattice and adjusted it to create the trained lattice. Now the next step is the mapping. What we want to do is take a sample data set, present it, and map the data.
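The video leaves the choice of secondary clustering technique open; as one possible illustration, a plain k-means (Lloyd's algorithm) over the neuron weight vectors could look like this. The stand-in "trained lattice", the number of clusters, and the iteration count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
# Stand-in for a trained 4x5 lattice, flattened: 20 neurons, weight vectors of length 25.
neurons = rng.random((20, 25))

# Group similar neurons together with k-means, reducing 20 neurons to k classes.
k = 4
centers = neurons[rng.choice(len(neurons), size=k, replace=False)]
for _ in range(50):
    # Assign every neuron to its nearest cluster center...
    labels = np.argmin(np.linalg.norm(neurons[:, None] - centers, axis=2), axis=1)
    # ...then move each center to the mean of the neurons assigned to it.
    for j in range(k):
        if np.any(labels == j):
            centers[j] = neurons[labels == j].mean(axis=0)
```

Any other clustering method (hierarchical, for instance) could be substituted here; the point is only that the lattice's many neurons are merged into a handful of classes.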
Skip to 6 minutes and 19 seconds The data set could be identical to your training data set, but it can also be a completely different one or perhaps different subsets of your training data.
Skip to 6 minutes and 36 seconds Now when we do that, we take one of the samples, present it, and identify the best matching unit, and that will be the mapping unit. So in this case, what you see is that these neurons have numbers. Numbering goes from 1 here, 7 there, 8, 9, 10, 14, et cetera. So when we say, OK, this is the best matching unit, we know that we can put this vector in the neuron 19 class. After presenting the complete data set, you can see that we can map that data set onto the lattice, and that every sample ends up in its best matching neuron.
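The mapping mode described above could be sketched as follows: each sample gets the number of its best matching neuron as its class. The trained lattice and the mapped data set here are hypothetical stand-ins, with neurons numbered from 1 as in the video.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
rows, cols, vec_len = 4, 5, 25

# Stand-ins for a trained lattice and a data set to be mapped (hypothetical values).
lattice = rng.random((rows, cols, vec_len))
samples = rng.random((8, vec_len))

# Number the neurons 1..rows*cols row by row, and assign each sample the number
# of its best matching unit (the neuron with the closest weight vector).
flat = lattice.reshape(-1, vec_len)
classes = np.array([np.argmin(np.linalg.norm(flat - s, axis=1)) for s in samples]) + 1
```

Note that, unlike training, mapping does not change the lattice: it only reads off the best matching unit for each sample, so the same trained lattice can be reused for the training data itself, for subsets of it, or for an entirely different data set.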
Skip to 7 minutes and 32 seconds Now what we can do, and this is our example of measles in Iceland, is the following. You can see here that we have small dots on this map with numbers attached. These are actually our health centers. For each of them we have a time series ranging from 1946 up to 1970. During this period, there were several epidemic outbreaks. Now what we can do is train the lattice, then present each of these health units individually and map them onto the trained lattice to identify their class. When we do this, several different elements are taken into account. If this were our time series and this were another time series, you can see they have a different amplitude.
Skip to 8 minutes and 38 seconds They can have a different number of measles cases. The different timing in peaks or onset of the epidemic. Or different duration of the peak. All these elements are taken into account. Now what we get then is that we can map this information back to space. We have identified the best matching neuron. You see here that the neuron 12– that is this one– is actually represented in purple. We have only one health center, Reykjavik, maps to this side of our lattice. Now what we also see is that there is quite a large group of green points here. Green points are health centers that are mapped to neurons 8, 9, or 11.
Skip to 9 minutes and 40 seconds And what you see is that they are very close to Reykjavik. So actually, we see that a spatial pattern emerged that we were previously not aware of.
Self-Organizing Maps (SOMs)
In this video you will be introduced to self-organizing maps, or SOMs for short. It explains what SOMs are and how they typically work.
SOMs are a type of artificial neural network. The word “map” doesn’t actually refer to a real GIS type of map; the map in a self-organizing map is a lattice. In this video you will learn about the process of using a self-organizing map. SOMs are typically applied in two different modes: training mode and mapping mode.
The video will also introduce you to the case study used in this in-depth topic: a case study on the spatio-temporal diffusion of measles in Iceland. If you want to learn more about this topic, you can check the article referenced below.
Augustijn, E.-W. and R. Zurita-Milla (2013). “Self-organizing maps as an approach to exploring spatiotemporal diffusion patterns.” International Journal of Health Geographics 12(1): 60.
© University of Twente