Mining for patterns
So now we have a huge pile of data. How do we extract useful information from it? That can be hard when you already know what you are looking for, and it is much harder when you have no clue.
Let me tell you a little about my case study. The problem was a very simple one and, to be honest, the dataset was small and the complexity of the data was very limited. What mattered was that the approach would also work for far larger datasets and more complex situations.
We were looking for a way to compare the spatio-temporal diffusion patterns of measles outbreaks in Iceland. Do epidemics always diffuse in a similar way? If so, that information would be very useful for predicting future outbreaks. Before starting this research we thought that epidemics might differ from one another, but that certain medical districts might always behave in a particular way (early or late infection, short or long infection). That information would also be very useful. It is important to realize that before we started the research we had no idea how we should cluster the data.
We chose this case study (measles in Iceland) for a number of reasons:
- There was a long time series available
- A relatively small dataset makes the development of a method simpler
- Because Iceland is an island, conditions (such as population) have been stable for a long time
Eventually we decided to use a learning algorithm called the self-organizing map (SOM). The following steps explain how the analysis was performed. You can also go to the article and read the full text, but before doing so, check the next step, which introduces you to data mining and learning algorithms.
© University of Twente