Hi, and welcome back. In this lecture, I will show you how you can use the dotted chart to explore event logs. The dotted chart is one of the process mining analysis techniques. And in the evaluation of the results, you get insights on how to process your data further. So what is a dotted chart? Well, it’s mainly dots. And each dot represents a single event. For instance, an event that happened for transportation card 1337 on January 1. The position of this dot is usually determined by the card, or the trace, for which it happened and by the time at which it happened. So the y-axis is the trace, so every line is a trace. And the x-axis is the time.
And since this is card 1337, on January 1, the position is determined like this. And of course, other cards and other times are also represented on the axes. So for instance, if you had a card 2000, for which an event was recorded on January 2, it would be positioned here. A dotted chart, therefore, consists of many dots, each dot representing one event. And the position is usually determined by the trace and the time at which the event happened. And this gives you insight into all the events in the event log in a helicopter view. So now let’s switch to the ProM Lite tool, and I’ll show you how it works on real data.
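As a rough illustration of how each dot gets its position, here is a minimal Python sketch. It is not how ProM computes the chart; the year, the times of day, and the activity names are assumptions made up for the example, and only the card numbers and the two dates come from the lecture.

```python
from datetime import datetime

# Hypothetical event log: (trace id, activity, timestamp).
# Card numbers and dates echo the lecture; year, times and activities are invented.
events = [
    ("1337", "check in", datetime(2024, 1, 1, 8, 30)),
    ("1337", "check out", datetime(2024, 1, 1, 9, 5)),
    ("2000", "check in", datetime(2024, 1, 2, 7, 45)),
]

def dot_positions(events):
    """Map each event to (x, y, activity): x = timestamp, y = row of its trace."""
    rows = {}  # trace id -> row index, in order of first appearance
    dots = []
    for trace_id, activity, timestamp in events:
        y = rows.setdefault(trace_id, len(rows))
        dots.append((timestamp, y, activity))
    return dots

dots = dot_positions(events)
for x, y, activity in dots:
    print(f"row {y}  {x:%Y-%m-%d %H:%M}  {activity}")
```

Both events of card 1337 end up on the same row, and the event of card 2000 gets its own row further along the time axis.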
So with ProM Lite open, let’s import the artificial loan example process to explain the basic concepts of the dotted chart. So let’s execute the dotted chart plugin and see what it gives.
Here you see the initial visualization of the dotted chart. And it’s always recommended to set the color attribute to the concept name. And now you already see nice color patterns appearing. And for instance, you can zoom in by pressing the left mouse button and drawing a rectangle, and when you release it, you zoom in, and you see a nice sequence of events for a particular trace. On the top right, you can expand the legend. And you can see, for instance, that dark blue is register of application. And every case starts with this one. And you can inspect the cases further. When you left click and draw a rectangle to the upper left, you zoom out to the standard view again.
And you can make several observations, for instance about the arrival rate, which shows as a diagonal line, so it is relatively constant. And also that cases are of relatively short duration. However, you can also look at this from another perspective. For instance, you can change the x-axis to show not the real time but the time since the case was started. Then you can investigate case durations. Another thing you need to do is sort the traces by their duration. And then you get the short cases on top and the longer cases at the bottom. And this is the usual shape that you get. So now I’ve quickly explained the basics of a dotted chart.
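The two settings just described, time since case start on the x-axis and traces sorted by duration, can be sketched in a few lines of Python. This is only an illustration under assumptions: the log structure, the case names, and the timestamps are hypothetical, and it does not reproduce the plugin's own implementation.

```python
from datetime import datetime, timedelta

# Hypothetical log: case id -> list of (activity, timestamp).
log = {
    "case A": [("register application", datetime(2024, 1, 1, 9, 0)),
               ("decide", datetime(2024, 1, 4, 9, 0))],
    "case B": [("register application", datetime(2024, 1, 2, 10, 0)),
               ("decide", datetime(2024, 1, 2, 12, 0))],
}

def relative_rows(log):
    """Return (case id, duration, offsets since case start), shortest case first."""
    rows = []
    for case_id, events in log.items():
        times = sorted(t for _, t in events)
        start = times[0]
        offsets = [t - start for t in times]  # x-coordinates relative to case start
        rows.append((case_id, offsets[-1], offsets))
    rows.sort(key=lambda row: row[1])  # short cases on top, long cases at the bottom
    return rows

rows = relative_rows(log)
for case_id, duration, offsets in rows:
    print(case_id, duration, [str(o) for o in offsets])
```

Every row now starts at offset zero, so the right edge of the sorted rows traces out the distribution of case durations.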
Let’s look at a real data set. So let’s look at the BPI 2012 challenge event log and what the dotted chart gives for that data. So let’s again start the dotted chart plugin.
And the first thing we do is, again, to set the color attribute to the concept name. And then we see many really nice patterns. For instance, you see wide gaps in between the colors. When we zoom in, we can see it in a bit more detail. You see real bursts of activity, purple and yellow in this case. But on particular days, there’s no activity recorded. So let me zoom out again. How can we investigate this further? Well, we could also change the x-axis to the time since the week was started. And now, what you see are the days of the week. Here you see a day where mainly the blue activities were recorded.
But on the other six days, you also see yellow, green, and purple activities. The leftmost day, so this one, where you only see blue events, is actually a Sunday. So you see that new applications are registered through the automated website, but the real treatment of the requests is done Monday through Saturday, with slightly less activity on the Saturday. So let’s go back to the initial settings and investigate this data further. So what else do we see? We can see that the arrival rate is rather constant. How do we see that? Because there’s a straight diagonal line from the top left to the bottom right. There are slight deviations, but not many. So the arrival rate is rather constant.
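The "time since the week was started" view used above can be approximated with a small Python sketch. It assumes a Sunday-first week, because the leftmost day in the chart is a Sunday; the function name, the activity names, and the dates are hypothetical and not taken from the BPI 2012 log.

```python
from datetime import datetime, timedelta

def time_since_week_start(ts):
    """Offset of ts from the preceding Sunday 00:00 (Sunday-first week assumed)."""
    days_since_sunday = (ts.weekday() + 1) % 7  # Python: Monday=0 ... Sunday=6
    week_start = datetime(ts.year, ts.month, ts.day) - timedelta(days=days_since_sunday)
    return ts - week_start

# Hypothetical events: (activity, timestamp).
events = [
    ("submit application", datetime(2024, 1, 7, 10, 0)),  # a Sunday
    ("handle request", datetime(2024, 1, 8, 9, 0)),       # a Monday
    ("handle request", datetime(2024, 1, 13, 11, 30)),    # a Saturday
]

for activity, ts in events:
    offset = time_since_week_start(ts)
    print(f"day {offset.days} of the week, {activity}")
```

Plotting these offsets on the x-axis folds all weeks on top of each other, which is what makes a weekday without manual handling stand out as a column with only one color.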
What we can also see is that the closing of cases is also rather constant. So there’s another line with the greenish dots that follows the arrival process. And as I have shown on the artificial data, you can also look at the case durations. So we set the x-axis again, to the time since case start, and sort the cases on duration of the case. And then we see this.
What we see is a very large set of cases, almost half, that take a very short time. So if we zoom in, for instance, at the top, then we see that these cases take only a couple of hours. Then, when we zoom out again, we see that there’s a bulk of cases that take a couple of days. So this is the 12-day marker. And here are the 27- and the 30-day markers. And you see a particular case type here around the 30-day marker. But you also see a long tail towards the end, of cases that can take over 130 days. And this is the usual pattern that you see.
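To make this duration pattern concrete, here is a minimal Python sketch that splits case durations into three groups. The durations and the two cut-off values are illustrative assumptions, not numbers derived from the BPI 2012 log; in practice you would pick the cut-offs after looking at the chart.

```python
from collections import Counter
from datetime import timedelta

# Hypothetical case durations, invented for the example.
durations = [timedelta(hours=2), timedelta(hours=5), timedelta(days=9),
             timedelta(days=12), timedelta(days=30), timedelta(days=131)]

def bucket(duration, short=timedelta(days=1), long=timedelta(days=60)):
    """Classify a case duration; both thresholds are assumed, not prescribed."""
    if duration < short:
        return "short"
    if duration < long:
        return "bulk"
    return "long tail"

counts = Counter(bucket(d) for d in durations)
for name in ("short", "bulk", "long tail"):
    print(name, counts[name])
```

A grouping like this can serve as a starting point for filtering the log, for example by analysing the very short cases separately from the rest.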
Some cases are extremely short, then there’s the bulk around the average, and then there’s a long tail of cases that take longer. So I hope that I’ve shown you that there are many configurations for the x-axis and the y-axis, and also for the coloring, that allow you to investigate the data at hand and get initial insights and initial filtering ideas.
So now, I hope you have a good insight into how the dotted chart can be used to gain a further understanding of the data that’s contained in the event log. And as such, it can help you to further process and filter the event log for further analysis. So I hope to see you again in the next lecture, where we explain, for instance, how you can filter event logs.