Hi and welcome back. In this lecture, I’ll explain how to import and inspect event logs in ProM. We’re still in the extraction phase, since importing and inspecting is part of checking whether the data is correct, although it also helps in the data processing phase. Before we switch to ProM, I have to briefly explain the import options that the tool provides. You have four options. They increase in the number of events they support, but they also get progressively slower, so it’s a trade-off between size and speed. So these are the four options.
In a couple of seconds I will show you the naive import option, because we are dealing with a very small event log and the naive option is the fastest one. If you encounter issues with other event logs, or memory is an issue, then switch to one of the subsequent import options, which allow more data to be imported but are slightly slower in processing the data. So without further ado, let’s look at ProM. With ProM open, let’s import an event log and see what options you have. The first choice comes up as soon as you import: you can choose which import option to use.
And as explained in the lecture, the naive one uses the most memory but it’s quite fast. Lightweight and sequential is also quite fast but has some minor limitations. And then disk-buffered by MapDB puts as little information in memory as possible and is therefore the best choice when you have extremely large event logs. But for now, and especially for artificial event logs, we can use the naive import option. After the import, you see the XLog event log as an object. Let’s visualize and inspect this event log. The main visualizer for event logs is the log dialog. It consists of two charts in the middle, but the key figures are in this left column.
So it indicates that there’s 1 process in the event log, which is always the case, with 100 cases or traces and 590 events recorded for these 100 cases. Then we have seven different event classes that have been observed, and they cover only one event type. I’ll explain that in a bit more detail later. An event log can also record which resource or employee executed a particular event. In this example we don’t have that: we have only one originator. But in other logs this can help to see how users collaborate. In the central part of this view you have two graphs: the events per case and the event classes per case.
The top chart shows how many events are recorded per case and how they are distributed. The minimum length of a trace is five events, which is this bar, but the majority of traces have six events. In real-life event logs you usually see a wider distribution with a bigger range. You also see the number of event classes per case, which roughly means the number of activities. In this case it is exactly the same chart, which indicates that every event class occurs only once per trace. On the right-hand side you see the first and the last observed timestamp, which gives you the time span of the event log.
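To make the dashboard figures concrete, here is a minimal Python sketch of how the same numbers could be computed by hand. It runs outside ProM, and the `toy_log` data and the `key_figures` helper are made up for illustration; they are not the lecture’s event log and not part of ProM.

```python
from collections import Counter
from datetime import datetime

# Hypothetical toy log (not the lecture's log): case id -> list of events.
# Each event is (activity name, lifecycle transition, ISO timestamp).
toy_log = {
    "case_1": [
        ("register application", "complete", "2024-01-01T09:00:00"),
        ("check credit", "complete", "2024-01-01T10:00:00"),
        ("accept", "complete", "2024-01-01T11:00:00"),
    ],
    "case_2": [
        ("register application", "complete", "2024-01-02T09:00:00"),
        ("check credit", "complete", "2024-01-02T09:30:00"),
        ("check system", "complete", "2024-01-02T10:00:00"),
        ("reject", "complete", "2024-01-02T12:00:00"),
    ],
    "case_3": [
        ("register application", "complete", "2024-01-03T09:00:00"),
        ("check credit", "complete", "2024-01-03T10:00:00"),
        ("reject", "complete", "2024-01-03T11:00:00"),
    ],
}


def key_figures(log):
    """Compute dashboard-style key figures for a log given as a dict of traces."""
    events = [event for trace in log.values() for event in trace]
    # An event class here is the pair (activity name, lifecycle transition).
    classes = {(name, lifecycle) for name, lifecycle, _ in events}
    timestamps = [datetime.fromisoformat(ts) for _, _, ts in events]
    return {
        "cases": len(log),
        "events": len(events),
        "event_classes": len(classes),
        # Histogram: trace length -> number of traces with that length.
        "events_per_case": Counter(len(trace) for trace in log.values()),
        # Histogram: distinct classes in a trace -> number of such traces.
        "classes_per_case": Counter(
            len({(name, lifecycle) for name, lifecycle, _ in trace})
            for trace in log.values()
        ),
        "first": min(timestamps),
        "last": max(timestamps),
    }


figures = key_figures(toy_log)
print(figures["cases"], figures["events"], figures["event_classes"])
print(dict(figures["events_per_case"]))
print(figures["first"], "to", figures["last"])
```

When the two histograms are equal, as they are for this toy data, every event class occurs at most once per trace, which is the same observation made on the two charts.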
On the left you see that we’re currently looking at the dashboard, but we can also inspect the event log and view a summary. So let’s inspect. Now you see all the traces that are recorded in the event log. When you click a trace, you see the list of events recorded for that trace. On the right-hand side you currently see the attributes of the selected trace, which is only the concept name. But when you select an event, you see that it has a concept name and a life cycle transition, which indicates the state of the activity that was observed.
So the activity is check credit, in the complete state. It could be that check credit is started at some point and completed later, which would result in two events. Every event also has a timestamp, and you can select different events to see how the attributes change. Note, however, that traces and events can have many more attributes attached, like cost or case information. In this simple artificial event log, that’s not included. Here at the top you also have another tab, the Explorer, which is a different visualization of the individual traces. Here you again see the traces, and you can hover over each of the widgets to get additional information. The color indicates how frequent the activity is.
For instance, this orange activity is accept, which is less frequent because an application is either accepted or rejected. So here you can see the traces in a different view. You can also inspect the log-level attributes. So far we could only inspect trace and event attributes, but here you can see which attributes are recorded on the log level. For instance, which extensions or attribute notions are included and, most importantly, which attributes are global for the trace and the event level. A global attribute means that every trace or event has this attribute set. So in this event log every trace has a concept name and every event has a concept name, a life cycle transition, and a timestamp.
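The promise behind global attributes can also be checked by hand. The following Python sketch parses a small, hypothetical XES fragment and reports every trace or event that lacks a declared global attribute. The fragment omits the XML namespace that real XES files usually declare, and `attribute_keys` and `missing_globals` are illustrative helper names, not ProM functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal XES fragment; the XML namespace is left out for brevity.
XES = """<log>
  <global scope="trace"><string key="concept:name" value="UNKNOWN"/></global>
  <global scope="event">
    <string key="concept:name" value="UNKNOWN"/>
    <string key="lifecycle:transition" value="UNKNOWN"/>
    <date key="time:timestamp" value="1970-01-01T00:00:00.000+00:00"/>
  </global>
  <trace>
    <string key="concept:name" value="case_1"/>
    <event>
      <string key="concept:name" value="register application"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2024-01-01T09:00:00.000+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="check credit"/>
      <string key="lifecycle:transition" value="complete"/>
    </event>
  </trace>
</log>"""


def attribute_keys(element):
    """Keys of the attributes attached directly to a log, trace, or event element."""
    return {child.get("key") for child in element if child.get("key") is not None}


def missing_globals(root):
    """Return (scope, position, missing keys) for each element violating a global."""
    declared = {g.get("scope"): attribute_keys(g) for g in root.findall("global")}
    problems = []
    for t_idx, trace in enumerate(root.findall("trace")):
        missing = declared.get("trace", set()) - attribute_keys(trace)
        if missing:
            problems.append(("trace", t_idx, sorted(missing)))
        for e_idx, event in enumerate(trace.findall("event")):
            missing = declared.get("event", set()) - attribute_keys(event)
            if missing:
                problems.append(("event", (t_idx, e_idx), sorted(missing)))
    return problems


problems = missing_globals(ET.fromstring(XES))
print(problems)
```

In this made-up fragment the second event has no timestamp, so it is reported as violating the event-level global declaration.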
And this is usually the case for all event logs. Next we have classifiers. A classifier indicates your notion of an event. In this case it’s the activity that has been executed plus the life cycle transition. So, for instance, the start and the complete of a particular activity result in two different event classes. An example of a different classifier is to use the resource attribute. Then you’re actually looking at how a case is handled by different resources. The next list of attributes contains all types of attributes that have been added for the 3TU data center. We will ignore them for now.
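The effect of choosing a classifier can be sketched in a few lines of Python. The events below are hypothetical; the attribute keys follow the standard XES names (`concept:name`, `lifecycle:transition`, `org:resource`), while `classify` and `event_classes` are illustrative helpers rather than anything ProM provides.

```python
# Hypothetical events of one case; the resource names are invented.
events = [
    {"concept:name": "check credit", "lifecycle:transition": "start", "org:resource": "Ann"},
    {"concept:name": "check credit", "lifecycle:transition": "complete", "org:resource": "Ann"},
    {"concept:name": "accept", "lifecycle:transition": "complete", "org:resource": "Bob"},
]


def classify(event, keys):
    """Build the event class label by joining the values of the classifier keys."""
    return "+".join(event[key] for key in keys)


def event_classes(events, keys):
    """Distinct event classes observed under the given classifier."""
    return sorted({classify(event, keys) for event in events})


# Classifier 1: activity name plus lifecycle transition.
activity_lifecycle = ("concept:name", "lifecycle:transition")
# Classifier 2: the resource that executed the event.
by_resource = ("org:resource",)

print(event_classes(events, activity_lifecycle))
print(event_classes(events, by_resource))
```

Under the first classifier the start and complete of check credit are two separate classes; under the resource classifier the same three events collapse into one class per person.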
The last tab on the left is the summary, and it gives a more detailed overview of how often particular event classes occur. At the top you again see that you have 100 traces with 590 events. For the MXML legacy classifier, each combination of concept name and life cycle transition is a separate class. That’s how we evaluate and distinguish events. You see that there are seven classes, and these are the frequencies. So register application is observed 100 times, which is roughly 17% of all observed events. But you see that check system has been observed 90 times, and reject and accept have been observed 80 and 20 times respectively.
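The relative frequencies in the summary are simple arithmetic, and a short Python sketch can reproduce them. It uses only the four class counts and the event total quoted above; the frequencies of the other three classes are not given here, so they are left out. The `share` helper is an illustrative name.

```python
TOTAL_EVENTS = 590  # total number of events reported in the summary view

# Only the four class frequencies quoted in the lecture.
quoted = {
    "register application": 100,
    "check system": 90,
    "reject": 80,
    "accept": 20,
}


def share(count, total):
    """Relative frequency as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share(count, TOTAL_EVENTS) for name, count in quoted.items()}
for name, pct in shares.items():
    print(f"{name}: {quoted[name]} occurrences, {pct}% of all events")
```

Register application comes out at 16.9%, which matches the roughly 17% shown in the summary, and reject plus accept add up to 100, one decision per case.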
You also see how many times a particular event class was the starting or the ending event of a trace. In this example, all traces start with register application and end with send decision via email. So using the log dialog and its different views, you can get a good initial idea of the event log. Usually this is the first thing you do after you import an event log: you look at the log dialog and get an idea of what’s in there. Later this week we’ll explain further analysis techniques that you can use to get an idea of the event log content. So now you know how to import and inspect event logs in ProM.
This can be used to evaluate whether the extraction was done successfully and whether you see any anomalies. It is important to check this early, before you start any subsequent steps. I hope to see you again in one of the following lectures.