
Performance analysis in ProM

In this lecture we show several ways in which ProM plug-ins can be used to analyse the performance of a process.
Hi, and welcome to this lecture on performance analysis in ProM. In the last lecture, we covered alignments, where we related the event log data to the process model. We can use this to enhance the process model. As I mentioned, you know how to align a trace with a process model, and you get the alignments shown above. I hope you realize that sometimes a mismatch between the trace and the model exists, and alignments fix that. Using these alignments, we can also project timing information on the process model. So let’s look at a trace where we show the timing information. For instance, here you have a trace that starts at time stamp 0.
So the observation of activity A is when the timer starts. Then activity B is observed at time stamp 5, et cetera, until activity G is observed at time stamp 20. So we know exactly, relative to the start of the trace, when each activity was observed. Usually you only know the completion times: you know when activity B was completed, but you don’t have timing information on when it started. Then, of course, we can add two more traces with different timing information. Again, A starts the timer for each trace. The middle trace ends at time stamp 14, and the last trace at time stamp 30.
Using the alignments, we know exactly how to relate these events to the process model, and we can annotate each activity with the times at which it has been observed. Activity A, for instance, is always observed at time stamp 0. But activity B has been observed at time stamps 5, 9, and 16 after the start of the trace. Similarly, we can annotate all the other transitions and activities with this timing information. Using this information, we can also annotate the places. For instance, for the place between A and B, we know that cases spent 5, 9, or 16 time units waiting until activity B fired, and similarly between A and C, and A and D.
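The annotation step just described can be sketched in a few lines of Python. This is only an illustration, not what ProM does internally: the helper names are hypothetical, and of the values below only A at 0, B at 5, 9 and 16, and the end times 20, 14 and 30 come from the lecture. All other time stamps are invented to fill out the traces.

```python
from collections import defaultdict

# Hypothetical traces as (activity, completion time) pairs, with time measured
# from the start of the trace. Only A at 0, B at 5, 9 and 16, and the end
# times 20, 14 and 30 come from the lecture; every other value is invented.
traces = [
    [("A", 0), ("B", 5), ("C", 8), ("D", 10), ("E", 13), ("G", 20)],
    [("A", 0), ("C", 4), ("D", 7), ("B", 9), ("F", 13), ("G", 14)],
    [("A", 0), ("D", 6), ("C", 11), ("B", 16), ("E", 19), ("G", 30)],
]


def observation_times(traces):
    """Collect, per activity, the time stamps at which it was observed."""
    times = defaultdict(list)
    for trace in traces:
        for activity, timestamp in trace:
            times[activity].append(timestamp)
    return dict(times)


def waiting_times(traces, predecessor, successor):
    """Time spent in the place between two directly connected activities."""
    waits = []
    for trace in traces:
        times = dict(trace)  # assumes each activity occurs at most once per trace
        if predecessor in times and successor in times:
            waits.append(times[successor] - times[predecessor])
    return waits


activity_times = observation_times(traces)
a_to_b = waiting_times(traces, "A", "B")
print(activity_times["B"], a_to_b)
```

Because A always fires at time 0, the waiting times in the place between A and B are the same numbers as the observation times of B.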
We can also annotate the place before E or F. Note that E has only been observed two times and F once, so different frequencies of observations can exist. However, this place is only occupied once all three transitions B, C, and D have fired. So you have to take the maximum of the times at which these fired, and then look at the time at which E or F fired. So a token, or a case, typically spends three or four time units here until E or F fires. I hope you can see that we can annotate the other places in the same way, and so get information on how much time tokens spend in each place.
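As a rough sketch of the rule above (take the latest of B, C and D, then measure until E or F fires), the following Python computes the time spent in the place before E or F. The function name is hypothetical and the traces are partly invented: only the time stamps of A and B and the end times are taken from the lecture. The sketch also assumes perfectly fitting traces in which each activity occurs at most once.

```python
# Hypothetical traces as (activity, completion time) pairs. Only A at 0,
# B at 5, 9 and 16, and the end times 20, 14 and 30 come from the lecture.
traces = [
    [("A", 0), ("B", 5), ("C", 8), ("D", 10), ("E", 13), ("G", 20)],
    [("A", 0), ("C", 4), ("D", 7), ("B", 9), ("F", 13), ("G", 14)],
    [("A", 0), ("D", 6), ("C", 11), ("B", 16), ("E", 19), ("G", 30)],
]


def join_waiting_times(traces, parallel, choices):
    """Time spent in the place that follows a parallel block.

    The place becomes occupied only when the last of the parallel
    activities has fired, so we take the maximum of their time stamps
    and subtract it from the time at which one of the choices fired.
    """
    waits = []
    for trace in traces:
        times = dict(trace)
        occupied_at = max(times[activity] for activity in parallel)
        fired_at = min(times[activity] for activity in choices if activity in times)
        waits.append(fired_at - occupied_at)
    return waits


waits_before_e_or_f = join_waiting_times(traces, ["B", "C", "D"], ["E", "F"])
print(waits_before_e_or_f)
```

With these invented values the waiting times come out as three or four time units, in line with the numbers mentioned in the lecture.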
Note that if you also have the start times, you can distinguish between the waiting time and the execution time of an activity. But for now, since we only have completion times, this is the best we can do. So let’s switch to ProM and show you how you can get these results and project them on a process model. With ProM open, let’s import the artificial loan example event log, and let’s discover a Petri net using the Inductive Miner with standard settings. Let’s use this process model to calculate alignments again. So with this process model, select the Loan Example Event Log, and now select this plug-in: Replay A Log On Petri Net For Performance/Conformance Analysis.
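To make the distinction between waiting time and execution time concrete, here is a minimal sketch that assumes each activity instance has three time stamps: when it became enabled, when it started, and when it completed. The class and field names are hypothetical and the numbers are invented; they do not come from ProM or from the event logs used in this lecture.

```python
from dataclasses import dataclass


@dataclass
class ActivityInstance:
    """One execution of an activity; all three fields are assumed to be known."""
    enabled: int   # time at which the activity could have started
    start: int     # time at which work actually began
    complete: int  # time at which the activity finished

    @property
    def waiting_time(self):
        return self.start - self.enabled

    @property
    def execution_time(self):
        return self.complete - self.start

    @property
    def sojourn_time(self):
        # With completion times only, this total is all that can be measured.
        return self.complete - self.enabled


# Invented example: enabled at 0, started at 3, completed at 5.
instance = ActivityInstance(enabled=0, start=3, complete=5)
print(instance.waiting_time, instance.execution_time, instance.sojourn_time)
```

When only completion times are logged, the start time is unknown, so waiting and execution time cannot be separated and only their sum is available.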
And we get slightly different dialogs. Here you can specify patterns, but let’s skip that for now. You can also confirm that the mapping option for the string activity names is OK. And here, you can verify that the mapping between the process model and the event log has been done correctly. Here, again, you can specify the costs for deviations.
And the time stamp attribute usually contains the time of the execution of the activity, so that’s OK. Here, it’s important to say no, because otherwise you get very few results, since many traces have to be ignored. When we zoom in on this process model, you see that transitions now have an orange or red color, and a darker red means more time spent in that transition. If we look at the global statistics, we see that on average a case spends 25 minutes in the process, with a minimum of 7 and a maximum of 41 minutes. And of course, you want to know where that time is spent.
Well, here, of the 24 minutes on average, 12 minutes are spent here and 11 here. Since these are in parallel, these times are not sequential and do not simply add up. Cases are waiting for each of these three to be executed, and these two take a while before they are executed. If you open the element statistics, you get more details, so you know that on average the time is 12 minutes, the maximum is 27 minutes, and the fastest execution observed was two minutes. This helps you identify where the bottlenecks are in the process. Another way to do this is to use the Inductive Visual Miner. So with the event log alone, we apply the Inductive Visual Miner, and it quickly finds a model.
But now, instead of showing only paths, we choose to show paths and, for instance, sojourn time. And then we see indeed that calculate capacity has an average duration of 12 minutes. When you move over a transition, you get a graph of the distribution and some other statistics in the bottom right. You can inspect all the transitions and see how long they take. So let’s do the first option again on the BPI 2012 Challenge Event Log. Let’s select this one, with only the A and O process. Let’s import the event log. And again, let’s mine the process model using the Inductive Miner.
And this process model we again align on the data. Be careful to select the correct event log. Then again, select the Replay A Log On Petri Net For Performance/Conformance Analysis plug-in. Always verify the mapping. Here the names match very well, so the mapping is done correctly. Also, the costs we keep at one for now. It’s currently aligning the model and the data. Again, the timestamp attribute contains the time. And here we click No. And now we see how a real process is executed and where the bottlenecks are. Here we can move to the right, and we see that this is actually the key bottleneck.
And if we open the element statistics and the global statistics and move them into view, we see that this cancellation activity takes on average 20 days, while the case throughput time on average is 7 days. That might be interesting to investigate further, since this is a rather high deviation. The time for this activity is actually longer than the average throughput time of a case, so it’s likely that this activity is not executed for each trace. You can also click on places and other activities to see how long they typically take. And you also know that all the black transitions in the previous view don’t have any timing because they are not observable activities.
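The reasoning above, that an activity averaging 20 days inside a process averaging 7 days is probably skipped in most cases, can be checked with a small worked example. The figures below are invented so that the averages match the 20 and 7 days from the lecture; they are not taken from the BPI Challenge 2012 log.

```python
from statistics import mean

# Invented log of 10 cases: nine finish in 5 days without the slow activity,
# and one case contains the activity (20 days) and takes 25 days in total.
throughput_days = [5] * 9 + [25]
activity_days = [20]  # one entry per case in which the activity occurred

avg_throughput = mean(throughput_days)
avg_activity = mean(activity_days)
share_of_cases = len(activity_days) / len(throughput_days)

print(avg_throughput, avg_activity, share_of_cases)
```

The activity average is taken only over the cases in which the activity occurs, while the throughput average is taken over all cases. That is why the first can exceed the second when the activity is rare.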
So using these two plug-ins, you can evaluate how long a case takes on average and where most time is spent in the process. Now you’ve seen how you can project timing information onto a process model, and you also know how you can get several other time-related statistics for particular activities. This is one example of an extension technique that combines the event log data with the process model. In the next lecture, I will show you another technique, focusing on the social network aspect. I hope to see you again in the next lectures.

In this step we show several ways that ProM plug-ins can be used to analyse the performance of a process.

In this video we use the ‘Artificial – Loan Process.xes.gz’ event log first, and then the ‘BPI_Challenge_2012_AO.xes.gz’ event log.

This article is from the free online course Introduction to Process Mining with ProM.

Created by
FutureLearn - Learning For Life
