This content is taken from the Eindhoven University of Technology's online course, Introduction to Process Mining with ProM.

## Eindhoven University of Technology

Skip to 0 minutes and 10 seconds Hi. Welcome back. In this lecture, I will explain the inductive miner to you, which is, again, an improvement on the alpha miner and the heuristics miner. And we’re still trying to bridge this gap between the data and the discovery of a hopefully sound process model. So what are the characteristics of the inductive miner? Well, first of all, it is guaranteed to produce a sound process model. Then, how it works: it first finds a prominent split in an event log, then detects how the resulting parts are related, and then continues on both split logs. And I’ll explain both points in more detail, starting with soundness. So the inductive miner internally does not work on Petri nets. It actually uses process trees.

Skip to 0 minutes and 58 seconds And from this Petri net I’ll show you how the process tree is constructed. What can we observe in this Petri net? We see that first activity a is executed, then three branches in parallel, then a choice between e and f, followed by g. We can encode this using the sequence operator, which indicates that you first do a, then a block, then another block, and then g. So how do we encode this first block? We introduce the parallel operator, and activities b, c, and d are in parallel. The symbol for the parallel operator is ∧, an A without the horizontal bar. Then the next block is activities e and f in a choice. So you execute either e or f.

Skip to 1 minute and 43 seconds We use the x operator for exclusive choice. You exclusively choose either e or f. So the process tree shown on the top actually describes the same behavior as the Petri net on the bottom. However, whatever you do in a process tree, the tree always represents a sound process model. So how does the inductive miner get to a process tree? Well, it repeatedly splits the event log. So again, this is our example event log. And it finds the most likely split. In this example, for instance, between a and the subsequent activities. Now it analyzes both sublogs. Well, the sublog on the left side of the bar is easy.
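
The process tree described here can be sketched in a few lines of Python. This is only an illustration, not how ProM stores process trees: the nested-tuple encoding, the operator names `seq`, `and`, `xor`, and the helper functions are assumptions made for this example. The `language` function lists every trace the tree allows.

```python
from itertools import product

# A process tree is either an activity name (a leaf) or a tuple
# (operator, child, child, ...). This encoding is an assumption of the sketch.
TREE = ("seq", "a", ("and", "b", "c", "d"), ("xor", "e", "f"), "g")


def interleavings(traces):
    """All ways to interleave the traces while keeping each trace's own order."""
    traces = [t for t in traces if t]
    if not traces:
        return {()}
    result = set()
    for i, t in enumerate(traces):
        # Take the next activity from trace i, then interleave what is left.
        rest = traces[:i] + [t[1:]] + traces[i + 1:]
        for tail in interleavings(rest):
            result.add((t[0],) + tail)
    return result


def language(tree):
    """Set of all traces (tuples of activity names) the tree can produce."""
    if isinstance(tree, str):
        return {(tree,)}
    op, *children = tree
    child_langs = [language(c) for c in children]
    if op == "xor":  # exclusive choice: exactly one child is executed
        return set().union(*child_langs)
    if op == "seq":  # sequence: children are executed one after another
        return {sum(combo, ()) for combo in product(*child_langs)}
    if op == "and":  # parallel: children are executed in any interleaving
        result = set()
        for combo in product(*child_langs):
            result |= interleavings(list(combo))
        return result
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    for trace in sorted(language(TREE)):
        print(" ".join(trace))
```

Every trace it prints starts with a, ends with g, contains b, c, and d in some order, and contains exactly one of e and f.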

Skip to 2 minutes and 33 seconds It only contains a’s, so we can say we always do a, followed by something else. Well, what is this something else? That’s this part of the event log. What’s the most prominent split we can make? Well, right before g. We do something and then we do g. So we can add g to the sequence operator. And now we have to analyze what’s happening in between a and g. Well, another prominent split we can make is this one. There we have two sublogs again. And in one sublog we can see that every trace contains either e or f. Well, we can encode this as an exclusive choice, and remove it from the sublog.

Skip to 3 minutes and 18 seconds Now we have to analyze these traces of length 3. Well, what do we observe? Every trace contains activities b, c, and d, but in any order. Well, this is parallelism. Hence, we introduce the parallel operator between these three activities. This, in a slightly simplified way, is how the inductive miner works. I hope you can see that this process tree can be translated to a Petri net. And that’s what the inductive miner presents to you in ProM. So let’s analyze the Petri net that the inductive miner will discover, based on this input data. Again, let’s take our checklist. Well, as I already explained, and I hope you can verify, this process model is sound. So check.
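
The splitting procedure can be sketched as a small recursive function. This is a heavily simplified illustration, not the ProM implementation. It looks for splits in the directly-follows relation of the log and leaves out loops, empty traces, noise filtering, and the fallbacks used when no split is found. The example log is hypothetical, since the log on the slide is not reproduced in this transcript, and all function names are made up for the sketch. Unlike the walkthrough above, it finds all parts of a sequence in one step instead of peeling off a and g one at a time.

```python
from itertools import combinations


def dfg(log):
    """Directly-follows pairs (x, y): x is immediately followed by y in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def reachable(edges, acts):
    """reach[x] = activities reachable from x via one or more directly-follows steps."""
    reach = {x: {y for (s, y) in edges if s == x} for x in acts}
    changed = True
    while changed:
        changed = False
        for x in acts:
            extra = {z for y in reach[x] for z in reach[y]} - reach[x]
            if extra:
                reach[x] |= extra
                changed = True
    return reach


def groups(acts, linked):
    """Partition acts so that x and y share a part whenever linked(x, y) holds."""
    parts = [{x} for x in sorted(acts)]
    for x, y in combinations(sorted(acts), 2):
        if linked(x, y):
            px = next(p for p in parts if x in p)
            py = next(p for p in parts if y in p)
            if px is not py:
                parts.remove(py)
                px |= py
    return parts


def project(log, part):
    """Keep only the activities in `part`; traces that become empty are dropped."""
    sub = [tuple(x for x in t if x in part) for t in log]
    return [t for t in sub if t]


def mine(log):
    """Return a process tree as nested tuples (operator, child, ...)."""
    acts = {x for t in log for x in t}
    if len(acts) == 1 and all(len(t) == 1 for t in log):
        return next(iter(acts))  # base case: the sublog is a single activity
    edges = dfg(log)
    reach = reachable(edges, acts)
    cuts = [
        # choice: parts that are never directly connected
        ("xor", lambda x, y: (x, y) in edges or (y, x) in edges),
        # sequence: parts that are strictly ordered by reachability
        ("seq", lambda x, y: (y in reach[x]) == (x in reach[y])),
        # parallel: every activity pair across parts occurs in both orders
        ("and", lambda x, y: not ((x, y) in edges and (y, x) in edges)),
    ]
    for op, linked in cuts:
        parts = groups(acts, linked)
        if len(parts) > 1:
            if op == "seq":
                # Earlier parts are reached by fewer outside activities.
                parts.sort(key=lambda p: len({x for x in acts - p if reach[x] & p}))
            return (op, *[mine(project(log, p)) for p in parts])
    raise ValueError("no split found; the full algorithm has further cases")


# Hypothetical example log, not the one shown in the lecture.
LOG = [
    ("a", "b", "c", "d", "e", "g"),
    ("a", "d", "c", "b", "f", "g"),
    ("a", "c", "b", "d", "e", "g"),
    ("a", "d", "b", "c", "f", "g"),
]

if __name__ == "__main__":
    print(mine(LOG))
```

On this log the sketch returns the tree from the lecture: a sequence of a, a parallel block with b, c, and d, a choice between e and f, and finally g.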

Skip to 4 minutes and 4 seconds Can we replay all the data that we put in? In this example, yes. So every trace that we put in, that we learned the model from, can be replayed in this Petri net. However, sometimes the inductive miner makes a decision after which the input data can no longer be replayed. Especially on real-life data, this is actually useful. But in this example, replay fitness is perfect. The precision of this process model is also quite OK. It allows for a bit more behavior, but not too much. Similarly, generalization is OK, since the miner correctly derived that particular behavior is possible, although not directly observed. And simplicity is also good, since it’s a nicely structured model that’s easy to read, especially from left to right.
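
To make the replay check concrete, here is a minimal sketch that tests whether a trace can be replayed on a process tree and reports the fraction of traces that fit. It is a coarse, trace-level stand-in for replay fitness, not the measure ProM computes. The tuple encoding of the tree, the function names, and the example log are assumptions for illustration, and the parallel case assumes every activity appears only once in the tree.

```python
# Process tree from the lecture as nested tuples (operator, child, ...).
TREE = ("seq", "a", ("and", "b", "c", "d"), ("xor", "e", "f"), "g")

# Hypothetical event log; the log on the slide is not part of this transcript.
LOG = [
    ("a", "b", "c", "d", "e", "g"),
    ("a", "d", "c", "b", "f", "g"),
    ("a", "c", "b", "d", "e", "g"),
    ("a", "d", "b", "c", "f", "g"),
]


def alphabet(tree):
    """All activity names that occur in the tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(alphabet(c) for c in tree[1:]))


def fits(tree, trace):
    """True if the trace (a tuple of activities) can be replayed on the tree."""
    if isinstance(tree, str):
        return trace == (tree,)
    op, *children = tree
    if op == "xor":
        return any(fits(c, trace) for c in children)
    if op == "seq":
        first, rest = children[0], children[1:]
        if not rest:
            return fits(first, trace)
        # Try every point where the first child could hand over to the rest.
        return any(
            fits(first, trace[:i]) and fits(("seq", *rest), trace[i:])
            for i in range(len(trace) + 1)
        )
    if op == "and":
        # With disjoint child alphabets, a trace is a valid interleaving
        # exactly when its projection onto each child fits that child.
        if any(x not in alphabet(tree) for x in trace):
            return False
        return all(
            fits(c, tuple(x for x in trace if x in alphabet(c)))
            for c in children
        )
    raise ValueError(f"unknown operator: {op}")


def fitness(tree, log):
    """Fraction of traces in the log that the tree can replay."""
    return sum(fits(tree, t) for t in log) / len(log)


if __name__ == "__main__":
    print(fitness(TREE, LOG))
    # A trace that skips d cannot be replayed on this tree.
    print(fits(TREE, ("a", "b", "c", "e", "g")))
```

The tree also accepts traces that are not in this log, such as a, b, d, c, e, g. That is the extra behavior that precision and generalization are about.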

Skip to 4 minutes and 46 seconds So in this lecture, I’ve shown you the inductive miner, which is the third process discovery algorithm we have discussed. In the next lecture, I will show you the inductive miner in ProM, and how it performs on real-life data. And then we are almost at the end of week 2, which focuses on process discovery. So I hope to see you in the next lectures.

# Inductive miner

In this step we explain the basics of the inductive miner.