Skip to 0 minutes and 10 secondsHi. Welcome back. In this lecture, I will explain how the alpha miner works. So the alpha miner is actually the very first algorithm that bridges the gap between event logs, or observed data, and the discovery of a process model. And since it was the very first algorithm to be created, it also has its flaws, but it was a good starting point to continue on. So the alpha miner has a few main steps. And I will, in the remaining of this lecture, briefly and on a high level explain how these work. So the first step that the alpha miner does is it scans the traces for ordering relations between activities. And using the relations, it builds a footprint matrix.

Skip to 0 minutes and 54 secondsIn the next step, it takes this footprint matrix and converts it to a Petri net. So there are four ordering relations that the alpha miner can detect. There's a directly follows, indicated by the greater than sign. So if A is directly followed by B in a trace, it records this using this notation. And it does so for every pair. Now, it can deduce a sequence of A sequence B, if it sees that A is sometimes directly followed by B, but never that B is directly followed by A. Hence, you can conclude that A and B are in a sequence. You always see first A and then B, and never the other way around. You can also see them in parallel.

Skip to 1 minute and 44 secondsSo if you've seen both A being directly followed by B, as well as B being directly followed by A, you can conclude that A and B are in parallel. They are both executed, but in arbitrary order. Finally, there is the no direct relation. So if you've never seen A being directly followed by B or B being directly followed by A, you indicate it with this symbol. And using these for ordering relations, we can construct a footprint matrix. So let's look at again these five traces that we've seen before. And the alpha miner will construct the following footprint matrix. So let's look at it in a bit more detail.

Skip to 2 minutes and 30 secondsAnd I'll explain a couple of cells, and invite you to see if you can validate the other cells. So for instance, if you look at this cell, you see that A is never directly followed by A. That is, you never see A and then A. If you look at the cell A and B, you see that A is in sequence with B. So you see A, and you can see B straight after. However, you never see B followed by A. So let's look at the data. You indeed have some observations where you see A and then B, but nowhere there's a trace where you see B which is directly followed by A. Let's look at another cell.

Skip to 3 minutes and 12 secondsFor instance, this one, B and C, how are they related? You see that B and C are in parallel. How do we see this? There are some observations where B is directly followed by C. And there's also an observation where C is directly followed by B. And according to the rules above, this is then a parallel relation. Let's look at one other cell. When you look at the cell C and then A, you see that they are in the reverse sequence relation. So it can be that C is preceded by A, but never that A is after C. You see that this footprint matrix is mirrored over the diagonal.

Skip to 3 minutes and 52 secondsSo I've just discussed C and then A, but in the cell A and then C, you see the normal sequence operator. And again, I invite you to validate the rest of the table. So using the footprint matrix, we can build a Petri net. So we've observed that you always start with A, so we create a place with a token, which is connected to A. But now, we need to decide what comes after A. We can find it in this part of the footprint matrix. So you can see that A is always followed by B, C, and D. So we can add the connections. But we also saw that B, C, and D are in parallel.

Skip to 4 minutes and 35 secondsTherefore, you create this parallel split. Now, we look at another part of the footprint matrix. How is E related to B, C, and D? Well, the alpha miner observes that B is never followed by E, but D and C are. So therefore, it creates this relation. So we'd observed the sequence between D and E, and between C and E, and therefore, connected them like this. Similarly, for the activity F, it observes that B and F and D and F are observed. And hence, it connects the places like this. Now finally, activity G, it's in sequence with both E and F. So you join the choice and you conclude with G.

Skip to 5 minutes and 25 secondsSo given this data, this is the Petri net that is discovered by the alpha miner. So let's see how well this Petri net actually describes this data using our process model checklist. First, let's check soundness. Is this process model sound? No, it's not. Look at this state. So after you've executed A and B, C, and D in any order, you are in this state of the Petri net. Now, you can do two things. You can fire either E or F. So what happens if you fire F and then G? You're in this state. The token in between C and E is remaining, and its violating one of our soundness rules, the proper completion rule.

Skip to 6 minutes and 10 secondsWe are in a state where the final place is marked, where there's work left behind. So actually, this process model discovered by the alpha miner is not sound. So let's validate the order four quality dimensions. Replay fitness. Can it replay all the observed traces? Yes, it can It cannot replay perfectly or everything that we have seen. Is the precision good? Well, it's OK. It's not perfect. The model allows for a bit more behavior if you correct for the unsoundness, it will allow for some more behavior, but it's not that bad. It's also generalizing. It's introducing behavior that you have not directly seen, but that you can conclude from what you've observed. And also, it's a rather simple model.

Skip to 6 minutes and 58 secondsBut the main problem here is that the process model that it discovered is not sound.

Skip to 7 minutes and 6 secondsSo in this lecture I've explained to you how to alpha miner works, and in the next lecture, I will show you how to use it in ProM and what the results are, starting with the example that I've just discussed. I hope to see you again in the next lecture.

# Alpha miner

In this step we explain how the very first process discovery algorithm, the Alpha miner, works.

We show the main steps taken by the Alpha miner to get from the input data to a process model. We also show the limitations of the Alpha miner.

© Eindhoven University of Technology