Skip main navigation

Alpha miner

In this lecture we explain how the very first process discovery algorithm, the Alpha miner, works. We show the main steps taken by the Alpha miner.
9.6
Hi. Welcome back. In this lecture, I will explain how the alpha miner works. So the alpha miner is actually the very first algorithm that bridges the gap between event logs, or observed data, and the discovery of a process model. And since it was the very first algorithm to be created, it also has its flaws, but it was a good starting point to continue on. So the alpha miner has a few main steps. And I will, in the remaining of this lecture, briefly and on a high level explain how these work. So the first step that the alpha miner does is it scans the traces for ordering relations between activities. And using the relations, it builds a footprint matrix.
54.4
In the next step, it takes this footprint matrix and converts it to a Petri net. So there are four ordering relations that the alpha miner can detect. There’s a directly follows, indicated by the greater than sign. So if A is directly followed by B in a trace, it records this using this notation. And it does so for every pair. Now, it can deduce a sequence of A sequence B, if it sees that A is sometimes directly followed by B, but never that B is directly followed by A. Hence, you can conclude that A and B are in a sequence. You always see first A and then B, and never the other way around. You can also see them in parallel.
104.2
So if you’ve seen both A being directly followed by B, as well as B being directly followed by A, you can conclude that A and B are in parallel. They are both executed, but in arbitrary order. Finally, there is the no direct relation. So if you’ve never seen A being directly followed by B or B being directly followed by A, you indicate it with this symbol. And using these for ordering relations, we can construct a footprint matrix. So let’s look at again these five traces that we’ve seen before. And the alpha miner will construct the following footprint matrix. So let’s look at it in a bit more detail.
150.1
And I’ll explain a couple of cells, and invite you to see if you can validate the other cells. So for instance, if you look at this cell, you see that A is never directly followed by A. That is, you never see A and then A. If you look at the cell A and B, you see that A is in sequence with B. So you see A, and you can see B straight after. However, you never see B followed by A. So let’s look at the data. You indeed have some observations where you see A and then B, but nowhere there’s a trace where you see B which is directly followed by A. Let’s look at another cell.
191.5
For instance, this one, B and C, how are they related? You see that B and C are in parallel. How do we see this? There are some observations where B is directly followed by C. And there’s also an observation where C is directly followed by B. And according to the rules above, this is then a parallel relation. Let’s look at one other cell. When you look at the cell C and then A, you see that they are in the reverse sequence relation. So it can be that C is preceded by A, but never that A is after C. You see that this footprint matrix is mirrored over the diagonal.
231.8
So I’ve just discussed C and then A, but in the cell A and then C, you see the normal sequence operator. And again, I invite you to validate the rest of the table. So using the footprint matrix, we can build a Petri net. So we’ve observed that you always start with A, so we create a place with a token, which is connected to A. But now, we need to decide what comes after A. We can find it in this part of the footprint matrix. So you can see that A is always followed by B, C, and D. So we can add the connections. But we also saw that B, C, and D are in parallel.
274.5
Therefore, you create this parallel split. Now, we look at another part of the footprint matrix. How is E related to B, C, and D? Well, the alpha miner observes that B is never followed by E, but D and C are. So therefore, it creates this relation. So we’d observed the sequence between D and E, and between C and E, and therefore, connected them like this. Similarly, for the activity F, it observes that B and F and D and F are observed. And hence, it connects the places like this. Now finally, activity G, it’s in sequence with both E and F. So you join the choice and you conclude with G.
325.3
So given this data, this is the Petri net that is discovered by the alpha miner. So let’s see how well this Petri net actually describes this data using our process model checklist. First, let’s check soundness. Is this process model sound? No, it’s not. Look at this state. So after you’ve executed A and B, C, and D in any order, you are in this state of the Petri net. Now, you can do two things. You can fire either E or F. So what happens if you fire F and then G? You’re in this state. The token in between C and E is remaining, and its violating one of our soundness rules, the proper completion rule.
369.6
We are in a state where the final place is marked, where there’s work left behind. So actually, this process model discovered by the alpha miner is not sound. So let’s validate the order four quality dimensions. Replay fitness. Can it replay all the observed traces? Yes, it can It cannot replay perfectly or everything that we have seen. Is the precision good? Well, it’s OK. It’s not perfect. The model allows for a bit more behavior if you correct for the unsoundness, it will allow for some more behavior, but it’s not that bad. It’s also generalizing. It’s introducing behavior that you have not directly seen, but that you can conclude from what you’ve observed. And also, it’s a rather simple model.
418.4
But the main problem here is that the process model that it discovered is not sound.
425.7
So in this lecture I’ve explained to you how to alpha miner works, and in the next lecture, I will show you how to use it in ProM and what the results are, starting with the example that I’ve just discussed. I hope to see you again in the next lecture.

In this step we explain how the very first process discovery algorithm, the Alpha miner, works.

We show the main steps taken by the Alpha miner to get from the input data to a process model. We also show the limitations of the Alpha miner.

This article is from the free online

Introduction to Process Mining with ProM

Created by
FutureLearn - Learning For Life

Our purpose is to transform access to education.

We offer a diverse selection of courses from leading universities and cultural institutions from around the world. These are delivered one step at a time, and are accessible on mobile, tablet and desktop, so you can fit learning around your life.

We believe learning should be an enjoyable, social experience, so our courses offer the opportunity to discuss what you’re learning with others as you go, helping you make fresh discoveries and form new ideas.
You can unlock new opportunities with unlimited access to hundreds of online short courses for a year by subscribing to our Unlimited package. Build your knowledge with top universities and organisations.

Learn more about how FutureLearn is transforming access to education