
Skip to 0 minutes and 10 seconds Hi, and welcome back. In this lecture, I would like to show you a particular case study on real data. And in the next article there are also links to other case studies that you can look to for inspiration. So one particularly nice source of public data is the BPI Challenge. This is a yearly open challenge, held since 2011, and the data and also the submitted reports are public. So the challenge is: data is published, there are some questions, people submit their reports, and the best one wins. But all the reports are made public. And the links to all the BPI Challenges, but also some other sources, are provided in the article.

Skip to 0 minutes and 50 seconds And in this lecture I would like to focus on the BPI Challenge of 2012. It's actually the real-life event log that I showed in all the previous lectures. This data comes from a Dutch financial institute providing consumer loans. So consumers can ask the institute, I want to borrow this amount of money. And then the institute says yes, or no, or we need more information. The event log contains 260,000 events, spread over 13,000 cases. And it contains data attributes, such as the amount requested, which makes it interesting for data analysis purposes. The submission, or the process, starts at the web page, followed by some automated checks. And the activities are divided into three types of states.

Skip to 1 minute and 42 seconds So events starting with A denote states of the application. Events starting with O denote states of the offer belonging to the application. And W are work items belonging to the application. A and O are rather structured, because they are particular states that follow a clear procedure, but W is a bit unstructured because it's mainly manual activities. So I would like to go through three submissions for the BPI Challenge 2012. In this first submission, the applications have been classified. At the top node all the applications are present, and then a distinction has been made between approved, declined, cancelled, or undecided, offer or no offer, and fraud or no fraud detected.
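The A, O and W naming scheme described here can be used to group events by type. The following is only a minimal sketch in Python: the `PREFIX_TYPES` mapping, the helper functions and the example trace are assumptions made for illustration, and the trace is made up rather than taken from the real log.

```python
from collections import Counter

# Hypothetical mapping from the first letter of an activity name to the
# type of state described in the lecture.
PREFIX_TYPES = {"A": "application", "O": "offer", "W": "work item"}


def activity_type(activity_name):
    """Return the type of an activity based on its first letter."""
    return PREFIX_TYPES.get(activity_name[:1], "unknown")


def count_types(activity_names):
    """Count how many events fall into each type."""
    return Counter(activity_type(name) for name in activity_names)


# Made-up trace; the names only mimic the A/O/W naming scheme.
example_trace = ["A_SUBMITTED", "A_ACCEPTED", "O_CREATED", "W_CALL", "O_SENT"]
type_counts = count_types(example_trace)
print(type_counts)
```

Grouping by prefix like this is one simple way to look at the structured A and O states separately from the less structured W work items.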

Skip to 2 minutes and 30 seconds So this already gives a categorisation, or classification, of all the 13,000 applications. Another analysis that they did was the number of resources active on a particular day. The red dots are the number of resources active, and you see that it usually is between 20 and 30. Except on Saturdays, that's the green line, and on Sundays even fewer resources are active. They also looked at particular resources, and when they start and end their working days. So for instance, for this resource, 10138,

Skip to 3 minutes and 3 seconds you can see that they usually start between 8:00 and 9:00

Skip to 3 minutes and 6 seconds in the morning and then finish around 4:00. Or they have a late shift, where they start after noon and finish around 8 o'clock in the evening. Here you see another plot, of the start and end times of another resource. But here you see that this resource is working shorter hours.

Skip to 3 minutes and 23 seconds So for instance this resource usually works from 5:00 to 9:00.
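Both analyses, the number of resources active per day and the start and end of a resource's working day, can be derived from the resource and timestamp of each event. The sketch below is an assumption about how one might compute them in Python, not the method used in the report; the resource names `r1` and `r2` and all timestamps are made up.

```python
from collections import defaultdict
from datetime import datetime

# Made-up events as (resource, timestamp) pairs; not taken from the real log.
events = [
    ("r1", datetime(2012, 1, 2, 8, 30)),
    ("r1", datetime(2012, 1, 2, 16, 5)),
    ("r2", datetime(2012, 1, 2, 12, 15)),
    ("r2", datetime(2012, 1, 2, 20, 0)),
    ("r1", datetime(2012, 1, 3, 9, 0)),
]


def active_resources_per_day(events):
    """Map each date to the number of distinct resources with an event."""
    seen = defaultdict(set)
    for resource, ts in events:
        seen[ts.date()].add(resource)
    return {day: len(resources) for day, resources in seen.items()}


def working_hours(events):
    """Map (resource, date) to the first and last event time on that day."""
    spans = {}
    for resource, ts in events:
        key = (resource, ts.date())
        first, last = spans.get(key, (ts, ts))
        spans[key] = (min(first, ts), max(last, ts))
    return spans


per_day = active_resources_per_day(events)
hours = working_hours(events)
```

Note that the first and last event of a day only approximate the real start and end of a shift, since a resource may be at work without producing events.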

Skip to 3 minutes and 29 seconds In this report some main observations are also made. For instance, there is an automated resource, resource number 112, which is involved in the approval of 3 loan applications. And since this is an automated resource, this is suspicious and should be investigated further. Also, in 2 cases a customer was called after the application was already cancelled. So this is work that could have been prevented. And in 74 cases the completion checks are performed after the application is already accepted. So this might be something that the process owner wants to investigate further, to see whether this is an issue and how it can then be prevented. There are also some data ambiguities discovered, which is usual when you're analyzing your data.

Skip to 4 minutes and 21 seconds But it's important to know about them, because they might be fixed. And then when you get a new data set some time later, you might be able to do a better analysis.
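An observation such as a check being performed after the application was already accepted comes down to testing the order of two activities within each case. A minimal sketch in Python, assuming each case is available as an ordered list of activity names; the activity names `submit`, `check`, `accept` and `decline` and the three cases are hypothetical.

```python
def occurs_after(trace, later, earlier):
    """True if `later` appears anywhere after the first `earlier` in trace."""
    if earlier not in trace:
        return False
    start = trace.index(earlier) + 1
    return later in trace[start:]


# Made-up cases with hypothetical activity names.
cases = {
    "case1": ["submit", "check", "accept"],
    "case2": ["submit", "accept", "check"],
    "case3": ["submit", "decline"],
}

# Collect the cases in which a check still happens after acceptance.
flagged = [
    case_id
    for case_id, trace in cases.items()
    if occurs_after(trace, "check", "accept")
]
print(flagged)
```

The flagged cases are candidates for further investigation, not proof of a problem; whether the ordering is an issue is for the process owner to decide.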

Skip to 4 minutes and 32 seconds Then in another report this diagram is proposed. What they do here is, over the runtime of the case, so the number of days after the application was received, plot cumulatively how many applications were approved, cancelled, or declined. And what you see is that after around 30 days suddenly a lot of applications are cancelled. This seems to be an automated activity. However, it also increases the work time, the person-days spent, significantly. They also see that around 20 days after receiving the application, the majority of cases was already approved. So the recommendation in this report was to see whether this term of 30 days could be moved a bit earlier, to prevent waste of manpower.
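The cumulative curves in such a diagram can be computed by counting, for each day since the application was received, how many cases have reached a given outcome by then. The following Python sketch assumes the outcome and the number of days until that outcome are already known per case; the data and the function name `cumulative_by_day` are made up for illustration.

```python
from collections import Counter
from itertools import accumulate

# Made-up (outcome, days_until_outcome) pairs, one per case.
outcomes = [
    ("approved", 12),
    ("approved", 18),
    ("cancelled", 30),
    ("cancelled", 30),
    ("declined", 2),
    ("cancelled", 31),
]


def cumulative_by_day(outcomes, outcome, max_day):
    """Entry d is the number of cases that reached `outcome` within d days."""
    per_day = Counter(days for kind, days in outcomes if kind == outcome)
    return list(accumulate(per_day.get(d, 0) for d in range(max_day + 1)))


cancelled = cumulative_by_day(outcomes, "cancelled", 31)
approved = cumulative_by_day(outcomes, "approved", 31)
```

A sudden jump in one of these lists, like the one at day 30 in the made-up cancelled series, is the kind of pattern that points to an automated deadline.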

Skip to 5 minutes and 22 seconds But in another report they analyze when the cancel activity was executed. And you see that after receiving the application, cancellation is done at almost any time. But there's a particular line visible, which is again the 30 days, but now analyzed from the dotted chart view. Another dotted chart that they made plots the system activities. And then you clearly see that the system resource, so user 112, is mainly executing the first activities: submission through the web form and the first checks, but also the automated cancelling and sometimes activities after this period. So this shows that particular activities are done by the system resource.

Skip to 6 minutes and 7 seconds As I mentioned, more cases are available. So for the BPI Challenge 2012 there were several submissions and you can look at them. Since you know the data, it might be interesting to look at the other submissions. But the article that follows also contains links to other case studies. For instance, the IEEE task force website also lists several case studies where process mining has been successfully applied within a company on real data. So I hope this inspires you for new types of analysis and shows you the value of process mining in practice. I hope to see you again in the next lecture of this week.

Process mining applications

In this step we discuss several results from the BPI Challenge (Business Process Intelligence Challenge) 2012 data, as found in some of the submitted reports.

By looking at this case study we can start to see the power and value of process mining techniques to businesses and organisations.


This video is from the free online course:

Introduction to Process Mining with ProM

Eindhoven University of Technology

Course highlights

  • Introduction

    Introduction to process mining: recognizing event data, what is process mining and what can process mining analyse.

  • Installing ProM lite

    In this step we show how to find and install the free and open source process mining tool ProM lite.

  • Using ProM lite

    In this lecture we show the basic concepts and usage of ProM (lite): the resource, action and visualization perspectives.

  • Event logs

    In this lecture we explain what an event log is and how it is structured. We also explain the most common attributes found in an XES event log.

  • Event logs in ProM

    In this lecture we show you how you can load an event log in ProM and how you can get initial insights in the contents.

  • Converting a CSV file to an event log

    Most data is not recorded in event log format. In this video we explain how a CSV file can be converted to an event log.

  • Exploring event logs with the dotted chart

    After loading an event log into ProM it is important to apply the dotted chart to get initial process insights before process models are discovered.

  • Filtering event logs

    Before good quality process models can be discovered the event log data needs to be filtered to contain only completed cases for instance.