Converting a CSV file to an event log

Most data is not recorded in event log format. In this video we explain how a CSV file can be converted to an event log.
9.5
Hi, and welcome back. In this lecture, I’ll show you how to convert a CSV file, or flat tabular file, to an event log. So we’re still in the extraction phase. Since event logs are usually not ready-made, you have to convert other files, for instance CSV files, to an event log. And the easiest way to do this is through a CSV file. So in a previous lecture, I showed you this table of the public transport event data. And I explained that there are several columns in such a table. So you have a trace column, an event name column, a timestamp column, and additional attributes.
49.3
So it’s important that you recognize the trace, event name, and timestamp columns in your own dataset beforehand, because we will have to indicate them later. So in a second, I will open ProM, and I will load this CSV file, the artificial loan process file. And these are the first couple of rows that are present in that file. So as you see, it is simpler than the previous example. It contains only the three mandatory columns: the case or trace ID, the event name, and the timestamp. So for instance, here, you see several events. And for trace 0, there are six events with different timestamps in 2013.
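As a minimal sketch of that three-column layout, the Python snippet below reads a tiny CSV and groups its rows into traces. The column names (`Case`, `Event`, `Complete Time`) and the sample rows are made up for illustration; they are not taken from the course file.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample data: one row per event, three mandatory columns.
SAMPLE_CSV = """Case,Event,Complete Time
0,Register application,2013-01-01 09:00:00
0,Check documents,2013-01-02 10:30:00
1,Register application,2013-01-01 11:15:00
0,Approve loan,2013-01-05 14:00:00
"""

def read_traces(text):
    """Group events by case ID and sort each trace by timestamp."""
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        timestamp = datetime.strptime(row["Complete Time"], "%Y-%m-%d %H:%M:%S")
        traces[row["Case"]].append((timestamp, row["Event"]))
    for events in traces.values():
        events.sort()
    return dict(traces)

traces = read_traces(SAMPLE_CSV)
for case_id, events in traces.items():
    print(case_id, [name for _, name in events])
```

Each key of `traces` is one case, and its list holds that case's events in time order, which is the shape an event log needs.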
91
So let’s look at ProM and see how we can convert a CSV file to an event log. So with ProM open, let’s import the CSV file. So let’s select the loan process CSV file and press Open. When we press the Action button, there’s only one plugin available, which is the conversion to an XES event log. When we click this plugin, the first wizard screen shows you the data that will be imported, or at least the first couple of rows, and also how the file should be interpreted: which character is the column separator, and how quotes are indicated. All these are automatically detected, but sometimes the detection goes wrong.
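Outside ProM, the same kind of separator detection can be sketched with Python's standard `csv.Sniffer`. This is only an illustration of the idea, not how ProM implements it, and the semicolon-separated sample is invented. The last line shows what a wrongly chosen separator does to the column count.

```python
import csv
import io

# Hypothetical sample that uses a semicolon as the column separator.
SAMPLE = (
    "Case;Event;Complete Time\n"
    "0;Register application;2013-01-01T09:00:00\n"
    "0;Check documents;2013-01-02T10:30:00\n"
    "1;Register application;2013-01-01T11:15:00\n"
)

# Let the standard library guess the separator from a few candidates.
dialect = csv.Sniffer().sniff(SAMPLE, delimiters=";,\t|")
detected = dialect.delimiter

def column_count(text, delimiter):
    """Return the number of columns in the header row."""
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    return len(header)

print("detected separator:", repr(detected))
print("columns with detected separator:", column_count(SAMPLE, detected))
print("columns with a wrong separator:", column_count(SAMPLE, ","))
```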
137.6
So, for instance, if the column separator is not correctly detected, you get one column with all the data. So let’s put this back and press Next to go to the next configuration screen. So here, we see four areas. In the top left, we can specify the case column. Each row records a particular event, and we have to be able to relate it to a case. So in this example, the case column indicates to which case the event belongs. However, we can also remove this and, in this dropdown, you see all the columns in the dataset. And when we select it again and press the plus sign, we add it back.
177.9
And if you add multiple columns, each unique combination of their values identifies, in this case, the case or, here, the activity name. So you can also tell the algorithm how to come up with the activity name for each event. And in this example, it’s the event column. Then, at the bottom, we have the start and the completion times of the event. Usually, you only have completion times. So let’s fill this in. And that’s actually in the complete time column. And it automatically detects what the timestamp format is. Usually, this is correct. But if you see it’s not correct, then you can adjust the format. There’s also an expert configuration. But, in this course, we won’t cover that.
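The column mapping described here can be sketched in Python as follows. The rows, the column names, and the choice to join values with `|` are assumptions made for this example; ProM's own way of combining columns may differ. The sketch also shows why the timestamp format matters: the same text parses to a different date under a month-first format.

```python
from datetime import datetime

# Hypothetical rows in which no single column identifies the case.
rows = [
    {"Branch": "North", "Application": "17", "Event": "Register", "Complete Time": "05/01/2013 14:00"},
    {"Branch": "South", "Application": "17", "Event": "Register", "Complete Time": "06/01/2013 09:30"},
    {"Branch": "North", "Application": "17", "Event": "Approve", "Complete Time": "07/01/2013 16:45"},
]

CASE_COLUMNS = ["Branch", "Application"]  # assumed case columns
ACTIVITY_COLUMNS = ["Event"]              # assumed activity column
TIME_FORMAT = "%d/%m/%Y %H:%M"            # day first; adjust if detection is wrong

def to_event(row):
    """Map one CSV row to (case ID, activity name, completion time)."""
    case_id = "|".join(row[c] for c in CASE_COLUMNS)
    activity = "|".join(row[c] for c in ACTIVITY_COLUMNS)
    completed = datetime.strptime(row["Complete Time"], TIME_FORMAT)
    return case_id, activity, completed

events = [to_event(r) for r in rows]
case_ids = sorted({case_id for case_id, _, _ in events})
print(case_ids)

# The same text read with a month-first format gives a different date.
misread = datetime.strptime("05/01/2013 14:00", "%m/%d/%Y %H:%M")
print(events[0][2].date(), "versus", misread.date())
```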
222.8
Then, in the final screen, we have four more options. The first one is how errors are handled. You could choose to omit the trace, the event, or the attribute from the event log and go on to the next object to parse. However, I’ve always preferred to stop on error, so that I can inspect the error, fix it, and do the conversion again. If a trace, event, or attribute is dropped, you often don’t notice it, and you’re missing data. The other three options are usually OK as they are. But you might want to inspect them if you encounter any issues. If I press Finish, the CSV file is converted to XES.
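The trade-off between omitting on error and stopping on error can be sketched in a few lines of Python. The policy names `stop` and `omit_event`, the `ConversionError` class, and the sample rows are hypothetical and only mirror the options described above.

```python
from datetime import datetime

class ConversionError(Exception):
    """Raised when a row cannot be converted and the policy is 'stop'."""

def convert(rows, on_error="stop"):
    """Convert rows to events; on_error is 'stop' or 'omit_event'."""
    events, skipped = [], 0
    for line_no, (case_id, name, stamp) in enumerate(rows, start=1):
        try:
            completed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError as exc:
            if on_error == "stop":
                raise ConversionError(f"row {line_no}: {exc}") from exc
            skipped += 1  # the event silently disappears from the log
            continue
        events.append((case_id, name, completed))
    return events, skipped

rows = [
    ("0", "Register application", "2013-01-01 09:00:00"),
    ("0", "Check documents", "02-01-2013 10:30"),  # malformed timestamp
    ("0", "Approve loan", "2013-01-05 14:00:00"),
]

events, skipped = convert(rows, on_error="omit_event")
print(len(events), "events kept,", skipped, "dropped")

try:
    convert(rows, on_error="stop")
except ConversionError as err:
    print("stopped:", err)
```

With the omit policy the trace quietly loses its middle event, whereas the stop policy points at the offending row so it can be fixed before converting again.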
265.3
And it’s shown in the XES event log dialog. And here you can inspect all the traces and events that you just imported. So this shows you how you can import a single-table CSV file, where each row records an event, into ProM. So now you know how you can get an event log out of a flat CSV file using ProM Lite. This concludes the extraction phase of this course. And in the next lectures, we will look at processing and analysis techniques. Hope to see you again soon.
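For readers curious about what such a converted log contains, the following Python sketch builds a minimal XES-style XML document using the standard `concept:name` and `time:timestamp` attribute keys. The traces are invented, and the output omits the extension and classifier declarations that complete XES files typically include, so treat it as an illustration rather than as the output of ProM.

```python
import xml.etree.ElementTree as ET

# Hypothetical traces: case ID -> list of (activity name, ISO 8601 timestamp).
traces = {
    "0": [("Register application", "2013-01-01T09:00:00"),
          ("Approve loan", "2013-01-05T14:00:00")],
    "1": [("Register application", "2013-01-01T11:15:00")],
}

def to_xes(traces):
    """Build a minimal XES-style XML tree from the traces."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, events in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        for name, stamp in events:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", {"key": "concept:name", "value": name})
            ET.SubElement(event, "date", {"key": "time:timestamp", "value": stamp})
    return log

log = to_xes(traces)
xml_text = ET.tostring(log, encoding="unicode")
print(xml_text)
```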

Updates

  • In the video, the file “Artificial – Insurance Process.csv” is mentioned. This should be “Artificial – Loan Process.csv”.
This article is from the free online

Introduction to Process Mining with ProM

Created by
FutureLearn - Learning For Life
