Joos Buijs

Assistant professor at Eindhoven University of Technology at the Data Science Center (DSC/e) and research group of prof. dr. ir. Wil van der Aalst on process mining.

Location Eindhoven, the Netherlands

Activity

  • Hi Hiafeng, great to read. This will be covered in week 3 :)

  • Hi Juan,

    I fully agree: data preparation is key and influences the quality of your results in the subsequent phases. It also requires an iterative process, where you have to try out your analysis and go back to your filtering.

  • Hi Cameron, I believe I explain this in the 'event logs' lecture (https://www.futurelearn.com/courses/process-mining/8/steps/239656).

    Is there something missing?

  • Hi Ivan,

    Great to hear, more details can be found in the scientific articles, for instance at http://wwwis.win.tue.nl/~wvdaalst/publications/publications.html

  • Hi Marc,

    Great to read. Are you aware of the other process mining course here on FutureLearn, specifically oriented at healthcare?
    Please check https://www.futurelearn.com/courses/process-mining-healthcare/3/

    Happy mining!

  • Hi Andreas,

    Great to read you plan to continue and even start internships on process mining!

    There are two sources of information: the scientific papers accompanying most plug-ins, and some documentation material available at https://svn.win.tue.nl/trac/prom/browser/Documentation/
    Neither is perfect, so in case you cannot find it,...

  • Hi Marc,

    In general this should not be an issue for the course.
    The message is also slightly incorrect, as ProM lite does not come with a package manager. In general you can do two things: right-click the ProM lite icon and select 'Run as administrator'. Alternatively you can go to the folder where ProM lite is installed (investigate the properties of the...

  • Hi Ton,

    In short: no, the tool needs to be downloaded. But it is available for all platforms and can be un-installed without problems.

    Slightly longer answer:
    A couple of years back we experimented with a cloud version of ProM but there were many issues. For example, some calculations are quite expensive, hence requiring a powerful server to be provided...

  • Hi Juan,

    Good question!

    In general the hospital and IT are not that keen on changing the information system for such an analysis (especially if it is an early pilot/case study). However, usually there is enough data available to start working from. From there on you can show what process mining and data science can do, and you can pinpoint where data...

  • Hi everyone!

    Thanks for your nice introductions. It seems we have a mixed crowd, but everyone seems equally interested. I hope you enjoy the course. I'll try to monitor the comments regularly.

    Happy mining and learning!!!

    - Joos

  • Hi Roheet, great to read you managed to finish the course.

    I believe there are many possible PhD topics in this domain. I think this course, but also related literature and research publications give you enough input to find an interesting challenge.
    Tip: try to find something where you feel you can contribute and which you find interesting to...

  • Great to read Ray!

  • Hi Ray,

    Thanks for your comment. I understand it is a lot of information.
    Please note that we also run another process mining course which explains how ProM can be used for process mining. It explains the concepts in a bit more detail. This might help.

  • Good to read Nancy!

  • Hi Yang Qiu,

    Good comment, and I agree. Doing a 'full fledged end-to-end' conformance check might be too much. However, you can custom build your 'rules' or even small Petri nets to check (e.g. how 3 to 5 activities relate).
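
    As a rough illustration of such a custom rule, here is a minimal sketch in plain Python (not a ProM plug-in). The rule, the activity names and the tiny log are made up for the example: every 'Register' should eventually be followed by a 'Discharge'.

```python
def violates_response_rule(trace, trigger, response):
    """Return True if some `trigger` in the trace is never followed by `response`."""
    pending = False
    for activity in trace:
        if activity == trigger:
            pending = True    # a trigger is waiting for its response
        elif activity == response:
            pending = False   # the response resolves all earlier triggers
    return pending


# Hypothetical log: case identifier -> ordered list of activity names.
log = {
    "case1": ["Register", "Triage", "Treat", "Discharge"],
    "case2": ["Register", "Triage", "Treat"],
    "case3": ["Triage", "Discharge"],
}

violations = sorted(
    case for case, trace in log.items()
    if violates_response_rule(trace, "Register", "Discharge")
)
print(violations)
```

    A handful of such small rules is often easier to maintain than one complete model, at the price of not checking the full case flow.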

  • Sounds like a perfect application of process mining Andreas!

  • Hi Mark,

    Great to hear.
    And I agree, although I tried to explain how to read the results (what is a Petri net, what do you see in the conformance checking results). However, as with car driving lessons: the real learning starts once you have obtained the license and really start driving!

    Great to hear you're planning to apply this in healthcare. We have...

  • Hi Jerry: I hope you found the event logs in the meantime in the next step :)

    Hi Andreas: you're welcome! Happy mining

  • Hi Jan Willem,

    Thanks for your comment.

    The two big frames in the log visualizer show you how 'large' the traces in the event log are (number of events).

    We do not show a Petri net as there are many possible ways to obtain this (Inductive miner, heuristics miner, ILP miner, genetic miner, Alpha miner, etc.). We therefore rely on the other plug-ins...

  • Hi Jan Willem,

    Good question. This is a recurring challenge, also for us: how to extract the data (and which data, where to find it, how to transform it, etc.). It is therefore an open challenge, and also one without a single answer.

    I agree that compliance checking is one of the key features of process mining.

  • Hi Michael,

    Sorry for the late reply, but you ask a good question!

    'de-facto' means in reality, and thus, learned from the data.
    Process models are usually represented by Petri nets, sometimes also heuristic nets, Fuzzy models, process trees or BPMN models.
    Data rules can be learned as well, for instance decision trees.

    Hope this helps!

  • Hi Hans,

    Sorry for the late reply.

    The 'Handover of work' social network plug-in usually provides a nice result. Are you sure your resource names are stored in the 'org:resource' event attribute? The plug-in is kind of picky about this.
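
    To give an idea of what a handover-of-work count looks like, here is a simplified sketch in plain Python. It is not the plug-in's actual implementation: it only counts direct handovers between consecutive events of a trace, and the traces and resource names are invented. It does show why each event needs its resource under the 'org:resource' key.

```python
from collections import Counter


def handover_counts(traces):
    """Count how often work passes directly from one resource to another."""
    counts = Counter()
    for events in traces:
        resources = [event["org:resource"] for event in events]
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # ignore consecutive events by the same resource
                counts[(giver, receiver)] += 1
    return counts


# Hypothetical traces; every event carries its resource in 'org:resource'.
traces = [
    [{"org:resource": "Ann"}, {"org:resource": "Bob"}, {"org:resource": "Bob"}],
    [{"org:resource": "Ann"}, {"org:resource": "Bob"}, {"org:resource": "Cas"}],
]

handovers = handover_counts(traces)
print(handovers)
```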

  • Hi Cristiano,

    Sorry for the late reply, but hope this is useful still.
    Unfortunately there is no option to define trace attributes.
    You can however add your 'trace' attribute to each event (hence duplicate it). This allows you to filter the traces etc. easily.

    Hope this answers your question!
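
    If your data is in a CSV file, the duplication could be done before the import, for instance along these lines. This is a minimal sketch in plain Python; the column names ('case_id', 'department') and the values are assumptions for the example.

```python
import csv
import io

# Hypothetical event data: one row per event.
events_csv = """case_id,activity,timestamp
1,Register,2017-01-01T09:00
1,Treat,2017-01-01T10:00
2,Register,2017-01-02T09:00
"""

# Hypothetical case-level ('trace') attributes, keyed by case identifier.
case_attributes = {
    "1": {"department": "Cardiology"},
    "2": {"department": "Oncology"},
}


def add_case_attributes(rows, attributes):
    """Copy the attributes of a case onto each of its events."""
    enriched = []
    for row in rows:
        merged = dict(row)
        merged.update(attributes.get(row["case_id"], {}))
        enriched.append(merged)
    return enriched


rows = list(csv.DictReader(io.StringIO(events_csv)))
enriched = add_case_attributes(rows, case_attributes)
print(enriched[0])
```

    Every event of case 1 now carries 'department', so filtering on that column keeps or removes the whole trace.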

  • Hi Michael, thanks for your comment, I have updated the link.

  • Great to read Karl!

  • Great to read. You might also consider modelling the expected process as a Petri net and then replay the data on top of it (wait until you encounter the conformance checking lecture).

  • Great to read Karl!

    Keep me/us updated on your progress.

  • Let's keep it non-political and non-cultural, research is usually a good topic to disagree on :) Although I'm also content to have no disagreements :D

  • Thank you Giacomo!

    We are indeed aware of the 'less than ideal' user interface of ProM, but this is not high on our list of priorities. (we prefer to work on new techniques)

  • Great to hear that it works, and please let me know if I (or your fellow students) can help with something else.

  • Looking forward to your insights Giacomo!

  • Hi Giacomo,

    Good point, two answers:
    First of all, you do not always need the very specific notion of a 'trace', as long as you know what a trace is. If you have a table with order, patients, user sessions, or something else, this represents your trace, but does not necessarily need to be labelled as such (similarly for events).

    The second answer is...

  • Hi Giacomo,

    Nice to read you don't agree :)

    I recalled an article on the topic:
    https://fluxicon.com/blog/2011/02/how-process-mining-compares-to-data-mining/

    I think another way to view process mining is that it builds on/uses/extends data mining with the process model notion/view.
    I think this explanation satisfies us both (and otherwise we can...

  • Hi Kumaresan,

    Please click on the link in the description of the video above to download your demo data files.

  • Hi Michael,

    The Dotted chart visualization is the same as running the plug-in.

    However, I still have the plug-in in ProM lite 1.2, so maybe you have some input/output filter, or text filter set? Try clicking 'Reset' once or twice and make sure the 5 circles are grey and the text field next to them is empty.

  • Hi Hans,

    No hablo Espaniol :)

  • Hi Ivor,

    Great to hear that you like the inductive miner and its animation features.

    I'm not exactly sure what you meant with the 'ping one token through'. Do you mean you want to move one token/case one step at a time? You could do this partly by slowing down the animation.

    You also have options to change the colors of tokens based on data attributes...

  • Hi Ivor,

    Great to read that you got it working.
    Where ProM puts the packages depends on your OS, but they are usually somewhere in your user folder, or in the folder where ProM is installed.

  • Recently the heuristics miner has been replaced by the 'Interactive data aware Heuristic miner', which is slightly more complex/feature rich.

    The results of this miner differ from the original heuristics miner, so feel free to skip the quiz for the heuristics miner.

    I have added a notice to the video and will think how to solve this for the next run.

  • Hi Ivor and Alan,

    Thanks for noting, I have updated the video description of step 1.5 to mention the file name.

  • Hi Yewande,

    Thanks for posting also the solution. I only saw this post after reading your other post, but glad to read that it worked out.

    It seems you took the right approach.

  • Hi Hugo,

    Please send an e-mail to the address mentioned on the forum. I need to enter your e-mail address, which I cannot get via FutureLearn. Sorry for the extra effort.

  • Hi Yewande,

    Thank you for your question, and apologies for the confusion.

    However, both csvA and csvB can be read by ProM.
    Did you try copying the table of csvA from the sheets to Excel, saving it as a csv file and importing it in ProM? Of course there are only a few rows/cases/events, but the import should succeed.

    Let me know if you encounter any issue!

  • Hi Hugo,

    As far as I know you cannot filter already in the import wizard.
    What you could do however is in the last step, the top left dropdown, set it to skip event on error (or similar). This will ignore the event if there is a null value.

    Hope this helps!

  • Hi Giacomo,

    Good observation.

    However, data mining and process mining are two different disciplines, with some overlap, but none is included in the other.

    Data mining involves analysing data to find clusters/groups, rules, relations, etc.
    Process mining also uses data, but to specifically analyse processes. The challenges here are quite...

  • Thanks!

  • Great to hear! Feel free to get in touch directly if you have any further questions.

  • Hi Tahar,

    You might want to check out the other process mining course here on FutureLearn. It starts in a week and shows some other examples of applications of process mining.

  • Hi Yang,

    Good point, and indeed not a trivial one.

    I see two approaches/solutions to the issue you raise:
    1. just start and show them some results. They might/will be wrong and will trigger the clinicians to react. However, you will also show them some things they didn't know, some new insights.
    2. Ask them the right questions, that you can answer...

  • Hi Rene,

    Creating process visualizations is indeed a strong point of process mining, and is also present in most commercial tools.

    Although I agree that some conformance checks can be performed without process mining, I believe that process mining allows you to verify the whole case flow against the process, instead of 'rule by rule' as you would do with other...

  • Great to hear Tigran :)

    Association rule learning can be very valuable; the key here is to extract the correct process features to be used. E.g. activity count per trace (A=1, B=2, etc.) or direct succession counts (A->B=1, B->A=0), but this explodes quickly. Trace duration, resources involved, etc. can all be used.
    Two papers & ProM plug-ins that might be of...
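
    The two kinds of features mentioned above could be extracted roughly as follows. This is a minimal sketch in plain Python, not one of the ProM plug-ins; the feature naming ('count(A)', 'A->B') is just a convention chosen for the example.

```python
from collections import Counter


def trace_features(trace):
    """Activity counts and direct succession counts for a single trace."""
    features = Counter()
    for activity in trace:
        features[f"count({activity})"] += 1
    for first, second in zip(trace, trace[1:]):
        features[f"{first}->{second}"] += 1
    return dict(features)


features = trace_features(["A", "B", "B", "A"])
print(features)
```

    With n distinct activities there are up to n*n direct succession features, which is why this explodes quickly.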

  • Hi John. I plan for multiple sessions per year, but the previous session closes some weeks before the other one opens. Since the next run of this course is Nov 13 it might very well be that this course closes soon. (unfortunately I cannot see or control this directly)

  • I would be open to this, but don't have any concrete plans to travel towards London. I do know some other LA people in London / UK, so if you e-mail me I can connect you if you want.

  • You're welcome! This course will remain open/accessible for some weeks so feel free to revisit.

  • Hi Jacques,

    Thanks for updating your post with the solution.
    We are aware of the issue that downloading packages from outside our university could take a while and we are working on finding the cause and hopefully also a solution.

  • Hi Tigran,

    ProM is not (yet) smart enough to see that 7 should come before 10. For most results this does not matter. For instance, in the dotted chart you can sort traces on their first event.

    Where is this unnatural sorting bothering you?
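
    For those curious about the difference: plain text sorting compares character by character, so '10' comes before '7'. Outside ProM, a 'natural' ordering can be sketched as below in plain Python (the case identifiers are made up; zero-padding the numbers in your data before importing is another option).

```python
import re


def natural_key(text):
    """Split text into digit and non-digit parts so numbers compare as numbers."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", text)
    ]


case_ids = ["case 10", "case 7", "case 1"]

text_order = sorted(case_ids)                      # character-by-character order
natural_order = sorted(case_ids, key=natural_key)  # numeric-aware order
print(text_order)
print(natural_order)
```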

  • Great to hear Connor, hope you like the course.

  • Hi Karoline,

    Interesting read, thanks for the link! And of course also thanks for the compliments.

    Although the pattern inspector is useful, it shows individual execution sequences and not the higher level patterns like "if activity A is followed by B then ...". We are currently doing research into this direction so feel free to contact me if you are...

  • Hi Noel,

    I see your point.
    I feel that the more traditionally oriented organisations use it mainly as an auditing tool, and possibly to improve processes.
    More innovative businesses start from the process improvement aspect.

  • Hi David,

    We are doing research in exactly this: relating process characteristics (in your example the delay between diagnosis and treatment) and results (survival).

    If you're interested to know more about our tool, please contact me via e-mail (find-able through a web search)

  • Hi Runumi,

    Could you please clarify? How would you like us to help you? It will be hard to really help you with the analysis remotely, but I hope this course helps you to do it yourself!

  • Hi Runumi,

    We have a large collection of artificial and real life event logs available at
    https://data.4tu.nl/repository/collection:event_logs

    There are several health related event logs, for...

  • Hi Carolina,

    You might be able to connect to Jorge Munoz and Marcos Sepulveda (both present one of the case studies ahead) for more 'local' support :)

  • Hi Angel,

    I totally understand, but this is out of my control.
    Please report this in the post-course survey such that your voice is heard by FutureLearn!

  • Hi Angel,

    Great to read that you are collaborating well with the clinicians, keep up the good work!

  • Hi Angel,

    In my opinion, you can help the professionals by automating the 'boring' parts of their work. See how you can help them become more efficient.

    And, as Noel mentions, you should prevent the 'me versus them' feeling, find a common goal, a win-win situation.
    You will be amazed when you start analysing data and show the results to the...

  • Hi Noel,

    Modifying the program is usually not necessary; finding the right place to extract the data from is key.
    Collecting data manually is also more error prone than embedding the data collection in the IT systems used in the daily practice. (I've seen this before)

  • Hi Rene,

    I'm afraid that in healthcare there are actually too many standards, hence there is no standard. Reducing the number of standards would help.
    Your proposal of governments investing in global health systems could work, but on the other hand governments are not as flexible and innovative, and are usually not good at managing (IT) projects. I feel...

  • Hi John,

    Welcome to the course!
    I'll drop by every now and then to check the new comments.

    Regarding learning analytics: this is also one of my research topics, and I know there are some papers (even a literature overview) on process mining in learning analytics / education mining.

  • I highly recommend ProM lite 1.2 as your results might differ if you use ProM 6.7 (or 6.6, 6.5, ...).

  • Hi Victoria,

    Sorry to hear you had a 'Friday the 13th experience' with ProM.
    I have forwarded your post to our local Mac OS expert, hopefully he has a solution. I'll get back to you.

  • Just wanted to drop in to say welcome to all that join this course! Welcome and if you have any questions, please post a comment!

  • Hi Samuel,

    Good question.

    By default the dotted chart shows the traces (horizontal lines of dots) as they are found in the file, which is usually not sorted.
    With the option you describe you sort the traces such that the trace with the first observed event comes first, i.e., you order the traces on when they start. This usually allows you to analyse...

  • Hi Meriem,

    Welcome to the course.

    The main difference is that data mining looks at 'any tabular data' and aims to find relationships between different records (very, very short and generalizing description of data mining!)

    Process mining specifically focuses on processes and analyses the data traces left by cases going through the process. It is...

  • Maybe in the Coursera course on process mining, by Wil van der Aalst, you will find more details and additional techniques to use.

  • Usually the best way is to extract a 'CSV' file with at least a column with the case identifier, activity name and timestamp, and then each row is an event. See the lecture on event logs.
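
    A minimal sketch of such a file and of how its rows group into traces, in plain Python. The column names and values are invented; any names work as long as you can point out the case, activity and timestamp columns during import.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV event log: each row is one event.
csv_text = """case_id,activity,timestamp
p1,Registration,2017-03-01T09:00:00
p2,Registration,2017-03-01T09:05:00
p1,Consultation,2017-03-01T09:30:00
p1,Discharge,2017-03-01T11:00:00
p2,Consultation,2017-03-01T10:15:00
"""


def traces_from_csv(text):
    """Group events per case and order them by timestamp."""
    events_per_case = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        events_per_case[row["case_id"]].append((row["timestamp"], row["activity"]))
    # ISO 8601 timestamps in the same format sort correctly as plain text.
    return {
        case: [activity for _, activity in sorted(events)]
        for case, events in events_per_case.items()
    }


traces = traces_from_csv(csv_text)
print(traces)
```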

  • Hi Jeff,

    No worries, there are still people around :)

    This is indeed a challenge in healthcare: standardize processes/protocols while still ensuring that the majority of patients fall within the process/protocol.
    I don't currently see that the role of physicians is being "dumbed down", they still have flexibility, and require up-to-date knowledge to...

  • Welcome Mostafa, I hope this course helps in upgrading the department. Keep us updated.

  • Hi Rene,

    This nicely fits with some of the research that we do as a group, which we labelled 'customer journey'. If you want to know more, have a look at www.tue.nl/dsce/rp/cj

  • Hi Rene,

    I share your concern that experts won't come up with the right questions. They don't know what they can ask. My recommendation therefore is to do it iteratively: try to get some (basic) questions from them, extract data, perform analysis, present the results, and then they will certainly have more questions, then repeat. This usually works and at...

  • Thank you

  • Thank you

  • Hi Marco,

    If you want to do this I suggest looking into CPN Tools. This is a process modelling tool supporting colored Petri nets (i.e. Petri nets with data), but also simulation into event logs/data. Very powerful tool but maybe a bit complex to learn.

    Other options are:
    https://github.com/tjouck/PTandLogGenerator
    http://plg.processmining.it/

    Hope...

  • Hi Vassily,

    I'm not a Linux expert, but on Windows it matters whether you run ProM in 32-bit or 64-bit mode and which Java version (32-bit or 64-bit) is found. Could this be the case for you as well? E.g. check whether the mode you run ProM in corresponds to the Java bit version.

    Hope this helps!

  • Hi Jan Maarten,

    Technically your point is correct. However, Petri nets do not record data by default. So you could also interpret it as a case is waiting (in the place before accept/decline), and the user is given these 2 activities to execute, hence making the decision. Or, even the system could automatically decide which of these two to enable.

    I...

  • Hi Atilla,

    This is the 'MOOC effect': many people subscribe, most visit the course, of those most start, but people start to 'drop off' as the course progresses...
    But great to read you made it till here!

  • Hi Enrique, Jose and Abhishek,

    You're right, because after Igor's comment I fixed the question :)
    So I'm afraid I still owe Igor a beer :D

  • Hi Luuk,

    Great to read!

    Would you be OK if we quote part of your comment (with attribution) to promote the other runs of this course?

  • Hi Atilla,

    Great to read!

    Would you be OK if we quote your comment (with attribution) to promote the other runs of this course?

  • Hi Brian, just checking in to make sure that you realize this is not mandatory material for this course, but more extra material in case you need it to solve an issue etc.

  • Hi Winifred,

    Thanks for your comments, and for subscribing to this course.

    I understand that the content can be hard if you have little experience in related fields. This course is rather specialized, so don't feel bad if you decide to withdraw! Not every topic is for everyone. I would for instance struggle with a water scarcity course :)

  • Hi Srinivas,

    Thanks for your post, and sorry to hear you're experiencing issues.
    You could indeed try to run it as administrator.

    You could also try to run it and then do not interact with your computer. The message will then pop-up and stay on top. The problem is that the message sometimes gets hidden by other programs.

    Hope this helps!!!

  • Wow, you're right! And this after 1000 students subscribed to this course! I think I owe you a beer :)

  • Hi Sanaz,

    In general there are 3 versions of ProM:
    lite: ProM with only a small set of the most-used and better tested plug-ins.
    6.x: yearly ProM releases such that we can cite ProM plug-ins in papers (e.g. the plug-in used in this paper is present in ProM 6.4)
    developer/nightly build: a ProM version where the packages are updated very frequently, i.e....

  • Hi Wessel,

    thanks for your question, the photo really helps.

    'Black squares' are what we call 'silent transitions'. In other words, think of them as normal activities, but these don't leave a trace or need a human to be executed, e.g. they don't represent work but only move the tokens around the Petri net.
    Consider for instance the silent transition on...

  • Hi Brian,

    Thanks for your question.

    Careflow mining indicates that process mining is applied on the care flow of a patient. This is actually what we've been doing for most of this course. However, consider that process mining can be applied on different aspects or processes in healthcare, for instance financial processes etc.

  • Hi Wessel,

    There are different kinds of loops.
    For instance, there could be a small loop which allows only one activity to be repeated. This does not really add much behavior (although in theory this one activity can be repeated infinitely often).
    Another extreme is when all activities are in a choice, with a loop around it. This means that at the...

  • Hi Brian,

    I can definitely see the first steps of your approach. Depending on how good the suggestions/recommendations are perceived by the users, they might enable 'auto pilot' mode.
    However, I believe the analogy with Tesla's assisted driving holds: The driver is still in control and responsible, just heavily supported by the smart auto pilot/assist...

  • Hi Atilla,

    Thanks for your comment.

    The concept:name is indeed the unique identifier of the case/trace, often used to be able to look up the original data in the original IT system.

    In the table I have shown a small sample of an event log, which contains several cases/traces, and therefore several IDs. At the same time, each trace/case has several...