• Nirmal Patel

Let’s do Educational Process Mining!

Updated: Dec 5, 2021

Learn how we can use learning process data to find effective learning patterns.

The Origin: Business Process Mining

Process Mining keeps getting bigger every day. Its commercial success as a data mining method is apparent from the rise of unicorn companies like Celonis. Process Mining methods are built to discover end-to-end business process models from ERP data. If you have a timestamped event log of different business actions, you can discover a flowchart-style business process model from the data. Below is an example of mining a Fuzzy Process Model from the data of loan application processing:

In the process model above, you can see how a loan is processed, starting with the application registration and ending with sending a decision email. The model gives us a rough idea of how the application process works.

Using process models, you can discover the underlying process in the data, compare this process against an ideal process (conformance), and enhance or improve the process.
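To make the conformance idea a bit more tangible, here is a deliberately simplified sketch. It is not one of the established conformance checking algorithms; it only measures what share of consecutive steps in a trace is allowed by a reference model. The activity names, the `ALLOWED` set, and the `fitness` helper are all hypothetical.

```python
# Hypothetical reference model: the set of allowed directly-follows steps.
ALLOWED = {
    ("register", "check"),
    ("check", "decide"),
    ("decide", "send email"),
}

def fitness(trace, allowed=ALLOWED):
    """Share of consecutive steps in the trace that the reference model allows."""
    steps = list(zip(trace, trace[1:]))
    if not steps:
        return 1.0  # a trace with no steps cannot violate the model
    return sum(step in allowed for step in steps) / len(steps)

print(fitness(["register", "check", "decide", "send email"]))  # 1.0
print(fitness(["register", "decide", "send email"]))           # 0.5
```

The second trace skips the check step, so only one of its two steps matches the reference model.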

Process Mining has been used extensively in different business domains to understand and improve business processes. It is a promising technology.

So can we use Process Mining in education? The answer is yes!

Educational Process Mining (EPM)

EPM is starting to gain traction in the Educational Data Mining community (read this review by Bogarín et al. to find out more). More researchers are trying it out to see if EPM can help them generate new hypotheses and explore student data better.

Process Mining helped us win the 2019 NAEP Data Mining Challenge. We added process-related features to our predictive model, and they were repeatedly selected by the Genetic Feature Selection Algorithm.

You can use EPM to better understand how students behave and learn on online learning platforms. The data required for EPM is quite simple. You just need a CSV with 3 columns: User ID, Action, Timestamp. With just these 3, you can start making process models! Let’s see some examples of the process data.

Examples of student process data:

  • Student actions over time in an LMS, e.g. login, logout, video, test, etc.

  • Educational materials that students accessed over time. This can be at the topic, subtopic, or resource level.

  • Student actions during problem solving e.g. start, use calculator, use draw pad, eliminate answer, reset question, etc.

From the data listed above, we can make process models. Here is a concrete example of what curriculum-over-time data can look like:
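Since the data is just three columns, a minimal sketch of loading it could look like the following. The column names (`user_id`, `action`, `timestamp`), the sample rows, and the `load_traces` helper are all hypothetical and made up for illustration; your own export will likely use different names and timestamp formats.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample data: three columns, one row per student event.
SAMPLE_CSV = """user_id,action,timestamp
s1,Lesson 1,2021-09-01 10:00:00
s2,Lesson 1,2021-09-01 10:05:00
s1,Lesson 2,2021-09-02 09:30:00
s2,Lesson 3,2021-09-03 11:00:00
s1,Lesson 3,2021-09-04 14:15:00
"""

def load_traces(csv_text):
    """Group events by student and order each student's events by time."""
    events = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        when = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        events[row["user_id"]].append((when, row["action"]))
    # Sort each student's (timestamp, action) pairs, then keep only the actions.
    return {user: [action for _, action in sorted(rows)]
            for user, rows in events.items()}

traces = load_traces(SAMPLE_CSV)
print(traces)
```

Each student ends up with one ordered list of events (a trace), which is the form most process mining tools work with internally.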

Keep the dimensionality of the ‘event’ field low. If you have too many types of events, your process model will become complicated very quickly.
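One way to keep the number of event types down is to map fine-grained raw events onto a handful of coarse types before mining. The sketch below assumes hypothetical raw event names and a hypothetical `EVENT_GROUPS` mapping; merging consecutive repeats is an extra design choice that you may or may not want.

```python
# Hypothetical mapping from fine-grained raw events to a few coarse event types.
EVENT_GROUPS = {
    "video_play": "video",
    "video_pause": "video",
    "video_seek": "video",
    "quiz_open": "test",
    "quiz_submit": "test",
}

def coarsen(trace, groups=EVENT_GROUPS):
    """Map raw events to coarse types and merge consecutive repeats."""
    result = []
    for event in trace:
        label = groups.get(event, event)  # unmapped events keep their name
        if not result or result[-1] != label:
            result.append(label)
    return result

raw = ["login", "video_play", "video_pause", "video_play",
       "quiz_open", "quiz_submit", "logout"]
print(coarsen(raw))  # ['login', 'video', 'test', 'logout']
```

Here seven raw events with six distinct types collapse into four events with four types, which gives a much more readable model.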

Let us now see some examples of educational process models:

Example 1: LMS Usage

In this process model, we can see that after logging in, students do one of a few different actions, and then log out.

Example 2: Curriculum Usage

In the process model below, we can see how students go through 5 different lessons of a topic. They sometimes jump between the lessons, but in almost half of the cases, they go to the next lesson.
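Statements like "in almost half of the cases, they go to the next lesson" come from counting which event directly follows which. Below is a rough sketch of that counting, assuming each student's events are already ordered in time. The lesson labels, the sample traces, and the `transition_shares` helper are hypothetical, and real tools such as the Fuzzy Miner do more than this (for example, filtering rare paths).

```python
from collections import Counter, defaultdict

def transition_shares(traces):
    """Return {source: {target: share}} from directly-follows counts."""
    counts = defaultdict(Counter)
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(targets.values()) for dst, n in targets.items()}
        for src, targets in counts.items()
    }

# Hypothetical traces: one ordered list of lessons per student.
traces = [
    ["L1", "L2", "L3"],
    ["L1", "L3", "L2"],
    ["L1", "L2", "L1"],
    ["L2", "L3"],
]
shares = transition_shares(traces)
print(shares["L1"])  # share of students moving from L1 to each other lesson
```

In this toy data, two out of three moves away from L1 go to L2, so the edge from L1 to L2 would carry a share of about 0.67.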

Applications of Educational Process Mining

Now that we’ve seen some examples of EPM, let’s briefly discuss how we can apply EPM in real-world situations.

Improving online learning platforms

We can use EPM to find out how students and teachers use online learning platforms. By understanding the UI navigation process of the users, we can improve the user experience in a way that lowers the amount of unwanted user behavior. For example, if we find out that 10% of the users log out after a particular sequence of events, we can give users a different experience so that they remain on the platform.
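As a small illustration of the logout example, the sketch below counts which short sequences of events come right before a logout. The event names, the sample sessions, and the `sequences_before` helper are hypothetical; the window length of 2 is an arbitrary choice.

```python
from collections import Counter

def sequences_before(traces, target="logout", length=2):
    """Count the `length` events that immediately precede each `target` event."""
    counts = Counter()
    for trace in traces:
        for i, event in enumerate(trace):
            if event == target and i >= length:
                counts[tuple(trace[i - length:i])] += 1
    return counts

# Hypothetical session traces.
sessions = [
    ["login", "video", "test", "logout"],
    ["login", "test", "logout"],
    ["login", "video", "test", "logout"],
    ["login", "video", "logout"],
]
before_logout = sequences_before(sessions)
print(before_logout.most_common(1))  # [(('video', 'test'), 2)]
```

Dividing these counts by the number of sessions gives the kind of percentage mentioned above.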

Finding effective learning processes

As we saw in Example 2 above, we can use curriculum data to find out how students go through the curriculum over time. If we can separate out students by their performance or growth measures, we can see which learning processes are followed by high-growth and high-performing students. If these processes make sense to the curriculum designers and/or learning scientists, then we can recommend them to other students. We can start giving personalized recommendations to the low-performing students who are drifting far away from the ideal processes.
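A simple way to start comparing groups is to look at how often each transition occurs in one group versus the other. The sketch below is only one possible approach, with hypothetical lesson labels, made-up traces, and hypothetical helper names; it says nothing about whether a difference is statistically meaningful.

```python
from collections import Counter

def relation_frequencies(traces):
    """Relative frequency of each directly-follows pair across all traces."""
    counts = Counter(pair for t in traces for pair in zip(t, t[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def compare_groups(high, low):
    """Difference in relative frequency (high minus low) per transition."""
    f_high, f_low = relation_frequencies(high), relation_frequencies(low)
    pairs = set(f_high) | set(f_low)
    return {p: f_high.get(p, 0.0) - f_low.get(p, 0.0) for p in pairs}

# Hypothetical traces for two groups of students.
high_growth = [["L1", "L2", "L3"], ["L1", "L2", "L3"]]
low_growth = [["L1", "L3", "L2"], ["L1", "L2", "L3"]]

diff = compare_groups(high_growth, low_growth)
for pair, delta in sorted(diff.items(), key=lambda item: -item[1]):
    print(pair, round(delta, 2))
```

Transitions with large positive values are more typical of the high-growth group, and those are the ones worth showing to curriculum designers first.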

These are just some of the many possible applications.

Resources & Tools

Please visit www.processmining.org to get more resources on Process Mining. To learn PM, the best resource is the Coursera course Process Mining: Data Science in Action.

Below is a list of process mining tools that you can use to get started with EPM immediately:

  • ProM — The de-facto open source process mining platform, complex to use, contains most of the latest algorithms

  • Disco — Commercial tool that is easy to use, has academic license

  • bupaR — R package for Process Mining

  • fuzzymineR — R package to mine Fuzzy Process Models

  • PM4Py — Python library for Process Mining


In this article, we introduced Educational Process Mining, showed some examples of process data and models, and discussed some particular use cases. We hope that this summary is helpful to you and that it inspires you to use Process Mining on the student data that you or your organization has.

If you need any further help, please reach out!

