https://www.dagstuhl.de/13481

# Unleashing Operational Process Mining

## Organizers

Rafael Accorsi (Universität Freiburg, DE)
Ernesto Damiani (Università degli Studi di Milano – Crema, IT)
Wil van der Aalst (TU Eindhoven, NL)

## Summary

Society shifted from being predominantly "analog" to "digital" in just a few years. This has had an incredible impact on the way we do business and communicate. Gartner uses the phrase "The Nexus of Forces" to refer to the convergence and mutual reinforcement of four interdependent trends: social, mobile, cloud, and information. The term "Big Data" is often used to refer to the incredible growth of data in recent years. However, the ultimate goal is not to collect more data, but to turn data into real value. This means that data should be used to improve existing products, processes and services, or enable new ones.

Event data are the most important source of information. Events may take place inside a machine (e.g., an X-ray machine or baggage handling system), inside an enterprise information system (e.g., an order placed by a customer), inside a hospital (e.g., the analysis of a blood sample), inside a social network (e.g., exchanging e-mails or Twitter messages), inside a transportation system (e.g., checking in, buying a ticket, or passing through a toll booth), etc.

Process mining aims to discover, monitor and improve real processes by extracting knowledge from event logs readily available in today's information systems. The starting point for process mining is an event log. Each event in such a log refers to an activity (i.e., a well-defined step in some process) and is related to a particular case (i.e., a process instance). The events belonging to a case are *ordered* and can be seen as one "run" of the process. Event logs may store additional information about events. In fact, whenever possible, process mining techniques use extra information such as the resource (i.e., person or device) executing or initiating the activity, the timestamp of the event, or data elements recorded with the event (e.g., the size of an order).
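The event log structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the format of any particular tool: the `Event` class, its field names, the `traces_by_case` helper, and the sample data are all assumptions made for this example. It groups events by case and orders them by timestamp, giving one "run" of the process per case.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One event in a log (hypothetical field names, chosen for illustration)."""
    case_id: str         # process instance the event belongs to
    activity: str        # well-defined step in the process
    timestamp: datetime  # when the event occurred
    resource: str = ""   # optional: person or device executing the activity


def traces_by_case(events):
    """Group events per case and order each case's activities by timestamp."""
    cases = defaultdict(list)
    for event in events:
        cases[event.case_id].append(event)
    return {
        case_id: [e.activity for e in sorted(evts, key=lambda e: e.timestamp)]
        for case_id, evts in cases.items()
    }


# Invented sample log; events arrive interleaved and out of order.
log = [
    Event("order-1", "place order", datetime(2013, 11, 25, 9, 0), "customer"),
    Event("order-2", "place order", datetime(2013, 11, 25, 9, 5), "customer"),
    Event("order-1", "ship goods", datetime(2013, 11, 25, 14, 0), "warehouse"),
    Event("order-1", "send invoice", datetime(2013, 11, 25, 10, 30), "clerk"),
    Event("order-2", "send invoice", datetime(2013, 11, 25, 11, 0), "clerk"),
]

traces = traces_by_case(log)
for case_id, trace in traces.items():
    print(case_id, trace)
```

Keeping the resource and timestamp on each event, rather than only the activity name, is what later allows analyses beyond control flow, such as performance or organizational views.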

Event logs can be used to conduct three types of process mining. The first type of process mining is discovery. A discovery technique takes an event log and produces a model without using any a-priori information. Process discovery is the most prominent process mining technique. For many organizations it is surprising to see that existing techniques are indeed able to discover real processes merely based on example behaviors stored in event logs. The second type of process mining is conformance. Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. The third type of process mining is enhancement. Here, the idea is to extend or improve an existing process model thereby using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. For instance, by using timestamps in the event log one can extend the model to show bottlenecks, service levels, and throughput times.
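The three types of process mining can be illustrated with a deliberately simple sketch. It assumes a toy log and uses the directly-follows relation (which activity immediately follows which) as a stand-in for a process model; this is a simplification chosen for brevity, not the technique of any tool named in this report. The function names, the sample log, and the a-priori `model` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Invented log: case id -> time-ordered list of (activity, timestamp).
LOG = {
    "c1": [("register", datetime(2013, 11, 25, 9, 0)),
           ("check", datetime(2013, 11, 25, 10, 0)),
           ("pay", datetime(2013, 11, 25, 13, 0))],
    "c2": [("register", datetime(2013, 11, 26, 9, 0)),
           ("check", datetime(2013, 11, 26, 12, 0)),
           ("pay", datetime(2013, 11, 26, 13, 0))],
    "c3": [("register", datetime(2013, 11, 27, 9, 0)),
           ("pay", datetime(2013, 11, 27, 9, 30))],
}


def discover(log):
    """Discovery: count how often activity a is directly followed by b."""
    follows = Counter()
    for events in log.values():
        for (a, _), (b, _) in zip(events, events[1:]):
            follows[(a, b)] += 1
    return follows


def fitness(log, allowed):
    """Conformance: fraction of cases whose every step is allowed by the model."""
    def fits(events):
        return all((a, b) in allowed
                   for (a, _), (b, _) in zip(events, events[1:]))
    return sum(fits(events) for events in log.values()) / len(log)


def mean_waiting_hours(log):
    """Enhancement: average elapsed hours for each directly-follows pair."""
    total, count = Counter(), Counter()
    for events in log.values():
        for (a, ta), (b, tb) in zip(events, events[1:]):
            total[(a, b)] += (tb - ta).total_seconds() / 3600
            count[(a, b)] += 1
    return {pair: total[pair] / count[pair] for pair in count}


dfg = discover(LOG)
# Assumed a-priori model: payment is only allowed after a check.
model = {("register", "check"), ("check", "pay")}
fit = fitness(LOG, model)
waits = mean_waiting_hours(LOG)
```

In this toy log, case `c3` skips the check, so discovery reports a `register` to `pay` edge that the assumed model does not allow, and conformance flags that case. The waiting times computed for enhancement are the kind of annotation that can reveal bottlenecks when projected onto a model.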

Process mining algorithms have been implemented in various academic and commercial systems. The corresponding tools are becoming increasingly relevant in industry and have proven to be an essential means of meeting business goals. ProM is the de facto standard platform for process mining in the academic world. Examples of commercial tools are Disco (Fluxicon), Perceptive Process Mining (formerly Futura Reflect and BPM|one), QPR ProcessAnalyzer, ARIS Process Performance Manager, Celonis Discovery, Interstage Process Discovery (Fujitsu), Discovery Analyst (StereoLOGIC), and XMAnalyzer (XMPro). Representatives of the ProM community and the first three commercial vendors participated in Dagstuhl Seminar 13481 "Unleashing Operational Process Mining".

The Dagstuhl Seminar was co-organized with the IEEE Task Force on Process Mining (see http://www.win.tue.nl/ieeetfpm/). The goal of this Task Force is to promote the research, development, education and understanding of process mining. Sixty organizations and over one hundred experts have joined forces in the IEEE Task Force on Process Mining.

In addition to some introductory talks (e.g., an overview of the process mining field by Wil van der Aalst), 31 talks were given by the participants. The talks covered the entire process mining spectrum, including:

• from theory to applications,
• from methodological to tool-oriented,
• from data quality to new analysis techniques,
• from big data to semi-structured data,
• from discovery to conformance,
• from health-care to security, and
• from off-line to online.

The abstracts of all talks are included in this report.

It was remarkable to see that all participants (including the academics) were very motivated to solve real-life problems and considered increasing the adoption of process mining as one of the key priorities, thereby justifying the title and spirit of the seminar, namely "Unleashing the Power of Process Mining". This does not imply that there are no foundational research challenges. For example, the increasing amounts of event data are creating many new challenges, and new questions have emerged. Such issues were discussed both during the sessions and in informal meetings during the breaks and in the evenings.

Half of the program was devoted to discussions on a set of predefined themes. These topics were extracted based on a questionnaire filled out by all participants before the seminar.

1. Process mining of multi-perspective models (Chair: Akhil Kumar)
2. Data quality and data preparation (Chair: Frank van Geffen)
3. Process discovery: Playing with the representational bias (Chair: Josep Carmona)
4. Evaluation of process mining algorithms: benchmark data sets and conformance metrics (Chair: Boudewijn van Dongen)
5. Advanced topics in process discovery: on-the-fly and distributed process discovery (Chair: Alessandro Sperduti)
6. Process mining and Big Data (Chair: Marcello Leida)
7. Process mining in Healthcare (Chair: Pnina Soffer)
8. Security and privacy issues in large process data sets (Chair: Simon Foley, replacing Günter Müller)
9. Conformance checking for security, compliance and auditing (Chair: Massimiliano De Leoni, replacing Marco Montali)
10. How to sell process mining? (Chair: Anne Rozinat)
11. What is the ideal tool for an expert user? (Chair: Benoit Depaire)
12. What is the ideal tool for a casual business user? (Chair: Teemu Lehto)

Summaries of all discussions are included in this report.

The chairs did an excellent job in guiding the discussions. After each discussion, participants had a better understanding of the challenges that process mining is facing. These definitely include many research challenges, but also challenges related to boosting the adoption of process mining in industry.

The social program was rich and vivid, including an excursion to Trier's Christmas market, a night walk to ruins, table football, table tennis, and late night discussions.

In addition to this report, a tangible output of the seminar is a special issue of IEEE Transactions on Services Computing. This special issue has the title "Processes Meet Big Data" and will be based on contributions from participants of this seminar (it is also open to others). The special issue is intended to create an international forum for presenting innovative developments in process monitoring, analysis and mining over service-oriented architectures, aimed at handling "big logs" and using them effectively for discovery, dashboarding and mining. The ultimate objective is to identify promising research avenues, report the main results, and promote the visibility and relevance of this area.

Overall, the seminar was very successful. Most participants encouraged the organizers to organize another Dagstuhl Seminar on process mining. Several suggestions were given for such a future seminar, e.g., providing event logs for competitions and complementary types of analysis before or during the seminar. These recommendations were the subject of the discussion sessions, whose summaries can be found below.

Creative Commons BY 3.0 Unported license
Rafael Accorsi, Ernesto Damiani, and Wil van der Aalst

## Classification

• Data Bases / Information Retrieval
• Modelling / Simulation

## Keywords

• Conformance checking
• Process discovery (excerpt!)

## Book Exhibition

Book exhibition on the ground floor of the library (only during the week of the event).

## Documentation

All Dagstuhl Seminars and Dagstuhl Perspectives Workshops are documented in the Dagstuhl Reports series. Together with the seminar's collector, the organizers compile a report that summarizes the authors' contributions and adds an overall summary.