How Do People Create Process Models?

Over the last seven years, I have been collaborating with colleagues on a number of experiments investigating how people create process models. In particular, we wanted to see where and how modelers differ and whether their personal, unique “modeling style” has an impact on model quality. In this – rather long – blog post, I want to summarize what we found and point to the different studies we published. (To be honest, I collected this information for a Master's student who wants to replicate some of these studies, but I might as well share it with others.) So, here we go.

First experiment: organize your process description!

In 2010, we conducted a first structured experiment on the quality of modeling outcomes. We compared how the organization of an informal requirements document impacts the quality of the created model (modelers get a text about a process and have to create a graphical model, say in BPMN). Spoiler: a breadth-first description of the process works best.

[figure: experiment results]

Models were created more accurately when the process description was given in breadth-first order.

Jakob Pinggera, Stefan Zugal, Barbara Weber, Dirk Fahland, Matthias Weidlich, Jan Mendling, Hajo A. Reijers: How the Structuring of Domain Knowledge Helps Casual Process Modelers. ER 2010: 445-451 http://dx.doi.org/10.1007/978-3-642-16373-9_33

The conceptual background for this and subsequent experiments was provided by two papers investigating how modeling languages use particular modeling concepts to structure knowledge about a process.

  • Dirk Fahland, Daniel Lübke, Jan Mendling, Hajo A. Reijers, Barbara Weber, Matthias Weidlich, Stefan Zugal: Declarative versus Imperative Process Modeling Languages: The Issue of Understandability. BMMDS/EMMSAD 2009: 353-366 http://dx.doi.org/10.1007/978-3-642-01862-6_29
  • Dirk Fahland, Jan Mendling, Hajo A. Reijers, Barbara Weber, Matthias Weidlich, Stefan Zugal: Declarative versus Imperative Process Modeling Languages: The Issue of Maintainability. Business Process Management Workshops 2009: 477-488 http://dx.doi.org/10.1007/978-3-642-12186-9_4

Visualizing how people model

In 2011, we published a paper describing a software platform for recording and analyzing modeling actions on a canvas. We also described the visualization of modeling actions in a time-series diagram, in which specific phases of the modeling process (creating elements, arranging existing elements, deleting elements, thinking about the process) can be identified and highlighted, as illustrated below.

[figure: modeling phase diagram]

In the experiments, we observed significant differences in how different modelers approach the same modeling task, manifesting themselves in remarkably distinct modeling phase diagrams.

[figure: modeling phase diagrams of different modelers]

Jakob Pinggera, Stefan Zugal, Matthias Weidlich, Dirk Fahland, Barbara Weber, Jan Mendling, Hajo A. Reijers: Tracing the Process of Process Modeling with Modeling Phase Diagrams. Business Process Management Workshops (1) 2011: 370-382 http://dx.doi.org/10.1007/978-3-642-28108-2_36
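In spirit, the phase detection can be sketched as follows — a simplified heuristic, not the platform's actual algorithm. The `(timestamp, action)` event format and the pause threshold are assumptions of mine; the phase names (comprehension, modeling, reconciliation) roughly follow the paper's vocabulary:

```python
def phase_sequence(events, pause=30):
    """Classify (timestamp_seconds, action) canvas events into coarse phases.
    A long gap between actions is read as a comprehension (thinking) phase;
    create/delete actions form modeling phases; move actions form
    reconciliation (layouting) phases. Simplified heuristic only."""
    label = {"create": "modeling", "delete": "modeling", "move": "reconciliation"}
    phases, prev_t = [], None
    for t, action in sorted(events):
        # A pause longer than the threshold counts as a thinking phase.
        if prev_t is not None and t - prev_t > pause:
            phases.append("comprehension")
        # Start a new phase only when the label changes.
        if not phases or phases[-1] != label[action]:
            phases.append(label[action])
        prev_t = t
    return phases

events = [(0, "create"), (5, "create"), (60, "move"), (65, "move"), (120, "create")]
print(phase_sequence(events))
# -> ['modeling', 'comprehension', 'reconciliation', 'comprehension', 'modeling']
```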

Identifying modeling styles

In 2012, we analyzed these differences in how modelers approach a modeling task further. We plotted the number of creation, deletion, and re-arranging actions on the canvas as a time series, binning the modeling actions into segments of 10 seconds; each segment thus has a particular “modeling profile” of creation, deletion, and re-arranging actions. We then clustered users based on these “modeling profiles”, i.e., the typical occurrences of create/delete/move actions throughout their modeling, and identified three distinct clusters. Below is the “modeling profile” of the cluster showing many creation operations early in the modeling process and few delete operations.

[figure: modeling profile of one cluster]

Jakob Pinggera, Pnina Soffer, Stefan Zugal, Barbara Weber, Matthias Weidlich, Dirk Fahland, Hajo A. Reijers, Jan Mendling: Modeling Styles in Business Process Modeling. BMMDS/EMMSAD 2012: 151-166 http://dx.doi.org/10.1007/978-3-642-31072-0_11
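The binning step behind these profiles can be sketched as follows — a minimal Python sketch assuming a `(timestamp_seconds, action)` event format; the function name and event encoding are illustrative, not the actual tool's API:

```python
from collections import Counter

def modeling_profile(events, bin_size=10):
    """Bin canvas actions into fixed-length segments and count each action
    type per segment, yielding a per-modeler 'profile' that can then be
    fed into a clustering algorithm."""
    if not events:
        return []
    horizon = max(t for t, _ in events)
    profile = [Counter() for _ in range(horizon // bin_size + 1)]
    for t, action in events:
        profile[t // bin_size][action] += 1
    # Represent each segment as a (creates, deletes, moves) triple.
    return [(c["create"], c["delete"], c["move"]) for c in profile]

events = [(2, "create"), (4, "create"), (12, "move"), (15, "create"), (23, "delete")]
print(modeling_profile(events))
# -> [(2, 0, 0), (1, 0, 1), (0, 1, 0)]
```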

We then conducted a subsequent, more detailed analysis of these clusters and also investigated the modeling phase diagrams of each cluster. First, we could establish that there are statistically significant differences between the three clusters in (1) the speed of adding modeling elements, (2) the duration of layouting phases and the number of element moves within them, and (3) the time between adding model elements, thinking about the model, and adding further model elements. Altogether, we could characterize three distinct modeling styles from these clusters:

  1. Quick modelers who, after some initial deliberation on the process, create an almost correct model right away and need only minimal adjustments of the model layout and few thinking pauses;
  2. modelers who work at a slower pace and take regular, longer layouting breaks (possibly to plan their next modeling steps); and
  3. modelers who also work at a slower pace but require less layouting than the previous group.

This analysis also gave us a first idea of which factors influence how people approach a modeling task. The two central factors are (1) the cognitive load created by the modeling task, which largely influences the efficiency with which the model is created, and (2) tool support for layouting, which largely influences the amount of time spent on organizing the model on the canvas.

Jakob Pinggera, Pnina Soffer, Dirk Fahland, Matthias Weidlich, Stefan Zugal, Barbara Weber, Hajo A. Reijers, Jan Mendling: Styles in business process modeling: an exploration and a model. Software and System Modeling 14(3): 1055-1080 (2015) http://dx.doi.org/10.1007/s10270-013-0349-1

Modeling style vs model quality

In a second line of analysis, we investigated how the way modelers create their models impacts the quality of the resulting model. By analyzing modeling operations at a more fine-grained level and also considering the modeling elements themselves, we could compare modeling processes in more detail. Below, we see visualizations of four different modelers creating the same model (visualized using the DottedChart plugin of ProM). Each line corresponds to a modeling element (node or arc); green dots show creation operations, blue dots show move operations, and red dots show delete operations.

[figure: dotted charts of four modeling processes]

By analyzing the location of modeling elements on the canvas and the time between different modeling activities, we could confirm three hypotheses:

  1. structured modeling (e.g., in clearly defined blocks) is linked to better model quality,
  2. lots of movement of modeling objects is linked to lower model quality, and
  3. low modeling speed is linked to lower model quality.

Jan Claes, Irene T. P. Vanderfeesten, Hajo A. Reijers, Jakob Pinggera, Matthias Weidlich, Stefan Zugal, Dirk Fahland, Barbara Weber, Jan Mendling, Geert Poels: Tying Process Model Quality to the Modeling Process: The Impact of Structuring, Movement, and Speed. BPM 2012: 33-48 http://dx.doi.org/10.1007/978-3-642-32885-5_3
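As a rough illustration of the second and third hypotheses, two naive indicators can be computed from such an event stream — the share of move operations and the overall modeling speed. This is only a sketch under my own assumed `(timestamp_seconds, action)` event format; the paper's actual measures are more elaborate:

```python
def process_metrics(events):
    """Compute two naive indicators from (timestamp_seconds, action) events:
    the share of move operations (hypothesis 2: more movement, lower quality)
    and the modeling speed in operations per minute (hypothesis 3: lower
    speed, lower quality). Illustrative only."""
    times = [t for t, _ in events]
    duration_min = (max(times) - min(times)) / 60 or 1  # guard against zero duration
    moves = sum(1 for _, action in events if action == "move")
    return {"move_ratio": moves / len(events),
            "ops_per_minute": len(events) / duration_min}

metrics = process_metrics([(0, "create"), (30, "move"), (60, "create"),
                           (90, "move"), (120, "create")])
print(metrics)
# -> {'move_ratio': 0.4, 'ops_per_minute': 2.5}
```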

The impact of structured modeling on model quality was analyzed further. In a further set of experiments, factors that impact the cognitive load of modelers were analyzed. In particular, the researchers looked for factors that help reduce the cognitive load of the modeler, leaving them more cognitive capacity to create correct models. Besides confirming and deepening the 2010 experiment (a structured, breadth-first organization of process knowledge improves model quality), these experiments also showed that the characteristics of the modeler impact model quality: a modeler may prefer knowledge structured in a particular way. If process knowledge is presented to them in a way that fits their preference, the individual cognitive load is lower and model quality increases. The image below shows “aspect-oriented” modeling, where a modeler first finishes one aspect of the model and then works on a second aspect that may involve many modeling elements created earlier.

[figure: aspect-oriented modeling]

Jan Claes, Irene T. P. Vanderfeesten, Frederik Gailly, Paul Grefen, Geert Poels: The Structured Process Modeling Theory (SPMT) a cognitive view on why and how modelers benefit from structuring the process of process modeling. Information Systems Frontiers 17(6): 1401-1425 (2015) http://dx.doi.org/10.1007/s10796-015-9585-y

The following, longer journal paper summarizes several techniques for visually analyzing the process of process modeling from various angles.

Jan Claes, Irene T. P. Vanderfeesten, Jakob Pinggera, Hajo A. Reijers, Barbara Weber, Geert Poels: A visual analysis of the process of process modeling. Inf. Syst. E-Business Management 13(1): 147-190 (2015)  http://dx.doi.org/10.1007/s10257-014-0245-4

For the really interested, there are two PhD theses on the topic.

noise canceling, or: what Beethoven has to do with your business

You know the problem. You are on your daily commute, in a train, or on a plane, and all you want to do to kill the time is listen to your favorite album, audio book, radio program, or latest TV episode. And while all this audio is there, coming to you via your headphones, you also hear the train rattling, the engines bustling, and people talking. So, all the experience you are looking for is dampened by inevitable noise.

Processes are the same. The only thing you want to do in your business is provide a service to your customers, build a neat product, invent the next top-notch thing, or just pay some bills. And then reality comes and puts all that noise into your business: phone calls, non-working printers, unprovided services, telephone hotlines, late clients, ill staff, … And amid all that dealing with daily life, you forget what you are good at, and you don't know where you lack support or where you could improve. The good news is that there is a neat technique around, called process mining. Process mining is like a consultant that can speak to your IT equipment and tell you what your business is actually doing. The problem is that this consultant has a very sensitive ear. It hears far more noise than the Beethoven sonata you thought your business would be, and it will tell you not only about Beethoven but also about all the noise it has heard.

Enter: the noise canceling headphones. They let you enjoy your favorite piece of audio in the average noisy environment by filtering your environment’s humming and chatter from the sound waves that reach your ear.

Last week, I’ve accomplished something similar for our consultant with overly sensitive ears. I’ve built some noise filtering algorithms that work a little like noise canceling headphones for process mining. So, instead of now telling you about a business process soaked in noise (on the left), you may actually get to know about the actual process in your business (on the right). And the amazing thing is: this works on real data.

filtering a mined process model
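To give a flavor of what such filtering does, here is a minimal frequency-based variant filter in Python — much simpler than the actual algorithms I built, and the event-log encoding (a list of traces, each a list of activity names) is just for illustration:

```python
from collections import Counter

def filter_noise(log, min_support=0.05):
    """Keep only the trace variants whose relative frequency reaches
    min_support. A log is a list of traces; a trace is a list of activity
    names. This is plain frequency filtering, far simpler than the actual
    algorithms described in the post."""
    variants = Counter(tuple(trace) for trace in log)
    total = sum(variants.values())
    keep = {v for v, n in variants.items() if n / total >= min_support}
    return [trace for trace in log if tuple(trace) in keep]

log = [["a", "b", "c"]] * 9 + [["a", "x", "c"]]  # one noisy variant out of ten
clean = filter_noise(log, min_support=0.2)
print(len(clean))
# -> 9  (the noisy trace is dropped)
```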

So, enjoy your Beethoven.

activity report: making the brand

I’m slightly late in fulfilling my promise of posting regularly. At least I have a good excuse. So here’s the line-up of what has been going on over the last three weeks.

  • I’ve more or less found my PhD topic. I’m working out some ideas and will present them to my prospective PhD supervisors next week. Then they can tell me that 90% of my ideas have already been tried, that I’m left with a bunch of non-problems and unsolvable ones, and that the rest is far too much for a single PhD thesis. We’ll see. But I’m quite sure that I’ll do something on Declarative Modelling and Verification of Workflows (for disaster management).
  • I’ve attended a soft-skills workshop on project management, leadership, and networking with my colleagues from the Graduiertenkolleg. It was quite useful, as we worked out a number of projects for the next phase of the research group, and it brought the team members closer together. Many thanks to Golin Wissenschaftsmanagement for that one…
  • I’ve attended our Kolleg’s first workshop on “Meta-Modelling” which, as far as I can tell, is an approach to getting control over the development of modeling and programming languages and their changes. The methodology, which is also being developed in Metrik, might prove useful when I start relating constraints and operational concepts.
  • I’ve helped prepare our Kolleg’s second workshop on “Workflows”, which brings together people from Metrik and the B.E.S.T program. We will work with Prof. Wil v.d. Aalst and Prof. Kees van Hee on (work)flow techniques for services and service-oriented architecture. I hope to get some more thoughts on how the term “service” relates to wireless (sensor) networks and their functionality.
  • I’ve continued on our leporello leaflet and it looks great – we’re almost done with it. In that process, we’ve developed some sort of “corporate identity” for Metrik. Together with a decent web strategy, we’re making our way up in the Google rankings.

That’s it for now. I need to prepare my PhD topic presentation…

activity report

This is a new series of blog posts whose purpose is to keep me posting once a week and to document my progress more tightly. Looking back on the past few weeks was rather disappointing in that sense: I am doing too little research and too much university management. I hope that changes once I have to re-read the little progress I’ve made. Here we go.

  • I’ve isolated the topic of understanding the ideas of ‘service’, ‘service-oriented architecture’ and, related to that, ‘service level agreement’. The main reason is that everybody speaks of SOA and its related terms so much that one is inclined to think these are well-understood topics. Unfortunately, if you think about applying SOA to, and creating services for, wireless sensor networks, you end up with nothing to start with. ‘Service’, ‘SOA’ and all the other terms have been defined in the field of business process management and workflow management, with lots of hardware and software technology attached. But people keep stressing that SOA is an architectural paradigm. I haven’t found a non-technological definition yet. That’s what I’d like to understand: What is SOA? What is a service in SOA?
  • I’ve deepened my understanding of flexible workflows and adaptivity concepts for workflows. I still appreciate van der Aalst’s classification of flexibility and adaptivity of workflows. And I’ve started to understand how one could realize flexible workflows – thanks to Sadiq, Sadiq and Orlowska.
  • I’ve continued to supervise four students in a tutorial project related to a lecture on information integration. It’s strange how the perspective on the matter changes once you’ve earned a degree. I can hardly feel like a supervisor, because ‘my’ students are my age, have studied even longer, and have industrial experience. Still, they are rather reluctant to solve the tiny tasks I issue.
  • I’ve continued planning a small workshop for my group next April and another small workshop with visiting researchers this December.
  • I’ve spent a day at the GeoForschungsZentrum Potsdam with my PhD graduate school to learn about the tsunami early warning system in the Indian Ocean and further research topics on natural disasters and disaster management projects.
  • I’ve been working with some of my fellows on a leporello leaflet about our graduate school to improve our outward communication. We’re doing pretty good work. You only realize that when you write it down compactly, avoiding the unnecessary talk and condensing it down to the facts in a lean and clean argumentation.
  • And I’ve learned about simulating the distributed detection of earthquakes in the SAFER project, which aims at building an early warning system with the help of sensor networks. They are our closest partner project, and we are likely to get a decent amount of input regarding the technical requirements for implementing a reliable system to react to earthquakes or other unpredicted hazards.

Altogether, that’s been quite a lot. Yet I can’t feel any progress. I hope that’s going to change with this column.

go with the flow

For about five weeks now, I have done almost nothing on my thesis. There were just too many other “important” things to be done, like creating a poster for a workshop, reviewing papers and other people’s theses, preparing a lab tutorial for a lecture, and going to seminars…

At least the seminar gave me the opportunity to talk about my thesis topic at least twenty times, each time to a different person. With the questions I got back, I realized that the topic is going to be quite ambitious. “Workflows in a disaster management system.” I still have absolutely no idea what that could be. Talking to a good friend of mine last weekend, the idea formed that I really should go out to the people who actually do disaster management. They will certainly know their workflows (maybe they don’t use that term, but they’ll know them). The question for them is: will they like a system that supports them by telling them who is going to do what and when? My question is: are their workflows interesting enough?

Besides that, I just re-checked my project outline. It actually says that I should also focus on workflows of a resource-management layer in a peer-to-peer middleware architecture. Is there a difference between these two tasks, or are they just the same? I’d prefer it if there were no difference. Workflow is workflow, right?

the gap in between

Roughly five weeks have passed since I officially started the work on my PhD. In between, I attended a summer school on the convergence of some quite-hyped technologies like wireless sensor networks, RFID and peer-to-peer technologies, organized by people from the TU Darmstadt (DVS and KOM). I’m working on a paper about this with a bunch of really nice people. It’s gonna be interesting. I also spent some days in Eindhoven with my group from HU Berlin, talking about research in workflow modelling and analysis and starting a cooperation on common research interests.

In the last three weeks I realized that there is some gap between the idea of having a distributedly designed infrastructure for disaster management and designing applications for these things, which is meant to be the topic of my PhD thesis. The meta-problem seems to be that these ultra-cool wireless sensor networks won’t do much more than sensing and smart routing of data. Not much of a workflow there. But if workflows are meant to be used in a setting with tens or hundreds of thousands of nodes and interconnected devices to assist in disaster management and recovery, where are they going to appear? And assuming we have an answer to that: who or what is going to execute or enact them? The answer to the latter is quite likely: some central node with lots of computing power. But we already know (more or less) how that works, and there won’t be a new research challenge for me in it.

Without forgetting about the second question (still hoping for a more challenging answer), I am currently turning to the first question. I might answer it with these new questions:

  • What is a workflow in a disaster management system?
  • What does it look like?
  • What makes it different from other workflows?
  • Do we need new formal methods to describe them?
  • How does self-organization and adaptivity affect these workflows?

One thing that is quite likely to be different, and which is a little out of focus in current workflow research, is data and resources, which are crucial in disaster management systems at least. That could be a starting point.