I’ve been reviewing papers for the International Conference on Business Process Management (BPM) for the last 3 or 4 years, in 2011 and 2012 as a member of the Program Committee. In that time, I recommended rejection for the vast majority of the submitted research papers I evaluated. This year the best score I gave was a borderline on one paper, and a reject on all other papers (8 in total). Other years were a bit better, but not much. In the following I’d like to share my view on why the papers were rejected. Paper authors may find this useful for learning how they can improve their chances of getting a paper accepted. Perhaps more importantly, it also sheds light on publishing standards within BPM research and on what the BPM community as a whole can do to promote these standards.
basics are not the problem
Yes, BPM is a very competitive conference with a low acceptance rate (around 25 papers out of 200-300 submitted manuscripts get accepted). But what I find surprising is the significant difference between papers that get accepted and papers that get rejected. What is more, papers were mostly rejected for the same kinds of reasons. The majority of the papers I got to review, including the rejected ones, succeeded on all the basics of a research paper: they
- addressed a problem that is relevant to BPM,
- had a clear problem statement,
- proposed an interesting and novel idea to solve the problem,
- were written in good English, and
- had a decent structure.
So what was wrong with these papers?
the usual suspects: more details, more literature
Some papers did not explain an idea well enough. The questions I always ask myself are “Do I believe this works?” or, even better, “Would I now be able to build the solution myself (after some more reflection on technical details and following up on cited work)?”. Often, a crucial (technical) notion was not clear, for example: how exactly is this process graph constructed?
Many papers had flaws in their literature study and comparison. The standard issue: someone else has published a paper that proposes a solution very similar to (a part of) the contribution of the new paper, and the new paper does not discuss how it differs from the old one. Thorough literature research is hard work, and chances are that one did not find a paper known to a reviewer. However, a paper should at least discuss relevant results published in previous years of the very same conference. I usually ask myself the question “Is there a difference to previous work that matters, such as a more general class of problems solved, a faster solution, a better solution, a more elegant solution, …?”
These two reasons probably apply to paper writing and paper rejections in general. The next two reasons are different, as they are particular to what the BPM community specifically expects from a research paper.
show stopper at the start: existing problem canon
Quite a few papers failed to address an important aspect of the problem that had been raised and discussed in earlier work. Many papers submitted to BPM address problems that are part of a larger, more general problem such as compliance, modeling and verification of data-dependent processes, adapting processes, process mining etc. These more general problems have been tackled from various angles and have become better and better understood, which also creates a canon for evaluating solutions:
- What is an acceptable solution?
- How to measure the quality of a solution?
- What are relevant factors?
- For which use cases should a solution apply?
- What kind of assumptions can one make on the problems to solve?
I frequently found papers that focused on just a single aspect of the problem while ignoring other important aspects that, by the current state of knowledge, should not be ignored. For example,
- A paper on process mining cannot avoid the discussion on which quality measure of a process model is optimized by the algorithm and which one is neglected.
- A paper on verifying processes for correctness cannot avoid listing BPM-specific properties that shall be checked such as the notion of soundness.
- A paper on designing or extending process modeling languages cannot ignore existing modeling languages and their particular traits, such as BPMN being the industry standard “for everything”, BPEL being the standard for executable service models, and Petri nets being the major formal model for most tasks.
These are just examples. At their core they are variants of the following question: “So you have this nice technique, how exactly does it help me to solve my BPM problem? Oh and by the way, here is a book of standard requirements you should meet anyway.”
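To make the process mining example above a bit more tangible, here is a minimal sketch of why a single quality measure is not enough. It assumes a deliberately simplified setting that is not taken from any particular paper: a model is represented by the finite set of traces it allows, and `fitness` and `precision` are toy, trace-based stand-ins for the quality dimensions of the same names (real measures are defined quite differently). All names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Trace = Tuple[str, ...]


def fitness(log: Iterable[Trace], model_language: Set[Trace]) -> float:
    """Toy fitness: fraction of log traces that the model allows."""
    traces = list(log)
    return sum(t in model_language for t in traces) / len(traces)


def precision(log: Iterable[Trace], model_language: Set[Trace]) -> float:
    """Toy precision: fraction of the model's traces that occur in the log."""
    observed = set(log)
    return len(model_language & observed) / len(model_language)


# Hypothetical event log with three recorded traces.
log = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "c")]

# Model that allows exactly the observed behavior.
exact_model = {("a", "b", "c"), ("a", "c", "b")}

# Overly permissive model: allows every ordering of a, b and c.
permissive_model = {
    ("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c"),
    ("b", "c", "a"), ("c", "a", "b"), ("c", "b", "a"),
}

for name, model in [("exact", exact_model), ("permissive", permissive_model)]:
    print(name, fitness(log, model), round(precision(log, model), 2))
```

Both models reach perfect fitness in this sketch, but the permissive model's precision drops to one third. A paper that reports only the first number hides exactly the trade-off a reviewer will ask about.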
stale end: unconvincing evaluation
Finally, the majority of all reviewed papers failed to provide a convincing evaluation.
There are papers which are entirely conceptual and novel in the sense that the problem was not discussed before, or that a known problem is solved for the very first time. In these cases (and probably in a few more) the idea alone is a contribution that is worth discussing, even without a large-scale practical evaluation. A decent running example then usually suffices to illustrate the potential of the idea.
However, these papers are rare. Most papers have an incremental element: they
- solve an existing problem better than previous solutions,
- generalize an existing solution of a known problem,
- improve an existing solution in a way that can be measured, or
- combine existing techniques to solve a novel or unsolved problem.
The consequence of this incremental nature is that reviewers and readers expect some proof that “things work”. Many of the papers I have seen actually did have an experimental or practical evaluation: ideas were implemented in a tool and then applied to artificial or real-life data.
What the papers actually lacked was a presentation of convincing results. It is usually not sufficient to just show a large table with numbers where some column has values in a range considered “good” (fast analysis, small model, high similarity, high confidence, etc.).
BPM is a discipline about making very complex software understandable to humans – most likely people with a less technical background. If a technique is about extracting/checking/changing information that is in any way interesting for a human being to look at, then readers and reviewers like to see this information in an understandable form. Here are some examples I can think of:
- If a technique produces or transforms a model (a process model, a data model, …), then the evaluation should show some model diagrams, not just a table with model statistics.
- If a technique checks for errors in some kind of input, it would be worth illustrating identified errors and/or diagnostic information on this input.
- If a technique queries models from a repository, the evaluation should show a few models that were returned as the result of a given query.
- If a technique is about relating two or more things (say, comparing different versions of a process model), then the evaluation should show this input and highlight similarities and differences.
These are just a few examples taken from the kind of papers I’ve been reviewing. Probably every technique that solves a BPM problem has an artifact worth showing. Such an evaluation showing relevant problem instances and results does not replace a fully fledged case study, but it can be convincing enough for reviewers and readers that the technique works.
In case a technique actually is all about number crunching (or has a significant part of number crunching), the table should not miss measures that are considered relevant in that problem domain: see ‘existing problem canon’ above.
Finally, I have seen a few papers that did not compare the results of their technique to results obtained by existing techniques (on the same input). Making such an evaluation is a tedious task. It requires mastering other techniques and tools that all have their small hidden assumptions about input format and runtime environment. Yet, experimentally comparing a new technique to the state of the art is imperative if that state of the art solves the same problem, simply because the quality of techniques is best evaluated by evaluating the quality of their outputs.
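As a rough illustration of what “on the same input” means in practice, the following sketch runs two techniques on identical inputs and records the same output measure for both. Everything in it is an assumption made for illustration: `baseline` and `new_technique` are toy stand-ins rather than real tools, a model is simply the set of traces it allows, and `model_size` is a placeholder for whatever measure the problem domain considers relevant.

```python
from itertools import permutations
from typing import Callable, Dict, List, Set, Tuple

Trace = Tuple[str, ...]
Log = List[Trace]
Model = Set[Trace]  # hypothetical: a model is the set of traces it allows


def baseline(log: Log) -> Model:
    """Stand-in for an existing technique: allow any order of seen activities."""
    activities = sorted({a for trace in log for a in trace})
    return set(permutations(activities))


def new_technique(log: Log) -> Model:
    """Stand-in for the proposed technique: allow only the observed traces."""
    return set(log)


def model_size(log: Log, model: Model) -> float:
    """Toy output measure: number of traces the model allows."""
    return float(len(model))


def compare(
    techniques: Dict[str, Callable[[Log], Model]],
    inputs: Dict[str, Log],
    measure: Callable[[Log, Model], float],
) -> Dict[str, Dict[str, float]]:
    """Run every technique on every input and record the same measure for each."""
    return {
        input_name: {
            tech_name: measure(log, technique(log))
            for tech_name, technique in techniques.items()
        }
        for input_name, log in inputs.items()
    }


# Hypothetical inputs shared by both techniques.
inputs = {
    "log1": [("a", "b", "c"), ("a", "c", "b")],
    "log2": [("x", "y"), ("x", "y")],
}
techniques = {"baseline": baseline, "new": new_technique}

results = compare(techniques, inputs, model_size)
for input_name, row in results.items():
    print(input_name, row)
```

The point of the structure is that each row puts the existing and the new technique side by side for one input, so a reader can check the claimed improvement case by case instead of trusting a single aggregate number.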
homework for the BPM community
You may find the findings I listed here trivial and not worth reporting, because they state the obvious. But my reviews show otherwise. The number of submissions to our conferences shows that many researchers would like to actively contribute to BPM with their ideas, while the reviews show that they are not aware of the standards by which we, the BPM community, review our peers.
It seems they and we could benefit from more transparency in the problem/solution canon we maintain and the requirements we raise for evaluation. It could also help attract researchers from other related fields such as software engineering.
I hope this post helps other BPM researchers better understand which quality standards I apply when reading and reviewing papers – and try to adhere to when writing. My observations are necessarily stated from a personal point of view and could be biased. If you would like to add your own observations or have a different opinion, I’d be interested to hear it.
disclaimer: I’ve not only reviewed papers for BPM, but also submitted papers to BPM. I try to adhere to the same quality standards in my own papers as the standards I am applying in reviews. However, I cannot guarantee I would accept my own papers under these standards.