Mining Branching-Time Scenarios From Execution Traces

Over the last two years, I have been working with Shahar Maoz and David Lo on discovering high-level specifications of an application from its execution traces. This topic is also known as specification mining and has been brought up around 2002 because we we keep on writing humongous amounts of undocumented code that other people have to use, maintain, or deal with in a later iteration of the software. Getting into this “legacy” code is extremely time consuming and there is a high chance that you will break something because you do not understand how the existing code works.

Specification mining aimes at extracting (mostly) visual representations of existing code that describes its essential architecture and workings at a higher level of abstraction, so that a developer can first get an overview before diving into the code.

We looked particularly at the problem of understanding object interplay in object-oriented code, where you often have many objects that invoke methods of other objects, essentially distributing a behavior over many classes. Everyone who ever tried to understand code that uses a number of architectural patterns combined, such as factories and commands, knows what I mean.

Building on earlier works, we found a way to discover scenarios (something like UML sequence diagrams) that show how multiple objects interact with each other in particular situation. More importantly, we found a way to discover and distinguish two kinds of scenarios:

  • scenarios that describe behavioral invariants of your application (whenever the user presses this button, the application will open that window), and
  • scenarios that describe behavioral alternatives (when the user is on this screen, she may continue with this, that, or even that command)

These two scenarios combined give a good understanding of how an application works at a high level of abstraction. Here is the presentation showing you how it works:

You can also get the full paper, our tool for discovering scenarios, and the data set that we used in our evaluation.

Drop me an email or a comment if you find this useful and/or would like to know more.