Tutorial: Automating Process Mining with ProM’s Command Line Interface

In this blogpost I explain how to invoke the process mining tool ProM from the commandline without using its graphical user interface. This allows you to run process mining analyses on several logs in batch mode without user interaction. Before you get too excited: there are quite some limitations to this, which I will address in the end. The following instructions have been tested for the ProM 6.4.1 release.

Invoking the ProM Commandline Interface

The ProM commandline interface (CLI) can be invoked through the class

 org.processmining.contexts.cli.CLI

To properly invoke the CLI for ProM 6.4.1, use the following command (which is a copy of the command in ProM641.bat with changed main class).

java -da -Xmx1G -XX:MaxPermSize=256m -classpath ProM641.jar -Djava.util.Arrays.useLegacyMergeSort=true org.processmining.contexts.cli.CLI

The CLI itself has no interactive user interface. Instead, it executes scripts passed to it as commandline parameter. To simplify your life, I suggest to put the command into a batch file ProM_CLI.bat or shell script ProM_CLI.sh that passes on 2 commandline parameters. For instance

java -da -Xmx1G -XX:MaxPermSize=256m -classpath ProM641.jar -Djava.util.Arrays.useLegacyMergeSort=true org.processmining.contexts.cli.CLI %1 %2

A typical example script that the ProM CLI takes is the following script_alpha_miner.txt

System.out.println("Loading log");
log = open_xes_log_file("myLog.xes");

System.out.println("Mining model");
net_and_marking = alpha_miner(log);
net = net_and_marking[0];
marking = net_and_marking[1];

System.out.println("Saving net");
File net_file = new File("mined_net.pnml");
pnml_export_petri_net_(net, net_file);

System.out.println("done.");

You can invoke it with the command

ProM_CLI.bat -f script_alpha_miner.txt

It will read the log file myLog.xes (stored in the current working directory), invoke the alpha miner, and write the resulting Petri net as a PNML file mined_net.pnml to the current working directory. (No, there is currently no way to pass file names as additional commandline parameters to the script).

Note: when running the above script ProM will first produce a (large) number of messages on the screen during the startup phase related to scanning for available packages and plugins, bear with it until it is ready.

Scripts for ProM

The language used for the scripts is basically Java interpreted at runtime. In principle, you can put any Java code which you would put into a method body (no class/method declarations). In case the Java reflection framework is able to infer the type, variables do not have to be declared, but can just be used like in a dynamically typed language. For example variable logĀ in the script_alpha_miner.txt will be inferred to have type XLog.

In a script, you can directly invoke ProM plugins through special method names provided by the CLI; the method names are derived from the plugin names shown in ProM. For example the plugin “Alpha Miner” is available as method alpha_miner. You can get the full list of all ProM plugins available for script invocation with the command liner parameter ‘-l’ (“dash lower-case L”):

ProM_CLI.bat -l

This will scan all installed packages for plugins that do not require the GUI to run and list them in the form name(input types) -> (output types). For example, if you have installed the AlphaMiner package the following plugins will be listed (among many others).

alpha_miner(XLogInfo, LogRelations) -> (Petrinet, Marking)
alpha_miner(XLog) -> (Petrinet, Marking)
alpha_miner(XLog, XLogInfo) -> (Petrinet, Marking)

Use the ProM Package Manager to install plugins you do not find in the list of installed plugins.

The script_alpha_miner.txt uses the second method signature alpha_miner(XLog) -> (Petrinet, Marking) to discover from an XLog a Petrinet and a Marking. In case a plugin returns multiple objects, the return result is an Object[] array, in which you can access the individual components as usual, i.e., net_and_marking[0] contains the PetriNet and net_and_marking[1] contains the Marking.

Besides the typical plugins you already know from the ProM GUI, there are also plugins for loading files and saving files. Just browse the list of available plugins to find the right type.

I suggest to store the list of available plugins in a separate plugin_list.txt file for easier searching using the following command

ProM_CLI.bat -l > plugin_list.txt

Now, you basically know everything to invoke ProM from the commandline.

  1. Create the ProM_CLI.bat or ProM_CLI.sh
  2. Run the ProM PackageManager to install your desired plugins. If you run the PackageManager for the first time, it will suggest a set of standard packages to install which cover most process mining use cases.
  3. Get the list of available plugins.
  4. Write a script.
  5. Invoke the script.

Known Caveats

The ProM CLI is not the primary user interface of ProM and as such does not get the same attention to usability as the GUI. Thus, it is better to consider the CLI an experimental feature where not everything works as you know it from the GUI and that may sink a bit of time and effort to get running. Several factors you should consider:

  • You only can use plugins from the CLI which have been programmed to work without the GUI. Whether your favourite plugin is available depends on two aspects:
    1. Does the plugin require configuration of parameters for which no good default settings are available, so user feedback is requires (for example particular log filtering options)?
    2. Did the developer of that plugin have the time to implement a non-GUI version? We encourage users to first develop the non-GUI version of the plugin and introduce GUI-reliant components only later. However as ProM is an open platform with many contributing parties, individual developers may choose otherwise. If a particular plugin is not available on the CLI, please get in touch with the developer whether this can be changed.
  • The CLI cannot invoke any code that requires the ProM GUI environment. Any plugin that attempts to do that on the side will terminate the CLI with an exception. That being said, you actually can invoke visualizer plugins that produce a JComponent and then create a new JFrame to visualize the JComponent, see below for an example. However, the functionality of these will be limited (e.g. export of pictures, interaction with the model etc. most likely won’t work)
  • It may be that ProM CLI does not terminate/close after the script completes. Workaround: include a System.out.println(“done.”); statement at the end of your script to indicate termination. When you see the “done.” line printed on the screen but ProM is still running, you can terminate it manually (CTRL+C) without loosing data.
  • Log files may not be (g)zipped, i.e., the CLI can only load plain XES or MXML files.
  • PNML files produced by a mining algorithm in the CLI have no layout information yet. If you want to visualize such a PNML file, you have to open it in a tool that can automatically compute the layout of the model. Opening the file in the ProM GUI will do. Invoking a the plugin to visualize a Petri net will also invoke computation of the layout.
  • The plugins to load a file or save a file are named rather inconsistently across the different packages. You may have to look for various keywords like “load”, “open”, “import”, “save”, “export” to find the right load/save plugin.
  • Plugins to load a file always come with a signature that take a String parameter as the path to the file to load. Plugins to save a file always require a File parameter. Thus, you first have to create a file handle myFile = new File(pathToSave); and then pass this handle to the “save file plugin”.
  • Files are read/written relative to the current working directory.
  • Even if the plugins you want to use are available for the CLI, executing them may throw exceptions because the plugin (although accessible from the CLI) assumesĀ  settings that can only be set correctly in a GUI dialog. The only workaround here is to extend your script with Java code that produces all the settings expected by the plugin. See below for more advanced examples.
  • Creating these scripts is certainly on the less convenient side of development. You have no development environment with syntax check, code completion etc. In case your script has an error, you will only notice at runtime when you get a long evaluation error thrown which tries to highlight the problematic part of the script, but is typically hard to spot. You’ve been warned.

Advanced Examples

With all these restrictions in mind. Here are some more advanced scripts to get more advanced ProM plugins running. The following script invokes the HeuristicsMiner with its default settings. It needs some additional code to properly pass the event classifiers to the heuristics miner. The HeuristicsMiner typically produces nice results on real-life data because it does not use a standard process modeling notation as target language. As a consequence, there is no serialization format. However, you can invoke the visualization plugin and pass it to a new JFrame to visualize the result. File script_heuristics_miner.txt

System.out.println("Loading log");
log = open_xes_log_file("myLog.xes");

System.out.println("Getting log info");
org.deckfour.xes.info.XLogInfo logInfo = org.deckfour.xes.info.XLogInfoFactory.createLogInfo(log);

System.out.println("Setting classifier");
org.deckfour.xes.classification.XEventClassifier classifier = logInfo.getEventClassifiers().iterator().next();

System.out.println("Creating heuristics miner settings");
org.processmining.plugins.heuristicsnet.miner.heuristics.miner.settings.HeuristicsMinerSettings hms = new org.processmining.plugins.heuristicsnet.miner.heuristics.miner.settings.HeuristicsMinerSettings();
hms.setClassifier(classifier);

System.out.println("Calling miner");
net = mine_for_a_heuristics_net_using_heuristics_miner(log, hms);

System.out.println("Visualize");
javax.swing.JComponent comp = visualize_heuristicsnet_with_annotations(net);
javax.swing.JFrame frame = new javax.swing.JFrame();
frame.add(comp);
frame.setSize(400,400);
frame.setVisible(true);

System.out.println("done.");
heuristics_miner_gui

Heuristics Miner run from the Command Line, visualizing the output in a new JFrame.

 

If you want to change the parameters of the HeuristicsMiner, you can do this via the HeuristicsMinerSettings object. However, here I have to refer you to the source code of the HeuristicsMiner package to study the details of this class. See https://svn.win.tue.nl/trac/prom/ as a starting point.

If you prefer to create a process model in a serializable format out of a HeuristicsNet, simply change your script to invoke another plugin to translate the heuristics net into a Petri net. Below is the script that will also save the resulting Petri net as PNML file to disk. File script_heuristics_miner_pn.txt:

System.out.println("Loading log");
log = open_xes_log_file("myLog.xes");

System.out.println("Getting log info");
org.deckfour.xes.info.XLogInfo logInfo = org.deckfour.xes.info.XLogInfoFactory.createLogInfo(log);

System.out.println("Setting classifier");
org.deckfour.xes.classification.XEventClassifier classifier = logInfo.getEventClassifiers().iterator().next();

System.out.println("Creating heuristics miner settings");
org.processmining.plugins.heuristicsnet.miner.heuristics.miner.settings.HeuristicsMinerSettings hms = new org.processmining.plugins.heuristicsnet.miner.heuristics.miner.settings.HeuristicsMinerSettings();
hms.setClassifier(classifier);

System.out.println("Calling miner");
net = mine_for_a_heuristics_net_using_heuristics_miner(log, hms);

System.out.println("Translating to PN");
pn_and_marking = convert_heuristics_net_into_petri_net(net);

System.out.println("Saving net");
File net_file = new File("mined_net.pnml");
pnml_export_petri_net_(pn_and_marking[0], net_file);

System.out.println("done.");

The last example I will show in this blog post is a script to invoke the very reliable InductiveMiner with default parameters which includes some basic noise handling capabilities. The resulting Petri net is written to disk. File script_inductive_miner_pn.txt

System.out.println("Loading log");
log = open_xes_log_file("myLog.xes");

System.out.println("Creating settings");
org.processmining.plugins.InductiveMiner.mining.MiningParametersIMi parameters = new org.processmining.plugins.InductiveMiner.mining.MiningParametersIMi();

System.out.println("Calling miner");
pn_and_marking = mine_petri_net_with_inductive_miner_with_parameters(log, parameters);

System.out.println("Saving net");
File net_file = new File("mined_net.pnml");
pnml_export_petri_net_(pn_and_marking[0], net_file);

System.out.println("done.");

Take Away

It is possible to run process mining analyses in a more automated form using scripts and the ProM CLI interface. Many, but by far not all, plugins are available to run in a non-GUI context. The scripts are non-trivial and may require knowledge of the plugin code to prepare correct plugin settings etc. Luckily, all plugins are open source and ready to be checked. Start here: https://svn.win.tue.nl/trac/prom/

If you were wondering, yes, that’s how we run automated tests for ProM.

For those, who prefer a less experimental environment for automated process mining analysis, I highly recommend ProM integration with RapidMiner available at: http://www.rapidprom.org/

Feel free to post further scripts. In case you have problems with running a particular plugin, I suggest to contact the plugin author to make the plugin ready for the CLI environment.

tutorial : eclipse rcp e4 with 3.x views like project explorer, properties, etc.

This tutorial shows step-by-step how to add classical Eclipse 3.x views like Project Explorer and the Properties View to an Eclipse e4 Rich Client Platform (RCP) Application.

Eclipse 4.2 was released a few weeks ago with the new e4 platform that basically uses an EMF model to describe how your application looks and feels. While this is neat, the caveat is that e4 currently (August 2012) does not provide views for a number of core features of the old Eclipse 3.x platform including things like the Project Explorer, a Properties View and the like.

Eclipse 4.2 comes with a “Compatibility Layer” that makes all the old features that were not ported yet to e4 available. However, I could not find a site that tells me how to use it. There are a few quite instructive tutorials out there on how RCP applications with e4 work, for instance http://www.vogella.com/articles/EclipseRCP/article.html. But none showed me how to add a Project Explorer and all the IDE views that I needed for my projects (http://service-technology.org/seda and http://service-technology.org/greta in case you want to know). So I went for the usual approach in the Eclipse-verse: trial and error. Here is what I figured out and works for me.

Continue reading

gmf knowledge: positioning of external labels

This post is the first in a series of posts where I write down what I have found out about the Eclipse Graphical Modeling Framework (GMF) by hard work, deep digging in the web and trial-and-error. These posts are meant to complement the existing documentation of GMF. I hope that this knowledge helps others in solving their problems faster than me.

This post (or say tutorial if you like) treats the definition and positioning of external labels of nodes and arcs (or connections) in a GMF diagram. For the rest of this post we assume that you are familiar with the basic GMF workflow of deriving an editor from an .ecore file. If not, you should start with the official GMF Tutorial first and come back when you are done.

1. First you need a node class and an arc class in your Ecore meta model, each having a property which you would like to display as a label of this node.

A node and an arc class with two properties.

We define two classes Node and Arc. We are not concerned with the references between these classes here. They could be completely unrelated for the purpose of defining a label. We will use attribute name of class Node and attribute weight of class Arc as properties we want to display as labels. A sample attribute specification could be the following.

Properties specification of attribute \

2. Next you need an appropriate graphics definition. Open your .gmfgraph file. I assume you defined a Figure Descriptor for your node or Arc. For an external node label, create a new Figure Descriptor, e.g. NodeNameFigure that only contains a Label, e.g. NodeNameLabel, and a Child Access to this label. We have to create this separate Figure Descriptor to make sure that our label is not internal to the graphics of our node. To label an arc (a connection), just add a Label and a Child Access to the corresponding Figure Descriptor of the Arc, e.g. ArcFigure.

Figure Descriptor definitions for the labels

Now we need to declare the labels themselves. Create a Diagram Label for the node, e.g. nodeName and set its Figure to the Figure Descriptor for the label that you’ve just created, e.g. NodeNameFigure. Done. This automatically makes this label an external label.

Diagram Label definition for an external label

For the arc’s label also create a Diagram Label, e.g. arcWeight in our case; set its Figure to the Figure Descriptor of the arc’s figure, e.g. ArcFigure and set the Accessor to the corresponding Child Accessor, e.g. getFigureWeight. To control the distance of the label of the arc’s label to the arc, add a Label Offset Facet. Set its attributes X and Y to the distance you would like the label have from its arc. Done.

Diagram Label definition for the label of a connection

3. Now comes the mapping of graphics to the meta model. Open your .gmfmap file of your project. The relation between the graphics and meta model is defined by Feature Label Mappings. Again I assume that you’ve already defined Node Mappings and Link Mappings for your classes, we now add their labels.

For the node, find the corresponding Node Mapping. In our example this a node mapping for a Transition class which specializes the earlier Node class. Hence the Transition class also has the name attribute. At the Node Mapping, add a Feature Label Mapping and set the following properties: Diagram Label is the one you’ve just defined for your node class, e.g. nodeName. In Features choose the class attributes you want to display in the label, one attibute is suffcient; in our case this is name. Done.

Feature Label Mapping definition of an external label

For the arc, find the corresping Link Mapping, again in our example we specialized class Arc to class ArcToPlace; hence it also has the weight attribute. To add the label, add a Feature Label Mapping and set the corresponding properties, this time use the label defined for the arc.

Feature Label Mapping definition for a label of an arc.

4. Create the .gmfgen file. You can now refine the positioning of the arc label. Locate it in the tree. It will be in Gen Diagram <YourDiagram> / Gen Link <YourLink>EditPart / Gen Link Label <YourLink><Attribute>EditPart. Its sub-features control the distance of the label to the link (that you’ve just set above). The property Diagram Label / Alignment of the Gen Link Label allows you to specify where the label shall be positioned along the link (beginning, middle, end).

Gen Link Label definition of an arc label

Unfortunately, such a refined definition is not available for external labels of nodes. You have to do this in the source code (see below).

5. Generate the source code. Your arc label is now fine, but you can fine touch your node label, specifically the distiance of the external node label to its node. Open the Edit Part source file of your node in <your-package>.diagram.edit.part, e.g. TransitionEditPart.java . Now locate the method addBorderItem() in this class. It is responsible for registering the external label within a certain distance at your node’s graphics. The statement
locator.setBorderItemOffset(new Dimension(-20, -20));
borderItemContainer.add(borderItemEditPart.getFigure(), locator);

specifies the distance of your label. Replace (-20,-20) by the values you like. I prefer (-5,-5) for a close distance.

That’s it. Run your editor and enjoy your labels.