how to always read facebook’s news feed in “most recent first” order

Facebook has been tinkering with its news feed design over the last weeks and months, both in its mobile apps and on the website. As of version 10.0.0 of the app, the “most recent” order is no longer the default view but hidden several taps away in a sub-sub-menu. On the default Facebook page, the feed switches back to “top news” every one or two weeks, even if you choose “most recent”. The option to choose the sort order of the news feed has also disappeared from the mobile website (the one you reach when opening facebook.com in a mobile browser).

You can still change the ordering of the news feed on the desktop page and, once changed, the mobile website will inherit the setting. But I don’t want to have to change the sort order from the desktop page every two weeks.

Now, I just noticed that facebook stores the sort order in the URL. You can use the following direct links to reach the news feed in the desired ordering:

  1. https://www.facebook.com/?sk=h_chr to read the news feed in chronological order “most recent first”, and
  2. https://www.facebook.com/?sk=h_nor to read the news feed in random order (“top news”).

I’ve placed this URL as a bookmark on the home screen of my smartphone and removed facebook’s mobile app. The page actually loads much faster than the mobile app, and I now read the feed in my preferred order. Let’s see how long these links keep working.

Is my log big enough for process mining? Some thoughts on generalization

I recently received an email from a colleague who is active in specification mining (mining software specifications from execution traces and code artifacts) with the following question.

Do you know of process mining works that deal with the confidence one may have in the mined specifications given the set of traces, i.e., how do we know we have seen “enough” traces? Can we quantify our confidence that the model we built from the traces we have seen is a good one?

The property the colleague asked about is called generalization in process mining. As my reply to this question summarized some insights that I gained recently, I thought it was time to share this information further.

A system S can produce a language L. In reality, we only see a subset of these behaviors, which is recorded in a log K. Ideally, we want a model M that was discovered from K to reproduce L. We say that M generalizes K well if M also accepts traces of L that are not in K (the more, the better).

This measure is in tension with three other measures (fitness, precision, and simplicity): the most general model, which accepts all traces, is not very precise (M should not accept traces that are not in L). These measures are described informally in The Need for a Process Mining Evaluation Framework in Research and Practice (doi: http://dx.doi.org/10.1007/978-3-540-78238-4_10), and the paper On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery (doi: http://dx.doi.org/10.1007/978-3-642-33606-5_19) shows how they influence each other in practice.

There is currently no generic mathematical definition to decide, for a given log K, whether it is general enough (i.e., whether it contains enough information to infer all of L from K). This usually depends on the algorithm, the kind of original system S and its language L, and the kind of model one would like to discover.

The most general result that I am aware of is that the log K has to be directly follows-complete. Action B of the system S directly follows action A if there is some execution trace …AB… of S in which A occurs and is immediately followed by B. The log K is directly follows-complete iff for any two actions A, B where B directly follows A in S, there is a trace …AB… in K. Every system S with a finite number of actions has a finite log K that is directly follows-complete, even if the language L of S is infinite. For such logs, there are algorithms that guarantee that S (or a system that is trace-equivalent to S) can be rediscovered. See for instance Discovering Block-Structured Process Models from Event Logs – A Constructive Approach (doi: http://dx.doi.org/10.1007/978-3-642-38697-8_17).
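To make the definition concrete, here is a minimal Python sketch that computes the directly-follows pairs of a set of traces and checks whether a log K is directly follows-complete with respect to a language L. The example log and language are hypothetical toy data, not taken from any real system.

```python
# A minimal sketch of directly follows-completeness, assuming traces are
# given as tuples of action names. K and L below are hypothetical toy data.

def directly_follows(traces):
    """Return all pairs (a, b) such that b directly follows a in some trace."""
    pairs = set()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            pairs.add((a, b))
    return pairs

def is_df_complete(log, language):
    """A log K is directly follows-complete w.r.t. language L iff every
    directly-follows pair of L also occurs in some trace of K."""
    return directly_follows(language) <= directly_follows(log)

# Toy language L of the system and a recorded log K (a subset of L)
L = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "c", "b", "c")]
K = [("a", "b", "c"), ("a", "c", "b")]

print(is_df_complete(K, L))  # True: K already exhibits all pairs of L
```

Note that K is smaller than L (and L could even be infinite through looping behavior, as in the third trace), yet this small K is already directly follows-complete.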

In general, if you have the original system S (or some finite characterization of L), then it is possible to compute whether the log K is directly follows-complete. If you have neither S nor L, then we currently know of no means to estimate how complete K is. This is an open question that we are currently looking into. In essence, you have to estimate the probability that the information in log K suffices to explain the particular language constructs that the original system has (or may have).

We are currently looking deeper into these questions. If you have more pointers on this topic, feel free to drop a comment or an email.

Mining Branching-Time Scenarios From Execution Traces

Over the last two years, I have been working with Shahar Maoz and David Lo on discovering high-level specifications of an application from its execution traces. This topic is also known as specification mining and came up around 2002, because we keep on writing humongous amounts of undocumented code that other people have to use, maintain, or deal with in a later iteration of the software. Getting into this “legacy” code is extremely time consuming, and there is a high chance that you will break something because you do not understand how the existing code works.

Specification mining aims at extracting (mostly) visual representations of existing code that describe its essential architecture and workings at a higher level of abstraction, so that a developer can first get an overview before diving into the code.

We looked particularly at the problem of understanding object interplay in object-oriented code, where you often have many objects that invoke methods of other objects, essentially distributing a behavior over many classes. Everyone who ever tried to understand code that uses a number of architectural patterns combined, such as factories and commands, knows what I mean.

Building on earlier works, we found a way to discover scenarios (something like UML sequence diagrams) that show how multiple objects interact with each other in particular situations. More importantly, we found a way to discover and distinguish two kinds of scenarios:

  • scenarios that describe behavioral invariants of your application (whenever the user presses this button, the application will open that window), and
  • scenarios that describe behavioral alternatives (when the user is on this screen, she may continue with this, that, or even another command).

These two kinds of scenarios combined give a good understanding of how an application works at a high level of abstraction. Here is the presentation showing how it works:

You can also get the full paper, our tool for discovering scenarios, and the data set that we used in our evaluation.

Drop me an email or a comment if you find this useful and/or would like to know more.


some historical statistics on modeling formalisms

Some historical statistics on the use of formalisms for modeling automated systems since 1800. Here is a chart of how frequently the name of each modeling formalism has been used in the literature. I’ve picked Automata, Petri nets, Process algebra, Statecharts, and UML.

You can get the full chart (and add your own favorite formalism) using Google Ngram. What I find surprising is that Petri nets were at some point as prominent in the literature as automata (which have been discussed since the 1800s already). I’m not surprised that UML peaks above all of them by far. On second thought, UML’s decline is not surprising either, as the hype returns to normal levels. What I do find surprising is that process algebras are referenced much less in the literature than even the very particular, though successful, technique of statecharts.


proper interface descriptions for your service

A service is, in computer science terms, a functionality (of a software or of a device) that hides its implementation details from the user. To be usable, the service has to declare at its interface what it does. In the old days, you would get a manual; nowadays, you get a description in some fancy service description language.

Besides discussions about how to write down what a service does, I feel there should also be some thought about what all needs to be described about a service. It’s quite easy to miss something important, as the following video from our new building shows.

Using a Coffee Machine Service

15 minutes for everyone

Aside

If every person living on earth today wanted his/her 15 minutes of unrivaled fame, it would take about 193,778 years from now. We should get started.

Every day, 96 people can have their 15 minutes of fame (24 hours contain 96 slots of 15 minutes each). That means for an estimated 6.79 billion people living on earth today, we need about 70,729,167 days to get everyone famous. That is just about 193,778 years.
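The back-of-the-envelope arithmetic can be sketched as follows, using the 6.79 billion population estimate and plain 365-day years:

```python
# Back-of-the-envelope: how long until everyone had 15 minutes of fame?
slots_per_day = 24 * 60 // 15      # 96 fifteen-minute slots per day
population = 6_790_000_000         # estimated world population
days = population / slots_per_day  # about 70,729,167 days
years = days / 365                 # about 193,778 years
print(round(days), int(years))
```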

from the other side of the review form: why papers get rejected and what the BPM community can do about it

I’ve been reviewing papers for the International Conference on Business Process Management (BPM) over the last 3 or 4 years, in 2011 and 2012 as a member of the Program Committee. In that time, my evaluation of a submitted research paper has, in the vast majority of cases, been to reject it. This year, the best score I gave was a borderline on one paper and a reject on all other papers (8 in total); other years were a bit better, but not much. In the following, I’d like to share my view on why these papers were rejected. Authors may find this interesting to learn how they can improve their chances of getting a paper accepted. Perhaps more importantly, it also sheds light on publishing standards within BPM research and what the BPM community as a whole can do to promote these standards.

Continue reading