Sunday, April 20, 2008

Sequence and Order in Erlang

I’ll be at ThoughtWorks Chicago headquarters for this weeks’ Away Day gathering. One of my slides is about the order of evaluation and message sequencing in Erlang. I like to think of it as an "Erlang Puzzler". For anyone who won’t be there, take a look at this 'contrived' module. It is composed of three functions: sequence/0, maketuple/0 and echo/0.

sequence() ->
Pid = spawn(fun echo/0),
register(echoprocess, Pid),
whereis(echoprocess) ! maketuple(self(), now()),
whereis(echoprocess) ! maketuple(self(), now()).

echo() ->
{From, {A,B,C}} ->
From ! {A,B,C},

maketuple(Self, Now) -> {Self, Now}.
What does this do? The sole public sequence function spawns the self documented echo function as an Erlang process. It then registers this process as “echoprocess” and sends it two messages. Both messages are return values of maketuple, which simply yields its two arguments as a tuple.

There are a few subtleties here.

Strict Evaluation

Pay attention to the two calls to maketuple. The arguments applied to maketuple are calls to built-in functions self/0 and now/0. Are these arguments evaluated before or after maketuple is evaluated? In a functional programming language like Haskell the calls to self and now would be delayed until the value of the arguments were needed within maketuple. This means self or now could start after the call to maketuple, finish before it, or possibly never be evaluated. This is called lazy evaluation. Function arguments in Erlang however are strictly evaluated, meaning self and now are evaluated before maketuple.

Referential Transparency

What is the order of evaluation for self and now? It is natural for an experienced imperative programmer to assume self precedes now. Some would say this is in observation of the principle of least astonishment.

The order of evaluation for self and now is undefined. In a deterministic world without side effects, the order of evaluation is inconsequential - just like the commutative property of addition in grade school.

Order of Evaluation for Send Operands

The operands of the send operator (!) are calls to the built-in function whereis/1 is and maketuple. Which is called first, whereis or maketuple? Left to right? Right to left? Again, the order is undefined.

Delivery Order of Asynchronous Messages

The sequence function sends two messages. In which order are the received? An undefined order would really complete the process/actor/mailbox metaphor. When you send two letters in the mail, is the recipient guaranteed to read each letter in the order they were sent? Of course not. An undefined order of delivery would also sit well with many other asynchronous communication mediums.

It turns out the order of message delivery is in fact guaranteed to be the order in which they were sent. Despite the various possible execution paths of the sequence function, things are pretty static on the other side in the echoprocess. This is not to say that the second message will be delivered immediately after the first message - there could in theory be messages from other processes in the mix.

1 comment:

Ulf Wiger said...

There are actually several situations where the message ordering guarantee is extremely useful, but the only guarantee given is that all messages from A to B will arrive in the same order as they were sent. If messages are sent from or via different processes, the arrival order is undefined.

One might note, though, that even in the first case, the sender cannot assume that the recipient actually processes the messages in the order in which they were sent. Selective receive makes it possible for the receiver to "re-order" the messages.