Read: Towards native higher-order remote procedure calls
Categories: reading-notes.

I do a lot of reading on my iPhone 6+. It’s amazing what you can accomplish reading only a couple pages a day. I write notes in either Day One or Logsit, depending on how long the work is. They don’t do anyone much good buried in there, so I’m sharing my notes.
Fredrikson, Olle, Dan R. Ghica, and Bertram Wheen. “Towards native higher-order remote procedure calls.” IFL 2014. 12 pages.
Introduces Floskel, a Haskell-like call-by-value language with ADTs and location-aware expressions (`expr @ nodeN`). Compiles down to abstract machine bytecode. Authors start with a CES machine (SECD without the dump) and then step gradually through adding a heap (CESH), then adding distributed communication (DCESH), first in a trivial single-node case and then in the general multiple-node, actually distributed case. They finish up with an async single-threaded network of machines with fail-stop fault tolerance via backups.
Each step from one machine to the next advances by a bisimulation proof formalized in Agda. It’s interesting to note that the details of the compilation from the surface syntax down to the bytecode are not formalized; in fact, they’re considered so well-known and commonplace that they’re entirely glossed over.
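
To keep the machines straight in my head, here’s a rough sketch of how the configurations grow at each step. These are my own Haskell types, not the paper’s definitions, so treat them as a mnemonic only.

```haskell
import Data.Map (Map)

-- Rough sketch only: my own types, not the paper's definitions.
data Instr = Instr                  -- opaque compiled bytecode instruction
type Code  = [Instr]

data Value = Closure Code Env       -- CES: closures carry code plus environment
           | Pointer HeapPtr        -- CESH: values can also live behind heap pointers
type Env   = [Value]
type Stack = [Value]

type HeapPtr = Int
type Heap    = Map HeapPtr Value

-- CES: Control (the code being run), Environment, Stack; SECD minus the dump.
data CES  = CES  Code Env Stack

-- CESH: the same machine plus a heap.
data CESH = CESH Code Env Stack Heap

-- DCESH: one CESH-style machine per node; pointers can now be remote,
-- naming a (node, heap address) pair, and nodes exchange messages.
type NodeId    = Int
data RemotePtr = RemotePtr NodeId HeapPtr
```
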
The compilation assumes a fixed number of nodes. All code is also known in advance. Since all nodes have the same code, each can call a remote function on another node just by sending a pointer to the function. A node also knows its node number. (They implemented this by compiling to C and using MPI.)
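
Since every node runs the same compiled program, a “pointer to the function” can simply be an index into a table that all nodes share. Here’s a tiny standalone toy of mine illustrating the idea; it is not the paper’s actual message format.

```haskell
-- Hypothetical toy, not the paper's wire format.
type Value = Int
type FunIx = Int                      -- the "pointer": an index into shared code

funTable :: [Value -> Value]          -- identical on every node
funTable = [(+ 1), (* 2)]

data Msg = Apply FunIx Value          -- "run function #FunIx on this argument"
         | Result Value               -- the answer sent back to the caller

-- What a node does when a request arrives.
handle :: Msg -> Maybe Msg
handle (Apply i v) = Just (Result ((funTable !! i) v))
handle (Result _)  = Nothing          -- a Result goes to the waiting caller instead
```
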
Benchmarks show it does well next to OCaml for the single-node case, and RPC overhead doesn’t hurt much.
New things I learned:
- The two-level operational semantics for the network was new to me, though apparently it’s bog standard in the literature:

  > Both kinds of networks [synchronous and asynchronous] are modelled by two-level transition systems, which is common in operational semantics for concurrent and parallel languages. A global level describes the transitions of the system as a whole, and a local level the local transitions of the nodes in the system. (p. 6, “3.3 Network models”)

  This makes sense - it lets you glue together a bunch of independent machines with some overarching system behavior, and that’s roughly how you’d naïvely expect to approach describing how a network of independent machines acts. But that’s not something I’d ever tried to do formally, so, new to me!
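
  The shape of it, as a minimal Haskell sketch of my own (nothing like the paper’s Agda formalization): a local relation says what one node can do, and a global relation lifts that to the whole network.

  ```haskell
  -- Minimal sketch of a two-level transition system (mine, not the paper's).
  type Label = String                    -- e.g. "tau", "send", "recv"

  -- Local level: the steps one node can take, each with a label.
  type LocalStep node = node -> [(Label, node)]

  -- Global level: the network is the list of node states; a global step
  -- picks some node and lets it take a local step. Communication rules
  -- would add cases that touch two nodes at once.
  globalStep :: LocalStep node -> [node] -> [(Label, [node])]
  globalStep step nodes =
    [ (lbl, prefix ++ n' : suffix)
    | (prefix, n : suffix) <- splits nodes
    , (lbl, n')            <- step n ]
    where
      splits xs = [ splitAt k xs | k <- [0 .. length xs - 1] ]
  ```
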
- They modeled the sync case via rendezvous, where both nodes must be ready to send/receive before the communication happens.
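
  A toy picture of rendezvous (mine, not the paper’s rule): the send and the matching receive are a single atomic step, and neither side can move alone.

  ```haskell
  -- Toy model of a rendezvous between one sender and one receiver.
  data Proc = Ready | SentOk | GotMsg deriving (Eq, Show)

  rendezvous :: (Proc, Proc) -> Maybe (Proc, Proc)
  rendezvous (Ready, Ready) = Just (SentOk, GotMsg)  -- both ready: communicate
  rendezvous _              = Nothing                -- otherwise: blocked
  ```
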
- They modeled the async case as having a pool of in-flight messages; receiving a message corresponded to fishing one out of the “message soup”.
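
  Concretely, something like this (my toy, not the paper’s formulation): a send adds to the multiset of in-flight messages, and a receive fishes a matching one back out.

  ```haskell
  -- Toy "message soup": a multiset of in-flight messages.
  type Msg  = String
  type Soup = [Msg]

  send :: Msg -> Soup -> Soup
  send m soup = m : soup                 -- fire-and-forget: just add to the soup

  recv :: (Msg -> Bool) -> Soup -> Maybe (Msg, Soup)
  recv wanted soup = case break wanted soup of
    (skipped, m : rest) -> Just (m, skipped ++ rest)  -- pull a matching message out
    _                   -> Nothing                    -- nothing suitable in flight
  ```
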
- Lambda lifting has some concrete meaning to me now.

  > For compilation, we require that the terms `t` in all location specification sub-terms `t @ i` are closed. Terms where this does not hold are transformed automatically using lambda lifting [25] (transform every sub-term `t @ i` to `tʹ = ((λ fv. t) @ i) (fv t)`). (p. 7, “3.5 DCESH: The distributed CESH machine”)

  Put into words, if you have an expression with free variables, lambda lifting is when you convert the expression to a function where all the formerly free variables are now arguments to the function. The function no longer has free variables, and implicit binding has been replaced by explicit binding by way of function application.
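
  A hand-worked example of the same transformation in Haskell-ish syntax (my own, and ignoring the `@ i` location annotation, which has no Haskell analogue):

  ```haskell
  -- Before lifting: the sub-term we want to ship elsewhere mentions the
  -- free variables x and y, so it isn't closed:
  --     ... (x + y * 2) ...

  -- After lifting: abstract over the free variables, then apply the
  -- now-closed function to them at the original site:
  --     ... ((\x y -> x + y * 2) x y) ...

  -- The same idea as ordinary top-level definitions:
  lifted :: Int -> Int -> Int
  lifted = \x y -> x + y * 2             -- closed: formerly-free vars are now args

  caller :: Int -> Int -> Int
  caller x y = lifted x y                -- implicit capture replaced by application
  ```
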
- I also learned what a bisimulation is: think “isomorphism for labelled transition systems”.
- A subtle point made on Wikipedia is that a bisimulation is different from just being able to establish separate simulations in both directions (x can sim y and y can sim x).
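
  To make the definition concrete for myself, here’s a naive checker for finite systems (my own sketch, nothing to do with the paper’s Agda proofs): start from the full relation and keep discarding pairs that fail either direction; whatever survives is the largest bisimulation.

  ```haskell
  -- Naive greatest-bisimulation computation for finite LTSs (sketch only).
  type LTS s l = s -> [(l, s)]           -- a state's labelled successors

  bisimilar :: (Eq s, Eq t, Eq l)
            => LTS s l -> [s]            -- first system and its states
            -> LTS t l -> [t]            -- second system and its states
            -> s -> t -> Bool
  bisimilar stepA statesA stepB statesB p q = (p, q) `elem` refine allPairs
    where
      allPairs = [ (a, b) | a <- statesA, b <- statesB ]
      refine rel
        | rel' == rel = rel              -- fixed point: the largest bisimulation
        | otherwise   = refine rel'
        where
          rel' = filter ok rel
          -- (a, b) survives only if each side can match the other's moves within rel.
          ok (a, b) =
            all (\(l, a') -> any (\(l', b') -> l == l' && (a', b') `elem` rel) (stepB b)) (stepA a)
            && all (\(l, b') -> any (\(l', a') -> l == l' && (a', b') `elem` rel) (stepA a)) (stepB b)
  ```

  The key point is that it’s a single relation that has to answer for both directions at once; having some simulation each way doesn’t force the two relations to line up like that.
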
- I cribbed from Wikipedia for an intro to transition systems, too. A labelled transition system itself is a generalization of a finite state machine where the states aren’t necessarily countable, the transitions aren’t necessarily countable, and there’s no notion of initial or accepting states. You can view it as an abstract rewriting system (like how you’d graph out reduction of a lambda calculus expression), only the focus is on the transition labels (interpreted as actions/events) rather than the objects at either end of the transitions (like the classic “do we reach a normal form” question).
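
  In code-shaped terms, an LTS is just a labelled step relation; the labels (actions) are the part you care about, not which states are “start” or “finish”. A toy of mine:

  ```haskell
  -- An LTS as a labelled step relation over arbitrary state and label types.
  type LTS s l = s -> [(l, s)]

  -- Toy example: a vending machine described only by its observable actions.
  data St  = Waiting | Paid deriving (Eq, Show)
  data Act = Coin | Coffee  deriving (Eq, Show)

  vend :: LTS St Act
  vend Waiting = [(Coin, Paid)]
  vend Paid    = [(Coffee, Waiting)]
  ```
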
- Unfortunately, I’d forgotten what some of those things referred to. iBooks’ delightful sync erased this paper from my phone’s library, along with my bookmarks, at some point. Lucky thing this is such a short paper that I was able to locate the context easily, but I take it I should be quicker to write up my notes in future!