chymyst-core

Other work on Join Calculus

Here are other implementations of Join Calculus that I was able to find online.

The code of Chymyst is a clean-room implementation of join calculus, not based on the code of any of the previous work. The chosen syntax in Chymyst aims to improve upon the design of He Jiansen’s ScalaJoin as well as building on the experience of implementing CocoaJoin / AndroJoin.

Improvements with respect to He Jiansen’s ScalaJoin

Compared to ScalaJoin (Jiansen He’s 2011 implementation of JC), Chymyst Core offers the following improvements:

Chymyst Core:

val a = m[Int]
val c = b[Int, Int]
site(
  go { case a(x) + c(y, reply) 
    a(x + y)
    reply(x)
  }
)
a(1)

ScalaJoin:

object join1 extends Join {
  object a extends AsyName[Int]
  object c extends SynName[Int, Int]

  join {
    case a(x) and c(y) 
      a(x + y)
      c.reply(x)
  }
}
a(1)

Improvements with respect to JoCaml

As a baseline reference, the most concise syntax for JC is available in JoCaml, which uses a modified OCaml compiler:

def a(x) & c(y) =  
   a(x+y) & reply x to c
spawn a(1)

In the JoCaml syntax, molecule emitters a and c are declared implicitly, together with the reaction. Implicit declaration of molecule emitters (“channels”) is not possible in Chymyst because Scala macros do not allow us to insert a new top-level name declaration into the code. So, molecule declarations need to be explicit and show the types of values (b[Int, Int] and so on). Other than that, Chymyst’s syntax is closely modeled on that of ScalaJoin and JoCaml.

Another departure from JoCaml is that in Chymyst, reactions can declare any number of repeated input molecules, as well as repeated blocking molecules. The novel reply syntax disambiguates the destination of the reply value:

go { case a(x1, reply1) + a(x2, reply2)  reply2(x1); reply1(x2) }

This reaction is impossible to express in JoCaml directly.

Other tutorials on Join Calculus

This book can be seen as a pragmatic, non-theoretical introduction to practical use of Join Calculus, written for programmers.

Previously I wrote a tutorial for JoCaml. (However, be warned that the old JoCaml tutorial is unfinished and contains mistakes in some of the more advanced code examples.)

See also my recent presentation at Scalæ by the Bay 2016. (Talk slides are available). That presentation covered an early version of Chymyst.

There are a few academic papers on Join Calculus and a few expository descriptions, such as the Wikipedia page on Join Calculus or the JoCaml documentation. Unfortunately, I cannot recommend reading these sources: they are unsuitable for learning about the chemical machine / Join Calculus paradigm.

I first found out about the “Reflexive Chemical Abstract Machine” from the introduction in one of the early papers on Join Calculus.

Do not start by reading academic papers if you never studied Join Calculus - you will be unnecessarily confused. All Join Calculus papers I’ve seen are written in an obscure jargon and are intended for advanced computer scientists.

As another comparison, here is some code in academic join calculus notation, taken from this tutorial:

def newVar(v0) def put(w) etc.

This code creates a shared value container val with synchronized single access.

The equivalent Chymyst code looks like this:

def newVar[T](v0: T): (B[T, Unit], B[Unit, T]) = {
  val put = b[T, Unit] 
  val get = b[Unit, T]
  val vl = m[T] // have to use `vl` since `val` is a Scala keyword
  
  site(
    go { case put(w, ret) + vl(v)  vl(w); ret() },
    go { case get(_, ret) + vl(v)  vl(v); ret(v) }
  )
  vl(v0)
  
  (put, get)
}

I also do not recommend reading the Wikipedia page on Join Calculus. As of August 2018, the Wikipedia page says this about Join Calculus:

The join-calculus … can be considered, at its core, an asynchronous π-calculus with several strong restrictions:

However, as a language for programming, the join-calculus offers at least one convenience over the π-calculus — namely the use of multi-way join patterns, the ability to match against messages from multiple channels simultaneously.

This explanation is impossible to understand unless you are already well-versed in the research literature. (For instance, I’m not sure what it means to have “communication on defined names”; for instance, how could we communicate on undefined names…?)

Academic literature on Join Calculus typically uses terms such as “channel” and “message”, which are not helpful for understanding how JC works and how to write concurrent programs in JC. Indeed, a “channel” in JC holds an unordered collection of messages, rather than an ordered queue or mailbox, as the word “channel” may suggest. Another metaphor for “channel” is a persistent pathway for sending and receiving messages, but this is also far from what a JC “channel” actually does.

The word “message” suggests that a mailbox or a queue receives messages and processes them one by one. This is very different from what happens in JC, where a reaction may wait for several “messages” at once, and different reactions may contend on several “messages” they wait for. “Messages” in JC are better understood as specially labeled and managed data that has been made available for concurrent computations at a dedicated storage location (“reaction site”).

The JoCaml documentation is especially confusing as regards “channels”, “messages”, and “processes”. It confuses a “process” as a fragment of code running on some CPU thread, and a “process” as the side-effect of emitting a molecule that adds data to a reaction site.

For these reasons, the JoCaml documentation is ill-suited as a pedagogical introduction to using join calculus or JoCaml. Given the lack of tutorial-level documentation, it is not surprising that the JC paradigm remains unknown to the programming community.

Instead of using academic terminology, I always follow the chemical machine metaphor and terminology when talking about Chymyst programming. Here is a dictionary:

Chemical machine Academic Join Calculus Chymyst code
input molecule message on a channel case a(123) ⇒ ... // pattern-matching
molecule emitter channel name val a: M[Int]
blocking emitter synchronous channel val q : B[Unit, Int]
reaction process val r1 = go { case a(x) + ... ⇒ ... }
emitting a molecule sending a message a(123) // side effect
emitting a blocking molecule sending a synchronous message q() // returns Int
reaction site join definition site(r1, r2, ...)

Why the industry ignores Join Calculus

I conjecture three principal reasons for this:

  1. Lack of readable tutorial documentation about the new paradigm: Almost all available documentation on Join Calculus is 100% incomprehensible to ordinary software practitioners.
  2. Lack of industry-strength implementations in mainstream languages: The only well-maintained implementation (JoCaml) is in a language that is mostly used for university teaching. Apart from Chymyst, all other existing implementations are unmaintained academic projects. There is no industry-strength implementation of JC that provides facilities for unit testing, message-level debugging or logging, performance tuning, error recovery, or low-level thread control. (As an example, a facility to terminate the concurrent computations is not present in any implementations of JC apart from Chymyst.)
  3. Lack of a developed set of design patterns for concurrent programming in JC: Most articles and books on JC are limited to academic or toy examples, and there is no documented systematic development of design patterns for implementing concurrency tasks such as barriers, rendez-vous, critical sections, map/reduce, fork/join, asynchronous pipelines with backpressure, throttling, cancellation, timeouts, and so on.

In addition, there is no well-understood Join Calculus model for distributed computations, persistence, or consensus across a network of machines, that would compare, say, with Akka’s persistence and cluster features.

Referee criticism on the Chymyst paper from 2017

I submitted a paper on Chymyst to the Scala’2017 ACM conference. The draft was rejected with the following comments.

Referee A

It is a good idea that someone tends to a more practically useful JC implementation. In order to become more popular, JC needs compelling real-world use cases beyond academic examples.

You may want to compare against another recent JC implementation in Scala [1]. Moreover, it would be interesting if you could explain what kind of implementation strategy is used and how it scales. For that matter, the algorithm from [2] may be relevant. For “Modern Concurrency Abstractions for C#” there is a TOPLAS paper that presumably supersedes the ECOOP paper.

Detailed comments:

Sec 1:

Sec 2:

[1] http://dl.acm.org/citation.cfm?id=2577082

[2] https://dl.acm.org/citation.cfm?id=2076021.2048111

Referee B

Overall Chymyst looks like a well-designed project, and I like that it offers a way to explore programming with the join calculus in Scala that is easily accessible. Before publication, however, some aspects of the paper should be revised and improved:

1) The paper provides almost no details on some core aspects of the implementation. On p.6, for instance, the process search is only hinted at, while it seems to be one of the most important aspects of the system. We also learn about the existence of macros to implement static analysis without any further explanations. These aspects are sure to be of high interest to readers of the paper and attendees. For instance, we don’t know if the support for non-linear patterns comes for an implementation technique.

2) One of the main claims is that the library is ‘industry-strength’. While the bullet point list on p.6 is an excellent reference, a) not all points are in fact available in Chymyst, and, more importantly, b) the paper only presents very simple examples, and makes no reference to the existence of larger programs or benchmarks.

3) Generally, I found the comments on pedagogical aspects to be lacking in supportive facts. It is perfectly possible that the “visually suggestive terminology” of molecular reactions is better suited for teaching about the join calculus, but this should only be presented as a fact if it can be backed. Some opinions are stated unsubstantiated:

Misc. points:

Summary:

Pros:

Cons:

Referee C

This is a good paper which describes a library that seems quite useful. I’m particularly happy about the discussion of pedagogy – how we teach complex concepts is sadly under-discussed.

The major limitation of this paper is the lack of evaluation at any level. Have any applications been built with this library, that could be described? What about experience with the teaching approach? Or performance numbers? Or an evaluation of what features of JoCaml can be expressed by Chymyst? Any or all of those would make the claims of the paper easier to credit, and demonstrate the contributions more fully.

A few smaller issues: