Black Pepper Blog

The thoughts and musings of our team

Following on from yesterday's Clojure session, I was interested to attend Rich Hickey's second one, on Persistent Data Structures and Managed References. He'd touched briefly on this topic and it had whetted my appetite. This talk wasn't specifically about Clojure, but about how one can use immutable data structures, the separation of identity and value, and managed references to solve many of the issues that concurrent programs face.

The key to concurrent software design is immutable data structures. If a data structure is immutable then, once constructed, it is always in a consistent state. Multiple threads can never see a partial update, since the values never change. Consider the number 42: it never changes, it is always 42. Similarly, a composite value such as the date "12th March 2009" never changes either. It makes no sense to set its month to "January". There is another date, "12th January 2009", but it is a distinct value from the first one.
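The date example can be sketched on the JVM with `java.time.LocalDate`, which is itself an immutable value type. This is only an illustration of the idea, not something shown in the talk; the class name `ImmutableDateDemo` and the helper `inJanuary` are made up for this sketch.

```java
import java.time.LocalDate;
import java.time.Month;

public class ImmutableDateDemo {
    // Hypothetical helper: derives a new date in January.
    // The date passed in is not modified, because LocalDate has no setters.
    static LocalDate inJanuary(LocalDate date) {
        return date.withMonth(Month.JANUARY.getValue());
    }

    public static void main(String[] args) {
        LocalDate march = LocalDate.of(2009, Month.MARCH, 12);
        LocalDate january = inJanuary(march);

        System.out.println(march);   // 2009-03-12 (unchanged)
        System.out.println(january); // 2009-01-12 (a distinct value)
    }
}
```

"Changing the month" yields a second value, and any thread still holding the first one continues to see 12th March 2009.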

So, values are immutable, whether they are primitives or composite values.

Functions in functional programming (and indeed really in any form of software) work on values and derive new values. For example, the function "add" takes two immutable values and evaluates to a third one that is the sum of the two. It always produces the same output for the same input: it doesn't matter when it is called, or in what order relative to other functions. That makes it an ideal construct for building highly concurrent programs.


I've done some work with Erlang and was impressed by its concurrency model, and I'm completely sold on the idea of immutable objects. However, the JVM is an excellent platform which, because of Java's prevalence in the marketplace, just keeps getting better and better. So it was with interest that I attended the session on Clojure.

Clojure is a Lisp dialect designed to run on the JVM.

I was particularly interested in its use of references to manage concurrency. All data structures are immutable, so they can never be in an inconsistent state. However, one needs to be able to "update" state as a program runs. Effectively this means replacing the "value" (as complex as that may be) that is "identified" by the reference. So concurrency in this scenario comes down to managing how a new instance of a data structure is set as the reference's value when multiple clients want to do this at once. Clojure has four types of references:

Refs use software transactional memory to co-ordinate updates across several references. Agents update their state asynchronously: the update is queued and applied later on another thread. Atoms update a single reference synchronously and retry repeatedly in the event of a conflict. Finally, vars are isolated so that changes to them are only visible to the current thread.
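To make the "retry on conflict" behaviour of atoms a little more concrete, here is a rough Java sketch built on `java.util.concurrent.atomic.AtomicReference`. It is an analogy only and not Clojure's implementation: the names `AtomSketch`, `swap`, `append` and `runDemo` are invented for illustration, and the list is copied on each update rather than sharing structure the way persistent data structures do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class AtomSketch {
    // Hypothetical helper: apply f to the current value and try to install
    // the result. If another thread replaced the value in the meantime,
    // compareAndSet fails and we retry against the newer value.
    static <T> T swap(AtomicReference<T> ref, UnaryOperator<T> f) {
        while (true) {
            T current = ref.get();
            T next = f.apply(current);
            if (ref.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Derives a new unmodifiable list; the input list is never mutated.
    static List<Integer> append(List<Integer> list, int item) {
        List<Integer> copy = new ArrayList<>(list);
        copy.add(item);
        return Collections.unmodifiableList(copy);
    }

    // Several threads "update" the same identity; returns the final size.
    static int runDemo(int threads, int perThread) {
        AtomicReference<List<Integer>> ref =
                new AtomicReference<>(Collections.<Integer>emptyList());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    final int item = i;
                    swap(ref, list -> append(list, item));
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ref.get().size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 250)); // 1000: no update is lost
    }
}
```

The reference is the identity and each list is a value. Readers calling `ref.get()` always see a complete list, and a writer that loses the race simply recomputes from the newer value, which is safe only because the update function has no side effects.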