Building Systems for Composition

The promise of Object Oriented languages was that we would be able to use Objects to interconnect and compose into larger and more complex systems.

A canonical example is the Person hierarchy:

class Person {
  def getName: String
  def getAddress: Address
  def getAge: Int
}

class Student extends Person {
  def getCurrentClasses: List[Clazz]
  def getGrades: List[Grades]
}

class Teacher extends Person {
  def getStudents: List[Student]
  def getClasses: List[Clazz]
}

We've got people that can be either students or teachers. But what about student-teachers? What about folks who have keys to a building?

Even in a trivial example, we start getting into situations where the class hierarchies do not compose, so we need to add interfaces and deal with the challenge of keeping the implementation of the interfaces up to date across all the classes that implement the given interface.

This is all nasty stuff and leads to code that cannot be easily composed or maintained.

The hidden problem

There's an additional hidden problem in most OO design. Because instances of a given class cannot create copies with just one piece of data changed, most OO systems are designed around changing instances in-place… or mutating the data in a given object rather than creating a copy with just one value changed.

This leads to a huge problem when you try to distribute a computation problem across multiple threads in a single address space. There must me some assertion of ownership of object references (locks) that are granular enough to allow multiple threads to act on the data, but not fine grained enough to have deadlock issues or a high cost of acquiring and releasing locks.

So, in addition to lack of composition, the traditional OO paradigm does not do well in concurrent situations.

The callback issue

In Java, it's a common pattern to have a call-back interface. Something like:


interface Foo {
  def computeSomething(in: SomeType): SomeOtherType
}

Java code is littered with these callbacks. In Java 1.1, we even got anonymous inner classes so that we could write the simple callbacks in-place without having to explicitly define a class to handle the issue.

Wouldn't it be nice to have a single interface that could be used (with the right input and output types) any place that you have one of these call-back interfaces?

Functional & Composition

Many of us are familiar with Unix pipes. They are simple, powerful ways of composing transformation operations on data (usually text). Utilities like sed, awk, grep, etc. are tools form transforming input to output.

We can use this transformation model for a whole lot of different kinds of data. And it turns out that the transformation model (rather than the mutation model) allows you to compose pipelines of transformation that have the same signature as individual elements of the pipeline.

So, if we define a function that takes a String and returns a String as String => String and a function that takes a String and returns an Int as String => Int, we can also compose the functions into a function that takes a String and returns an Int:

def f1: String => String
def f2: String => Int
def composed: String => Int = f1 andThen f2

You can even compose a whole list of functions into a single function that represents the transformation pipeline:

def lof: List[String => String]
def composed: String => String = lof.foldLeft(s => s)((a, b) => a andThen b)

It's about the operations, not the data

A key tenet of OO is to combine the data with the operations that can take place on the data. Classes map the call signature (name & parameters) to the function.

By divorcing the code operations from the data, we get code operations that can be re-used a lot more effectively than OO classes.

Average age of students in a class

If we wanted to be absurd, we could create a special class that's a ListOfStudents. It's just like a regular list, except that it only takes students and it has a method that will give us the average age of the students in a given class.


class ListOfStudents extends List[Student] {
  def averageAgeByClass(cl: Clazz): Int = {
    var cnt = 0
    var sum = 0

    foreach(s => if (s.getClasses.contains(cl)) {cnt += 1; sum += s.getAge})

    sum / cnt
  }
}

So, we've re-written the average function (badly) and we have a special class that's just a container for students.

What if we made generic functions for filtering by class and for averaging across a List of items that can be converted into an Int.


def genericFilter[A, B](f: A => List[B],b: B)(lst: List[A]): List[A] = lst.filter(a => f(a).contains(b))
def average[A](f: A => Int)(lst: List[A]): Option(Int) = 
  if (lst.isEmpty) None else Some(lst.map(f).reduceLeft(_ + _) / lst.length)

We've built a semi-generic filter and a very generic (and zero-count safe) average function.

How do we compose them?


def studentAgeByClass(cl: Clazz, lst: List[Student]): Option[Int] = 
 (genericFilter(_.getCurrentClasses, cl) andThen average(_.getAge)).apply(lst)

Note how we created partially applied functions (functions that had some, but not all of their parameters supplied), composed those functions, and applied those functions to our data set.

This form of composition allows us a lot more flexibility is defining blocks of logic and then combining those blocks together. When it comes to the specifics of applying our functions, we need only create simple functions (like convert a Student to an Int) that bridge between the logic of the generic function and the specifics of our data.

More

In a future blog post, I will explore Scala's partial functions (functions that are defined for some values of a type) and how they are used in Lift to compose into global HTTP request handlers.