I’ve got several codebases using squeryl to manage relational data models. Squeryl works well when the database tables are modelled as case classes, linked together via foreign keys modelled as properties of type Long. To distinguish ID fields, I suffix them with _id and name them either after their role in the association or after the target table. My problem is that, despite the naming conventions, I sometimes get my IDs mixed up.

I’m going to illustrate this with a simple data model for capturing marriages.

import org.squeryl.KeyedEntity

val male = 1
val female = 2
case class Person(id: Long, name: String, sex: Int)
  extends KeyedEntity[Long]

case class Marriage(id: Long, partner_1_id: Long, partner_2_id: Long)
  extends KeyedEntity[Long]

// table[...] comes from org.squeryl.Schema, so these live inside a Schema object
val people = table[Person]
val marriages = table[Marriage]

Now, let’s model the recent royal marriage:

// The recent royal marriage
val william = people.insert(Person(0L, "William", male))
val kate = people.insert(Person(0L, "Kate", female))
val williamAndKateAreMarried =
  marriages.insert(Marriage(0L, william.id, kate.id))

// Now for something we should not be able to do but can
val janeDoe = people.insert(Person(0L, "Jane", female))
val deluded =
  marriages.insert(Marriage(0L, janeDoe.id, williamAndKateAreMarried.id))

In the first case we’ve done everything right. In the second, perhaps Jane Doe wants to be the third member of this royal marriage, but what we’ve written is nonsense according to the relational model: a marriage ID has been passed where a person ID belongs. If we are lucky, the RDBMS will choke. More likely, the ID of the royal marriage will happen to fall within the range of the IDs used for people, and the error will be silent. This smells like the sort of error that should be prevented using types, but exactly what types can we use to make these IDs more specific?

Tagged types to the rescue

What we need are strongly typed numerical IDs. In the experimental branch of scalaz there’s some code that offers a solution, in the form of Tagged types. It doesn’t seem to have made it into trunk yet, so for now I’ve copied the idea over. The code, after I’ve butchered it a bit, looks like this.

object Tag {
  // Unboxed newtypes, credit to Miles Sabin.
  type Tagged[T] = {type Tag = T}
  type @@[T, Tag] = T with Tagged[Tag]

  @inline def apply[A, T](a: A): A @@ T = a.asInstanceOf[A @@ T]
  @inline def apply[T](a: Long): Long @@ T = a.asInstanceOf[Long @@ T]

  @inline def untag[A, T](a: A @@ T): A = a.asInstanceOf[A]
  @inline def untag[T](a: Long @@ T): Long = a

  implicit def ordTag[A, T](implicit aOrd: Ordering[A]): Ordering[A @@ T] = Ordering.by(untag[A, T])
}
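
To get a feel for what the tag does on its own, here is a minimal sketch with no squeryl involved. It repeats the same encoding, reduced to the Long case, so that it is self-contained; the object name TagBasics, the helper names tag and double, and the empty marker traits are made up for the illustration.

```scala
object TagBasics {
  // Same encoding as the Tag object above, reduced to the Long case.
  type Tagged[T] = { type Tag = T }
  type @@[A, T] = A with Tagged[T]

  def tag[T](a: Long): Long @@ T = a.asInstanceOf[Long @@ T]

  // Hypothetical marker types, used only as compile-time labels.
  trait Person
  trait Marriage

  val personId: Long @@ Person = tag[Person](3L)
  val marriageId: Long @@ Marriage = tag[Marriage](3L)

  // A tagged Long is a subtype of Long, so it can be passed
  // wherever a plain Long is expected.
  def double(n: Long): Long = n * 2
  val doubled: Long = double(personId)

  // The reverse direction and cross-tag assignments are rejected.
  // Neither of these lines would compile:
  // val p: Long @@ Person = 3L
  // val q: Long @@ Person = marriageId

  def main(args: Array[String]): Unit = {
    assert(doubled == 6L)
    println("doubled = " + doubled)
  }
}
```

The tag exists only at compile time: at runtime personId and marriageId are both just the number 3, which is why the cast in tag is safe.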

We can now re-work our data objects to make use of these tagged types.

case class Person(id: Long @@ Person, name: String, sex: Int)
  extends KeyedEntity[Long @@ Person] {
  def this(name: String, sex: Int) = this(Tag[Person](0L), name, sex)
}

case class Marriage(id: Long @@ Marriage,
                    partner_1_id: Long @@ Person,
                    partner_2_id: Long @@ Person)
  extends KeyedEntity[Long @@ Marriage] {
  def this(partner_1_id: Long @@ Person, partner_2_id: Long @@ Person) =
    this(Tag[Marriage](0L), partner_1_id, partner_2_id)
}

Now, all of the IDs are type-safe Longs. It’s no longer possible to use a Person ID where a Marriage ID should go, or vice versa. I’ve chosen to add an extra constructor so that people can make new instances not yet in the database without needing to worry about initializing the ID.

// The recent royal marriage
// (auxiliary constructors need `new`; the generated apply only covers the primary constructor)
val william = people.insert(new Person("William", male))
val kate = people.insert(new Person("Kate", female))
val williamAndKateAreMarried =
  marriages.insert(new Marriage(william.id, kate.id))

// this now barfs
val janeDoe = people.insert(new Person("Jane", female))
val deluded =
  marriages.insert(new Marriage(janeDoe.id, williamAndKateAreMarried.id))

That final deluded marriage no longer type-checks. The compiler will complain that Long @@ Marriage does not conform to Long @@ Person, and our stalker doesn’t get to corrupt our database.
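
If you want to try this without a database, the following is a rough, squeryl-free sketch of the same model. The IDs are assigned by hand rather than by an insert, and the names TypedIdsDemo, tag, royal and forced are invented for the example.

```scala
object TypedIdsDemo {
  type Tagged[T] = { type Tag = T }
  type @@[A, T] = A with Tagged[T]
  def tag[T](a: Long): Long @@ T = a.asInstanceOf[Long @@ T]

  // Plain case classes; no KeyedEntity, so nothing here touches squeryl.
  case class Person(id: Long @@ Person, name: String)
  case class Marriage(id: Long @@ Marriage,
                      partner_1_id: Long @@ Person,
                      partner_2_id: Long @@ Person)

  val william = Person(tag[Person](1L), "William")
  val kate    = Person(tag[Person](2L), "Kate")
  val janeDoe = Person(tag[Person](3L), "Jane")

  val royal = Marriage(tag[Marriage](1L), william.id, kate.id)

  // Uncommenting the next line gives a type mismatch:
  // a Long @@ Marriage was found where a Long @@ Person is required.
  // val deluded = Marriage(tag[Marriage](2L), janeDoe.id, royal.id)

  // The tag does not stop deliberate misuse, it only makes it visible:
  // the marriage ID has to be re-tagged by hand to get past the compiler.
  val forced = Marriage(tag[Marriage](2L), janeDoe.id, tag[Person](royal.id))

  def main(args: Array[String]): Unit = {
    assert(royal.partner_2_id == kate.id)
    println(royal)
  }
}
```

The forced value shows the limit of the technique: mixing up IDs now takes an explicit re-tag, which is easy to spot in review, but it is still possible.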

Putting it to work

This is great. It prevents a whole range of mistakes. What else is it good for? Well, when doing ORMish things, I often want to transparently convert backwards and forwards between instances and their IDs. After all, in a given scope the ID and the instance are 1:1, so they should be interchangeable. However, when IDs were Longs, there wasn’t much that I could do. Now that they are typed, we can use Scala’s implicit magic to do the work for us.

// KeyType is assumed to be an alias for the underlying key type, e.g. type KeyType = Long
implicit def keyedEntityToId[T <: KeyedEntity[KeyType @@ T]]
  (t: T): KeyType @@ T = t.id

When this implicit is in scope, we can use an instance wherever an ID of that type is needed. This allows us to simplify our royal wedding to:

// The recent royal marriage
val william = people.insert(new Person("William", male))
val kate = people.insert(new Person("Kate", female))
val williamAndKateAreMarried =
  marriages.insert(new Marriage(william, kate)) // no need for .id
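
Here is a small self-contained sketch of the instance-to-ID conversion, assuming Scala 2 and Long keys throughout. The trait Keyed is a made-up stand-in for squeryl’s KeyedEntity so that the example compiles without the library, and the names EntityToIdDemo, tag and keyedToId are likewise invented.

```scala
import scala.language.implicitConversions

object EntityToIdDemo {
  type Tagged[T] = { type Tag = T }
  type @@[A, T] = A with Tagged[T]
  def tag[T](a: Long): Long @@ T = a.asInstanceOf[Long @@ T]

  // Hypothetical stand-in for KeyedEntity: anything that exposes a typed id.
  trait Keyed[K] { def id: K }

  case class Person(id: Long @@ Person, name: String)
    extends Keyed[Long @@ Person]

  case class Marriage(id: Long @@ Marriage,
                      partner_1_id: Long @@ Person,
                      partner_2_id: Long @@ Person)
    extends Keyed[Long @@ Marriage]

  // Any entity keyed by its own tagged Long can stand in for that ID.
  implicit def keyedToId[T <: Keyed[Long @@ T]](t: T): Long @@ T = t.id

  val william = Person(tag[Person](1L), "William")
  val kate    = Person(tag[Person](2L), "Kate")

  // The Person instances are converted to Long @@ Person by keyedToId.
  val royal = Marriage(tag[Marriage](1L), william, kate)

  // Still rejected: a Marriage converts to Long @@ Marriage, not Long @@ Person.
  // val deluded = Marriage(tag[Marriage](2L), kate, royal)

  def main(args: Array[String]): Unit = {
    assert(royal.partner_1_id == william.id)
    println(royal)
  }
}
```

Because the result type of the conversion carries the entity’s own tag, the convenience doesn’t weaken the type safety gained earlier.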

What about the other way? Well, let’s say we had a cache of people by their IDs.

implicit def personCache(pID: Long @@ Person): Person = ...
val wife: Person = williamAndKateAreMarried.partner_2_id

In this context, the wife will be looked up in the cache using the id for the 2nd partner in the royal marriage.
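
As an illustration only, here is one way such a cache might look, assuming Scala 2 and an immutable in-memory snapshot rather than a real database. PersonCache, lookup and wifeName are hypothetical names; the body of the real personCache above is not shown in this post, so this is not a reconstruction of it.

```scala
import scala.language.implicitConversions

object PersonCacheDemo {
  type Tagged[T] = { type Tag = T }
  type @@[A, T] = A with Tagged[T]
  def tag[T](a: Long): Long @@ T = a.asInstanceOf[Long @@ T]

  case class Person(id: Long @@ Person, name: String)
  case class Marriage(id: Long @@ Marriage,
                      partner_1_id: Long @@ Person,
                      partner_2_id: Long @@ Person)

  // Hypothetical cache: an immutable snapshot of people keyed by typed ID.
  class PersonCache(people: Seq[Person]) {
    private val byId: Map[Long @@ Person, Person] =
      people.map(p => (p.id, p)).toMap

    // Throws NoSuchElementException if the ID is not in the snapshot.
    implicit def lookup(pID: Long @@ Person): Person = byId(pID)
  }

  val william = Person(tag[Person](1L), "William")
  val kate    = Person(tag[Person](2L), "Kate")
  val royal   = Marriage(tag[Marriage](1L), william.id, kate.id)

  def wifeName(cache: PersonCache): String = {
    // The conversion is only active inside this method.
    import cache.lookup
    val wife: Person = royal.partner_2_id
    wife.name
  }

  def main(args: Array[String]): Unit = {
    val cache = new PersonCache(Seq(william, kate))
    assert(wifeName(cache) == "Kate")
    println(wifeName(cache))
  }
}
```

Importing the conversion from a particular cache instance keeps the lookup tied to a known scope, rather than leaving a global implicit that could silently serve stale data.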

Perspectives

Tagged types have made my ORM code far more explicit, at the cost of more characters to type and read. Applying it to existing code-bases has actually uncovered a number of silent errors, so it’s certainly worth doing. I did try aliasing the tagged type in the appropriate companion object as ID so that we can use Person.ID in place of Long @@ Person, but that caused the compiler to get itself into a twist over circular references – perhaps this is fixed in releases later than 2.9.1, but I’ve not checked. I’ve found that the implicit lookup trick is nice, but in read/write situations it is important to scope the cache appropriately. Perhaps I should add cache implementations that are tied to transactions or sessions or some other scoping, but I don’t have enough experience to say much more about this.

This is definitely an approach that I’d recommend to others. Hopefully soon scalaz7 will be out and you won’t have to roll your own solution, or alternatively, a more aptly-named version of the same trick could be incorporated directly into squeryl and perhaps wrapped up in some analogue of KeyedEntity.

1. Dave Whittaker says:

Very cool Matt. I was wondering what the heck the @@ represented in the messages you had posted to the Squeryl list.

