experiences in serialization

The short of it: use Jerkson — you can easily serialize case classes:

import com.codahale.jerkson.Json

case class Author(name: String)
case class Document(title: String, data: String, authors: List[Author])
...
// Round-trip: serialize doc to a JSON string, then parse it back.
val encoded = Json.generate(doc)
val decoded = Json.parse[Document](encoded)
decoded should (equal(doc))

I’ve been working some more with Scala, and found that I needed to serialize some data (I’m working with Hadoop).

Unfortunately, as with most things in life, I’ve been presented with all too many choices.

I first tried Avro. It’s part of the Hadoop project, so I thought it would be the most seamless way of getting things working.

Sadly, this was not to be. Avro-generated classes do not implement Hadoop’s Writable interface, which would have allowed them to be dropped in as-is. Instead, you’re required to change all of your mappers and reducers to take AvroKey/AvroValue-wrapped items, and to set your input and output via the AvroInput/OutputFormats. This, while tedious, would be fine, except that I hit a Scala compiler bug when trying to get it all working.

My other thought was to simply convert the Avro object instances into strings myself, and then output strings from Hadoop. Hacky, but hey, it would work.

Except: after digging through the Avro documentation, I couldn’t find a way of just turning my Avro structures into a serialized string. I could send them off to another server via an RPC, but dumping them to a file myself was out of the question. Sigh.

A note to serialization designers: please, please, please — give me an easy way to turn an object into a string.

I then started puttering around with the various JSON projects for Scala. Since there is no standard way of doing it, there are a lot of cobbled-together options, and I had to try several before finding one that worked. Jerkson, despite the odd name, “just works”. I specify my objects as case classes and, like magic, they can be serialized.

So now I’ve gone from outputting nice typed structures from Hadoop to just dumping strings and interpreting them myself. But I’m okay with that – it works.
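
To make the "dumping strings" part a bit more concrete, here is a minimal sketch of the line format I have in mind, assuming Hadoop's default text output of key, tab, value on each line. The TextRecords object and its toLine/fromLine helpers are hypothetical names made up for this illustration; they are not part of Hadoop or Jerkson. The payload is assumed to be a single-line JSON string, such as the one returned by Json.generate.

```scala
// Hypothetical helper for illustration; not part of Hadoop or Jerkson.
object TextRecords {
  // Assumption: records are written as key, tab, value on one line.
  private val Separator = '\t'

  // Build one output line from a key and an already-serialized payload
  // (for example, the JSON string produced for a Document).
  def toLine(key: String, payload: String): String = {
    require(key.indexOf(Separator) < 0, "key must not contain a tab")
    require(payload.indexOf('\n') < 0, "payload must be a single line")
    key + Separator + payload
  }

  // Split a line back into (key, payload). Only the first tab is treated
  // as the separator, so tabs inside the payload are preserved.
  def fromLine(line: String): (String, String) = {
    val i = line.indexOf(Separator)
    require(i >= 0, "line has no tab separator")
    (line.substring(0, i), line.substring(i + 1))
  }
}
```

On the reading side, the payload half of the pair is what would be handed to Json.parse[Document] to get the typed structure back.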
