Choosing Scala (Somewhat Reluctantly)

Scala is the implementation language of choice on the next big project I'll be working on. For me, that's a tiny bit of a disappointment, as I've been slowly building up my knowledge, not of Scala, but of Haskell. But the high-level software at the observatory is predominantly Java, so that's a trump card for Scala.

Like everyone, we are looking at a future of vastly larger datasets (in astronomy, we have 14 billion years of legacy data) and know that concurrency management using the Java language's threading model is essentially unworkable -- it's far more difficult than "just" the problem of memory management in non-garbage-collected languages. So we need a language that gives us at least a fighting chance for highly distributed concurrent data manipulation.
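To make "a fighting chance" a little more concrete, here is a minimal sketch of the style Scala makes possible: split the data into chunks, sum each chunk concurrently, and combine the partial results, with no explicit threads or locks. It assumes scala.concurrent.Future from today's standard library; nothing above says which concurrency tools the project will actually use, and the names ChunkedSum and parallelSum are made up for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical example: the object and method names are illustrative only.
object ChunkedSum {
  // Sum each chunk as its own Future on the global execution context,
  // then combine the partial sums. chunkSize must be positive.
  def parallelSum(data: Vector[Long], chunkSize: Int): Future[Long] = {
    val partials: Vector[Future[Long]] =
      data.grouped(chunkSize).map(chunk => Future(chunk.sum)).toVector
    // Future.sequence turns Vector[Future[Long]] into Future[Vector[Long]].
    Future.sequence(partials).map(_.sum)
  }

  def main(args: Array[String]): Unit = {
    val total = Await.result(parallelSum((1L to 1000L).toVector, 100), 10.seconds)
    println(total) // 500500
  }
}
```

The blocking Await is only there so the demo can print a result; in a real pipeline you would keep composing Futures rather than wait on them.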

Scala is one such language, and there is a bias towards strong typing at the observatory (personally, I have a slight bias towards strong typing, but have to say that when writing Scala that interacts with Java legacy code, there's still an annoyingly large amount of "finger typing").

For me, something that's interesting about this choice is that, at the language level, Scala would not be my first choice as a functional language. Haskell has some exceptional learning resources:

  • Erik Meijer's Channel 9 series of lectures: I'm an unrepentant Meijer fanboy, but even if you're not, I think these 13 lectures are an amazing free resource
  • Real World Haskell: When I read this for the Jolt Awards a few years ago, I labeled it the best language tutorial book I'd read since Practical Common Lisp
  • Learn You A Haskell For Great Good: This is a new book, but I'm going to be talking it up during this year's Jolt judging, as I think it's even better than Real World Haskell. In my opinion, if you're going to be teaching functional programming, you have to push the functional concepts from the very beginning, just as Grady Booch's classic Object-Oriented Design with Applications put OO front and center for a generation of structured programmers. "Learn You A Haskell..." does that, although it falls short of Booch in marrying the concepts to large practical applications, or even to the small but complete and useful programs that Dr. Dobb's Andrew Binstock correctly advocates in his recent column Lax Language Tutorials.

Another thing I fear about Scala (and would also fear about F#) is that the two major managed runtimes (the Java Virtual Machine and .NET's Common Language Runtime) may end up being drags on the languages. The virtual machines of those platforms embody certain object-oriented principles and are (intentionally) not as flexible as native hardware when it comes to stack and heap manipulation. Plus, neither the JVM nor the CLR reifies a 21st-century concurrency model. In short, from a back-end compiler perspective, I think you're probably better off with a blank slate than with either of those managed platforms.

Finally, since I guess the gist of this blog post has become my enumeration of risks, I am not sure that Scala syntax is as good for writing parsers as it could be. Admittedly, opportunities to write parsers are quite rare in mainstream programming, but it happens to be something I find wildly enjoyable (tremendously frustrating in the moment, but tremendously rewarding when you break through). Scala is clearly much better than mainstream languages, but if you're going to include pattern matching in the language and parser combinators in the standard library, it seems a pity if, e.g., newline syntax has to be worked around.
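One plausible reading of the newline complaint (an assumption; the paragraph above doesn't spell it out) is semicolon inference. In Scala 2, a continuation line that begins with an infix operator such as | is treated as the start of a new statement, so multi-line grammar rules have to end each line with the operator instead. The sketch below uses a deliberately tiny hand-rolled combinator type, P, which is hypothetical and is not the API of the standard parser combinator library; it exists only so that the example is self-contained.

```scala
// Hypothetical mini-combinator: a parser is a function from input to an
// optional (result, remaining input) pair.
final case class P[A](run: String => Option[(A, String)]) {
  // Sequence: run this parser, then `next` on whatever input is left.
  def ~[B](next: => P[B]): P[(A, B)] = P { in =>
    run(in).flatMap { case (a, rest) =>
      next.run(rest).map { case (b, rest2) => ((a, b), rest2) }
    }
  }
  // Alternative: fall back to `other` if this parser fails.
  def |(other: => P[A]): P[A] = P(in => run(in).orElse(other.run(in)))
  // Transform a successful result.
  def map[B](f: A => B): P[B] = P(in => run(in).map { case (a, rest) => (f(a), rest) })
}

object NewlineDemo {
  // Match a literal prefix.
  def lit(s: String): P[String] =
    P(in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None)

  // Match one or more leading digits.
  val number: P[Int] = P { in =>
    val digits = in.takeWhile(_.isDigit)
    if (digits.isEmpty) None else Some((digits.toInt, in.drop(digits.length)))
  }

  // The first line ends with `|`, so the expression continues onto the next
  // line. Moving `|` to the start of the second line would end the statement
  // early in Scala 2.
  val sum: P[Int] =
    (number ~ lit("+") ~ number).map { case ((a, _), b) => a + b } |
    number

  def main(args: Array[String]): Unit = {
    println(sum.run("2+3")) // Some((5,))
    println(sum.run("7"))   // Some((7,))
  }
}
```

Scala 3 relaxes this rule and accepts leading infix operators, but in Scala 2 code the trailing-operator layout (or wrapping the whole rule in parentheses) is the usual way around it.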

Having said all that, I am pushing to develop a decent level of competence in Scala and hope to post experience reports and code samples from time to time. Stay tuned and please feel free to jeer from the sidelines...


DISCLAIMER: This blog is my personal writing and represents my own undoubtedly biased, retrospectively embarrassing, and generally ill-informed opinions and not those of Gemini Observatory or anyone else who works there.