Evolving past reference type patterns

Sat Apr 16 15:34:55 UTC 2022

> On Apr 15, 2022, at 10:10 PM, Guy Steele <guy.steele at oracle.com> wrote:
> 
> That said, I am always (or at least now) a bit leery of language designers motivating a new language feature by pointing out that it would make a compiler easier to write. As I have learned the hard way on more than one language project, compilers are not always representative of typical application code. (Please consider this remark as only very minor pushback on the form of the argument.)

Indeed, this is something to be vigilant about.  In fact, one could make this observation about pattern matching in its entirety!  Pattern matching is a feature that all compiler writers love, because compilers are mostly just big tree-transforming machines, and so *of course* compiler writers see it as a way to make their lives easier.  (Obviously other programmers besides compiler writers like it too.)

So, let me remind everyone why we are doing pattern matching, and it is not just “patterns are better than visitors.”  We may be doing this incrementally, but there’s a Big Picture motivation here, and I'll try to tell some more of it.

Due to trends in hardware and software development practices, programs are getting smaller.  It may be a cartoonish exaggeration to say that monoliths are being replaced by microservices, but the fact remains: units of deployment are getting smaller, because we’ve discovered that breaking things up into smaller units with fewer responsibilities offers us more flexibility.  Geometrically, when you shrink a region, the percentage of that region that is close to the boundary goes up.  

And so a natural consequence of this trend towards smaller deployment units is that more code is close to the boundary with the outside world, and will want to exchange data with the outside world.  In the Monolith days, “data from the outside world” was as likely as not to be a serialized Java object, but today, it is likely to be a blob of JSON or XML or YAML or a database result set.  And this data is at best untyped relative to Java’s type system.  (JSON has integers, but they’re not constrained to the range of Java’s int, etc.)  

At the boundary, we have to deal with all sorts of messy stuff: IO errors, bad data, etc.  But Java developers want to represent data using clean, statically typed objects with representational invariants.  In a Big Monolith, where most of the code is in the interior, it was slightly more tolerable to have big piles of conversion code at the boundary.  But when all the code lives a short hop from the boundary, our budget for adaptation to a more amenable format is smaller.  Records and sealed types let us define ad-hoc domain models; pattern matching lets us define polymorphism over those ad-hoc domain models, as well as more general ad-hoc polymorphism.  Together, records, sealed types, and pattern matching let us bridge the impedance mismatch between Java’s type system and messy stuff like JSON, at a cost we are all willing to pay.
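
To make the "ad-hoc domain model plus polymorphism over it" point concrete, here is a minimal sketch. It assumes Java 21 or later (where record patterns and pattern matching for switch are final features), and every name in it (BoundaryModel, LookupResult, Found, NotFound, Malformed, describe) is hypothetical, invented only for illustration.

```java
public class BoundaryModel {
    // Hypothetical ad-hoc domain model: the possible answers from some remote lookup.
    sealed interface LookupResult permits Found, NotFound, Malformed {}

    record Found(String name, int age) implements LookupResult {
        Found {
            // Representational invariant, checked once at construction.
            if (age < 0) throw new IllegalArgumentException("age must be non-negative");
        }
    }

    record NotFound(String id) implements LookupResult {}

    record Malformed(String reason) implements LookupResult {}

    // Polymorphism over the model, defined outside the model itself.
    // The switch needs no default: the sealed hierarchy makes it exhaustive.
    static String describe(LookupResult result) {
        return switch (result) {
            case Found(String name, int age) -> name + " (" + age + ")";
            case NotFound(String id) -> "no record for " + id;
            case Malformed(String reason) -> "bad data: " + reason;
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Found("Ada", 36))); // prints "Ada (36)"
        System.out.println(describe(new NotFound("42")));   // prints "no record for 42"
    }
}
```

If a fourth case were added to the sealed interface, the switch in describe would stop compiling until it handled that case, which is the kind of help a visitor normally costs much more ceremony to get.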

And it extends beyond simple patterns like the ones we’ve seen so far; the end goal of this exercise is to use pattern matching to decompose complex entities like JSON blobs in a compositional manner, essentially defining the data boundary in one swoop, like an adapter that is JSON-shaped on one side and Java-shaped on the other. (We obviously have a long way to go to get there.)
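
As a rough approximation of that adapter using only what is available today, the sketch below assumes Java 21 or later and a hypothetical, already-parsed JSON tree. The types Json, JStr, JNum, JObj, and Person and the method toPerson are made up for this example and do not come from any real library. Matching on map keys directly in a pattern is not something the language offers, so the lookups are ordinary method calls stitched together with record patterns.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.Optional;

public class JsonAdapter {
    // JSON-shaped side: a hypothetical tree produced by some parser.
    sealed interface Json permits JStr, JNum, JObj {}
    record JStr(String value) implements Json {}
    record JNum(BigDecimal value) implements Json {}
    record JObj(Map<String, Json> fields) implements Json {}

    // Java-shaped side: a clean, statically typed object.
    record Person(String name, int age) {}

    // The adapter: either the blob has the expected shape, or we get nothing.
    static Optional<Person> toPerson(Json json) {
        if (json instanceof JObj(var fields)
                && fields.get("name") instanceof JStr(String name)
                && fields.get("age") instanceof JNum(BigDecimal age)) {
            try {
                // JSON numbers are not limited to the range of int, so convert exactly.
                return Optional.of(new Person(name, age.intValueExact()));
            } catch (ArithmeticException e) {
                return Optional.empty(); // fractional or out-of-range number
            }
        }
        return Optional.empty(); // wrong shape, missing key, or wrong value type
    }

    public static void main(String[] args) {
        Json blob = new JObj(Map.of(
                "name", new JStr("Ada"),
                "age", new JNum(new BigDecimal("36"))));
        System.out.println(toPerson(blob).isPresent()); // prints "true"
    }
}
```

The shape check, the extraction, and the range check still live in separate steps here; the goal described above is for a single composed pattern to express all of that at once.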