New pattern matching doc

Thu Jan 14 21:48:37 UTC 2021

I hear you on this; I have had some concerns about this too.  But also, 
after looking at typical client code of APIs like JSONP, getting good 
error messages there, while possible, is also pretty rare.  Far more 
common is not checking errors at all, and just assuming that the key is 
present, mapped to an integer, the result will surely parse with 
Integer.parseInt, etc.  In those cases you get an exception, whose stack 
trace might point you to the right line number, but you're not really 
getting validation there either.
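
To make that concrete, here is a minimal sketch of the unchecked style.  
A plain Map stands in for a parsed JSON object; the class and method 
names are made up for illustration and no real JSON library is 
involved.  The point is only that the failure modes are whatever 
exceptions the casts and parses happen to throw.

```java
import java.util.Map;

class UncheckedExtraction {
    // Hypothetical stand-in for a parsed JSON object: keys mapped to untyped values.
    // Assumes "port" is present and holds a String that parses as an int.
    // Nothing here checks either assumption.
    static int port(Map<String, Object> doc) {
        return Integer.parseInt((String) doc.get("port"));
    }

    public static void main(String[] args) {
        Map<String, Object> good = Map.of("port", "8080");
        System.out.println(port(good));

        // Key missing: get() returns null and parseInt rejects it.
        Map<String, Object> missing = Map.of("host", "example.org");
        try {
            port(missing);
        } catch (NumberFormatException e) {
            System.out.println("missing key surfaced as " + e.getClass().getSimpleName());
        }

        // Wrong type: the value is an Integer, so the cast to String fails.
        Map<String, Object> wrongType = Map.of("port", 8080);
        try {
            port(wrongType);
        } catch (ClassCastException e) {
            System.out.println("wrong type surfaced as " + e.getClass().getSimpleName());
        }
    }
}
```

Neither exception says which key was expected or what shape the 
document should have had; you get a line number and not much else.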

If you use an XPath-like API, you are more likely to get a sensible 
error (that the path didn't lead to what you expected), because you're 
basically handing a checkable schema to the library, but use of XPath in 
Java is rare and XPath-like stuff for JSON in Java is even more rare.

So while I worry that using complex patterns to extract lots of goop 
from a JSON document could turn into a debugging nightmare ("document 
failed to match"), I worry even more that what we do today isn't even 
correct (in addition to being painful), because the programming model 
rarely retains the information with which to check the result, and even 
when it does, it is often hard to get the checks and their order right, 
even with good intentions.  And if you do get it right, the code is a 
nightmare to 
read and maintain.

Stepping back, why am I trying to apply pattern matching here?  Not 
because it's cool (though it is).  It's because code at the boundary 
(which describes more and more code, as programs get smaller) has to 
deal with untyped, semi-structured or unstructured data from the outside 
world (JSON, XML, HTML, SQL result sets, etc.).  That leaves Java developers 
with a few choices:

  - Program in a bad, dynamically-typed dialect of Java, where 
everything is a String or List or Map or Object.  I think we can agree 
that we don't want to encourage this.
  - Use a schema-driven tool for translating from the external 
representation to Java types, such as JAXB, O/R mappers, etc.  This 
works OK when your schemas are under your control and stable, but 
doesn't deal well with schema changes (let alone "no schema"), and often 
has high performance costs (because of eager, expensive full 
translations of the data in each direction).
  - Use a parsing library, where you pick apart an input document in an 
ad-hoc manner to extract the bits you want.  This is pretty common, but 
is unpleasant, verbose, error-prone, and hard-to-maintain.  The number 
of error cases you have to handle scales with the number of navigation 
points and extractions.  80% of your code is paying tribute to the 
parsing library, and usually the output is still a relatively lightly 
typed bag of variables.

None of these choices is a happy one.  My observation here is that (a) 
parsing and pattern matching have a lot in common (structural test + 
conditional extraction) and (b) much of the pain associated with the 
third option comes from the lack of composition.  If we had a 
test-plus-conditional-extraction mechanism that composed cleanly, could 
take unstructured data as input, and produced clean, strongly typed Java 
variables as output, then this would be much more pleasant.
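
As a rough illustration of "test + conditional extraction that 
composes", here is a sketch that uses Java 21 record patterns as a 
stand-in.  This is an assumption on my part: the patterns under 
discussion would be declared by the parsing API itself, not limited to 
records, and the little document model below (Json, JStr, JNum, 
Address, Customer) is entirely hypothetical.

```java
import java.util.Optional;

class ComposedMatch {
    // Hypothetical, loosely typed document model: every field is just "some Json".
    sealed interface Json permits JStr, JNum, Address, Customer {}
    record JStr(String value) implements Json {}
    record JNum(double value) implements Json {}
    record Address(Json city, Json zip) implements Json {}
    record Customer(Json name, Json address) implements Json {}

    // One nested pattern does all the structural tests and, only if every
    // test passes, binds name, city and zip as ordinary typed variables.
    static Optional<String> label(Json doc) {
        if (doc instanceof Customer(JStr(var name), Address(JStr(var city), JNum(var zip)))) {
            return Optional.of(name + ", " + city + " " + (int) zip);
        }
        return Optional.empty();   // no information about which part failed
    }

    public static void main(String[] args) {
        Json ok = new Customer(new JStr("Ada"),
                               new Address(new JStr("London"), new JNum(12345)));
        Json bad = new Customer(new JStr("Ada"), new JStr("London"));
        System.out.println(label(ok));
        System.out.println(label(bad));
    }
}
```

The matching side is short and the bindings are strongly typed, but a 
non-matching document yields only an empty result, with nothing to say 
which nested test was the one that failed.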

I think your concern amounts to "well, that might be better than what we 
have now, but today's problems come from yesterday's solutions, so 
tomorrow's problem will probably be one of visibility into why something 
failed."  Right?  And, unlike exposing this as a parsing library, where 
you could pass in an error context that could accumulate debug 
information, there is no obvious clean way to do this in the language.
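
For comparison, the library-style approach with an error context might 
look something like the sketch below.  ErrorContext and intAt are 
hypothetical names, not taken from any real parsing library, and a 
plain Map again stands in for a parsed document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class ErrorContextSketch {
    // Hypothetical error context: accumulates one message per failed check.
    static final class ErrorContext {
        private final List<String> problems = new ArrayList<>();

        void report(String path, String problem) {
            problems.add(path + ": " + problem);
        }

        List<String> problems() {
            return List.copyOf(problems);
        }
    }

    // Hypothetical library call: extract an int at `key`, and on failure
    // record *why* in the caller-supplied context.
    static Optional<Integer> intAt(Map<String, Object> doc, String key, ErrorContext ctx) {
        Object v = doc.get(key);
        if (v == null) {
            ctx.report(key, "missing");
            return Optional.empty();
        }
        if (!(v instanceof Integer i)) {
            ctx.report(key, "expected integer, found " + v.getClass().getSimpleName());
            return Optional.empty();
        }
        return Optional.of(i);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("port", "8080", "retries", 3);
        ErrorContext ctx = new ErrorContext();
        System.out.println(intAt(doc, "retries", ctx));
        System.out.println(intAt(doc, "port", ctx));
        System.out.println(intAt(doc, "timeout", ctx));
        System.out.println(ctx.problems());
    }
}
```

The extra parameter is what carries the diagnostics out; a pattern 
match has no comparable side channel, which is the gap being described 
here.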

> The issue, as I see it, is that I'm not entirely sure if a failure to
> match in such a large nested structure is going to help me construct
> a usable _error message_. As you're certainly aware, about 80% of the
> code in any good compiler is devoted to giving error messages that are
> actually useful to users. If I get a parse error, for example, I want to
> know - down to the level of lines and columns - which part of the input
> failed to match expectations.
>
> Is matching a structure like that going to be able to provide useful
> error messages if input _doesn't_ match? It seems like it just provides
> a binary true/false answer. If it's the case that it won't actually
> help with giving useful error messages, then I think that reduces the
> applicability of patterns to this particular class of problems. It
> follows that it also might mean that the nice things we're putting on
> top (such as the composition of patterns) won't actually see practical
> use, because people end up writing very simple patterns with at most
> one level of nesting.
>
> Now, you know me, I'm the first to try to apply pattern matching and
> algebraic data types to any and every problem. I'm a little concerned
> about possible over-engineering though.
>
> [0]https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-object-model.md#a-possible-approach-for-parsing-apis
>