New pattern matching doc

Brian Goetz brian.goetz at oracle.com
Fri Jan 15 17:02:41 UTC 2021


Maurizio had a clever idea for how to rescue this issue.

First, let's recognize that there are a lot of different ways to use 
patterns.  There are situations when we just want to pick off a special 
case:

     if (m instanceof EnumMap) {
         // special implementation
     }
     else {
         // general implementation
     }

Here, there's no error message; if it's the special kind, do the special 
thing, otherwise do the general thing.

There are situations where we are expecting one of a number of choices:

     switch (message) {
         case HelloMessage(String username):
         case GoodbyeMessage(int sessionId):
         case PingMessage(String pingText):
         ...
     }

Here, the kind of error message we'd issue is more of a catch-all: 
"unrecognized message type FooMessage", and this is fine.

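As a rough sketch of that catch-all style in current Java (record patterns in switch, Java 21 or later): the sealed Message hierarchy below is assumed for illustration, with record components mirroring the cases above, and MessageDemo and describe are hypothetical names.

```java
class MessageDemo {
    // Hypothetical message hierarchy; only the component names come from the example above.
    sealed interface Message permits HelloMessage, GoodbyeMessage, PingMessage {}
    record HelloMessage(String username) implements Message {}
    record GoodbyeMessage(int sessionId) implements Message {}
    record PingMessage(String pingText) implements Message {}

    // Takes Object so that an unexpected type can actually reach the catch-all.
    static String describe(Object message) {
        return switch (message) {
            case HelloMessage(String username) -> "hello " + username;
            case GoodbyeMessage(int sessionId) -> "goodbye " + sessionId;
            case PingMessage(String pingText) -> "ping " + pingText;
            // Catch-all: the only diagnostic needed is the unexpected type.
            default -> throw new IllegalArgumentException(
                "unrecognized message type " + message.getClass().getSimpleName());
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new PingMessage("are you there?")));
    }
}
```

If the parameter were typed as Message instead of Object, the sealed hierarchy would make the switch exhaustive and the default branch could be dropped.
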
In the case you are worried about, where we have a deeply nested / 
chained all-or-nothing pattern, such as parsing a JSON document, that's 
going to bind a lot of stuff, there's an aspect of assertion; "I am 
assuming that this document has this (implicit) structure."  (That's 
certainly how most such code is written.)  So maybe what we're missing 
is another control construct, where we're asserting that all the 
conditional matches are expected to succeed, and if they don't, it's an 
error, and we should get an exception.  As in (not a serious syntax 
proposal):

     try-match (x : big-honking-pattern) {
         // happy-path code
     }
     catch (MatchFailException e) { // checked exception
         // exception captures which sub-pattern failed
     }

Essentially, this code is saying that we assert that the pattern 
matches, and it's an error not to.  This gives the compiler permission 
to turn match failure into an exception rather than an ordinary "it 
didn't match."
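
To make the intent concrete, here is one way to approximate it by hand in today's Java. This is only a sketch under assumptions: parsed JSON is modeled as nested Map/String/Integer values, and MatchFailException, expect, parseUser, and the path strings are all hypothetical names, not an existing or proposed API.

```java
import java.util.Map;

class TryMatchSketch {
    // Hypothetical exception; the text names MatchFailException but its shape is assumed here.
    static class MatchFailException extends Exception {
        final String path; // which sub-match failed
        MatchFailException(String path, String expected, Object actual) {
            super("match failed at " + path + ": expected " + expected + ", found "
                + (actual == null ? "nothing" : actual.getClass().getSimpleName()));
            this.path = path;
        }
    }

    record User(String name, int age) {}

    // One conditional match that is asserted to succeed: test, then extract or throw.
    static <T> T expect(Object value, Class<T> type, String path) throws MatchFailException {
        if (type.isInstance(value)) {
            return type.cast(value);
        }
        throw new MatchFailException(path, type.getSimpleName(), value);
    }

    // The "big pattern", spelled out step by step; every step must match.
    static User parseUser(Object doc) throws MatchFailException {
        Map<?, ?> root = expect(doc, Map.class, "$");
        Map<?, ?> user = expect(root.get("user"), Map.class, "$.user");
        String name = expect(user.get("name"), String.class, "$.user.name");
        Integer age = expect(user.get("age"), Integer.class, "$.user.age");
        return new User(name, age);
    }

    public static void main(String[] args) {
        try {
            System.out.println(parseUser(Map.of("user", Map.of("name", "ann", "age", 30))));
        } catch (MatchFailException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The hand-written version has to thread the path through every step; the point of a dedicated construct would be that the compiler could derive that information from the pattern itself.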




On 1/14/2021 4:48 PM, Brian Goetz wrote:
> I hear you on this, I have had some concerns about this too.  But 
> also, after looking at typical client code of APIs like JSONP, getting 
> good error messages there, while possible, is also pretty rare. Far 
> more common is not checking errors at all, and just assuming that the 
> key is present, mapped to an integer, the result will surely parse 
> with Integer.parseInt, etc.  In those cases you get an exception, 
> whose stack trace might point you to the right line number, but you're 
> not really getting validation there either.
>
> If you use an XPath-like API, you are more likely to get a sensible 
> error (that the path didn't lead to what you expected), because you're 
> basically handing a checkable schema to the library, but use of XPath 
> in Java is rare and XPath-like stuff for JSON in Java is even more rare.
>
> So while I worry that using complex patterns to extract lots of goop 
> from a JSON document could turn into a debugging nightmare ("document 
> failed to match"), I worry even more that what we do today isn't even 
> correct (in addition to being painful), because the programming model 
> rarely retains the information with which to check the result, and 
> even when it does, it is often hard to get the checks and their order right, 
> even with good intentions. And if you get it right, the code is a 
> nightmare to read and maintain.
>
> Stepping back, why am I trying to apply pattern matching here? Not 
> because it's cool (though it is).  It's because code at the boundary 
> (which describes more code, as programs get smaller) has to deal with 
> untyped, semi-structured or unstructured data from the outside world 
> (JSON, XML, HTML, SQL result sets, etc.)  That leaves Java developers 
> with a few choices:
>
>  - Program in a bad, dynamically-typed dialect of Java, where 
> everything is a String or List or Map or Object.  I think we can agree 
> that we don't want to encourage this.
>  - Use a schema-driven tool for translating from the external 
> representation to Java types, such as JAXB, O/R mappers, etc. This 
> works OK when your schemas are under your control and stable, but 
> doesn't deal well with schema changes (let alone "no schema"), and 
> often has high performance costs (because of eager, expensive full 
> translations of the data in each direction).
>  - Use a parsing library, where you pick apart an input document in an 
> ad-hoc manner to extract the bits you want.  This is pretty common, 
> but is unpleasant, verbose, error-prone, and hard-to-maintain.  The 
> number of error cases you have to handle scales with the number of 
> navigation points and extractions.  80% of your code is paying tribute 
> to the parsing library, and usually the output is still a relatively 
> lightly typed bag of variables.
>
> None of these choices are so happy.  My observation here is that (a) 
> parsing and pattern matching have a lot in common (structural test + 
> conditional extraction) and (b) much of the pain associated with the 
> third option comes from the lack of composition.  If we had a 
> test+conditional extract, that composed cleanly, which could deal with 
> unstructured data on input, and output clean strongly typed Java 
> variables, then this would be much more pleasant.
>
> I think your concern amounts to "well, that might be better than what 
> we have now, but today's problems come from yesterday's solutions, so 
> tomorrow's problem will probably be one of visibility into why something 
> failed."  Right?  And, unlike exposing this as a parsing library, 
> where you could pass in an error context that could accumulate debug 
> information, there is no obvious clean way to do this in the language.
>
>
>
>> The issue, as I see it, is that I'm not entirely sure if a failure to
>> match in such a large nested structure is going to help me construct
>> a usable _error message_. As you're certainly aware, about 80% of the
>> code in any good compiler is devoted to giving error messages that are
>> actually useful to users. If I get a parse error, for example, I want to
>> know - down to the level of lines and columns - which part of the input
>> failed to match expectations.
>>
>> Is matching a structure like that going to be able to provide useful
>> error messages if input _doesn't_ match? It seems like it just provides
>> a binary true/false answer. If it's the case that it won't actually
>> help with giving useful error messages, then I think that reduces the
>> applicability of patterns to this particular class of problems. It
>> follows that it also might mean that the nice things we're putting on
>> top (such as the composition of patterns) won't actually see practical
>> use, because people end up writing very simple patterns with at most
>> one level of nesting.
>>
>> Now, you know me, I'm the first to try to apply pattern matching and
>> algebraic data types to any and every problem. I'm a little concerned
>> about possible over-engineering though.
>>
>> [0]https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-object-model.md#a-possible-approach-for-parsing-apis 
>>
>>
>
