Record pattern, the runtime side

Wed Mar 16 20:34:36 UTC 2022

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, March 16, 2022 5:41:49 PM
> Subject: Re: Record pattern, the runtime side

>> It works in 3 steps:
>> Step 1, at compile time, the compiler takes all the patterns and creates a tree
>> of pattern from the list of patterns,
>> pattern that starts with the same prefix are merged together.
> 
> We can "normalize" a complex pattern into a sequence of simpler
> conditionals.  For example, matching the record pattern
> 
>     case Circle(Point(var x, var y), var r)
> 
> can be unrolled (and type inference applied) as
> 
>     x matches Circle(Point(var x, var y), var r)
>     === x matches Circle(Point p, int r) && p matches Point(int x, int y)
> 
> Deconstruction patterns are known to have only an `instanceof`
> precondition; the deconstructor body won't ever fail (unlike more
> general static or instance patterns like Optional::of.)

If you define "matches" in term of instanceof this transformation does not work in the context of an assignment,
because you want
  Point(var x, var y) = null
to throw a NPE.

But it's a very valid transformation if the pattern is not total and "matches" means instanceof in the context of a switch or instanceof and requireNonNull + cast in the context of an assignment.

Also from the runtime POV, a deconstructor and a pattern methods (static or instance) are identical, if we follow the idea of John to use null for not match. Obviously, it does not preclude us to differentiate between the two at the language level.

> So we can further rewrite this as:
> 
>     x matches Circle(Point(var x, var y), var r)
>     === x matches Circle(Point p, int r) && p matches Point(int x, int y)
>     === x instanceof Circle c && c.deconstruct(Point p, int r) && p
> instanceof Point && p.deconstruct(int x, int y)
> 
> (where the "deconstruct" term invokes the deconstructor and binds the
> relevant values.)

With this rewrite, we are moving from higher world of patterns to the lower world of matcher (it's how i've called it), were the exact semantics of a pattern is decomposed into different method handles. It's at that stage that depending on if the pattern is total or not, an instanceof/type check is used or not.

> 
> If we are disciplined about the order in which we unroll (e.g., always
> depth-first and always left-to-right), with a modest amount of
> normalization, your "same pattern prefix" turns into the simpler "common
> prefix of normalized operations". 

Yes !

> Record deconstructors can be further normalized, because the can be replaced with calling the accessors:
> 
>     x matches Circle(Point(var x, var y), var r)
>     === x matches Circle(Point p, int r) && p matches Point(int x, int y)
>     === x instanceof Circle c && (Point p = c.center()) && (int r =
> c.radius()) && p instanceof Point
> && (int x = p.x()) && (int y = p.y())

I've unified the access to a record and the access to a carrier for that exact reason, so the translation for a deconstructor or a record is identical.
A carrier being an anonymous record or it's dual, a record is a named carrier.

> 
> Of course, this is all very implementation-centric; factoring matching
> this way is somewhat unusual, since the arity and order of side-effects
> might be surprising to Java developers.  (Yes, having side-effects in
> pattern definitions is bad, but it may still happen.)  So the spec will
> have to do some fast dancing to allow this.

yes, this will require pedagogy to explain why a pattern method is called once and not as many time as present in the source code.
But i believe we should not make the performance worst because few users may write pattern methods/deconstructors that side effect otherwise we will have people that rewrite switch to a cascade of if ... instanceof due to performance concern (as we have seen several examples of this in the past with people re-writing enhanced for loops to basic loops before escape analysis was added to the VM, or more recently with lambdas because before Java 17 each lambda was using its own metaspace).

> 
>> In the end, the tree of patterns is encoded in the bytecode as a tree of
>> constant dynamic (each Pattern is created only from constant and patterns).
> 
> With record patterns, we don't even need pattern descriptions, because
> we can translate it all down to instanceof tests and invoking record
> component accessors.  Of course, that ends when we have deconstruction
> patterns, which correspond to imperative code; then having a Pattern
> instantiation, and a way to get to its matching / binding-extraction
> MHs, is needed.

Yes, record pattern is the last pattern we can implement in term of a cascade of if. I use that fact in the prototype, the runtime use the switch + type pattern because it does not use invokedynamic.

For the future, i'm not sure we will want to use invokedynamic for all patterns, indy is still quite slow until c2 kicks in.

Rémi