[External] : Re: Declared patterns -- translation and reflection

Wed Mar 30 10:11:16 UTC 2022

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, March 30, 2022 2:47:45 AM
> Subject: Re: [External] : Re: Declared patterns -- translation and reflection

>>> The mangling has to be stable across compilations with respect to any source-
>>> and binary-compatible changes to the pattern declaration. One mangling that
>>> works quite well is to use the "symbolic-freedom encoding" of the erasure of
>>> the pattern descriptor. Because the erasure of the descriptor is exactly as
>>> stable as any other method signature derived from source declarations, it will
>>> have the desired binary compatibility properties, overriding will work as
>>> expected, etc.
>> I think we need at least to use a special name like <deconstructor> the same way
>> we have <init>.

> Yes. Instance/static patterns will have names, so for them, we'll use the name
> as declared in the source. Dtors have no names, just like ctors, so we have to
> invent something to stand in for that. <dtor> or similar is fine.
Pattern methods (static or not) do not have a real name, so the '<' and '>' are there to signal that the name is in the Pattern attribute.
We do not want people to unmangle the names of pattern methods, which is why the name is in the attribute; using '<' and '>' signals that idea.

As a war story, most IDEs try to decode nested class names, trying to make sense of the names in between the '$', and tend to throw exceptions when they encounter classes that follow a different pattern than the one generated by javac.

But we may not care given that not a lot of people read the bytecode directly. 

I think John can help us here :) 

>> I agree that we also need to encode the method type descriptor (the carrier
>> type) into the name, so the name of the method in the classfile should be
>> <deconstructor+mangle> or <name+mangle> (or perhaps <pattern+name+mangle> for
>> the pattern methods).

> The key constraint is that the mangled name be stable with respect to compatible
> changes in the declaration. The rest is just "classfile syntax."
Yes.
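
To make the naming idea concrete, here is a rough sketch of deriving a classfile name from the erased pattern descriptor. It is only an illustration: the class and method names, the "<name:mangle>" layout, the "dtor" stand-in, and the example descriptors are all made up for this mail, and the escape table is modeled on the symbolic-freedom encoding as I remember it, so treat the exact pairs as an assumption.

```java
import java.util.Map;

public class PatternNameSketch {
    // Escape table modeled on the symbolic-freedom encoding; the exact
    // pairs are an assumption of this sketch, not a specified table.
    private static final Map<Character, String> ESCAPES = Map.of(
            '/', "\\|", '.', "\\,", ';', "\\?", '$', "\\%", '<', "\\^",
            '>', "\\_", '[', "\\{", ']', "\\}", ':', "\\!", '\\', "\\-");

    // Escape each character of the erased descriptor that appears in the
    // table; all other characters are copied through unchanged.
    static String mangle(String erasedDescriptor) {
        StringBuilder sb = new StringBuilder();
        for (char c : erasedDescriptor.toCharArray()) {
            String escape = ESCAPES.get(c);
            if (escape != null) {
                sb.append(escape);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical layout "<name:mangle>"; a null source name means a
    // deconstructor, for which "dtor" stands in.
    static String classfileName(String sourceName, String erasedDescriptor) {
        String base = (sourceName == null) ? "dtor" : sourceName;
        return "<" + base + ":" + mangle(erasedDescriptor) + ">";
    }

    public static void main(String[] args) {
        System.out.println(classfileName(null, "(ILjava/lang/String;)V"));
        System.out.println(classfileName("of", "(Ljava/lang/Object;)V"));
    }
}
```

Because only the erasure is mangled, a change that leaves the erased binding types alone (for example, refining a generic type argument) leaves the name alone, which is the stability property asked for above.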

>>> #### Return value

>>> In an earlier design, we used a pattern object (which was a bundle of method
>>> handles) as the return value of the pattern. This enabled clients to invoke
>>> these via condy and bind method handles into the constant pool for
>>> deconstruction and static patterns.

>>> Either way, we make use of some sort of carrier object to carry the bindings
>>> from the pattern to the client; either we return the carrier from the pattern
>>> method, or there is a method on the pattern object that we invoke to get a
>>> carrier. We have a few preferences about the carrier; we'd like to be able to
>>> late-bind to the actual implementation (i.e., we don't want to freeze the name
>>> of a carrier class in the method descriptor), and at least for records, we'd
>>> like to let the record instance itself be the carrier (since it is immutable
>>> and we can just invoke the accessors to get the bindings.)
>> So the return type is either Object (to hide the type of the carrier) or a
>> lambda that returns an Object (PatternObject or PatternCarrier acting like a
>> glorified lambda).

> If the pattern method actually runs the match, then I think Object is right. If
> the method returns a constant bundle of method handles, then it can return
> something like PatternHandle or a matcher lambda. But I am no longer seeing the
> benefit in this extra layer of indirection, given how the other translation
> work has played out.
I agree, Object is enough. 
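
As a sketch of what "returns Object" could look like for a record, where the record instance itself serves as the carrier: the method name pointDtor, the null-means-no-match convention, and the client code below are assumptions for illustration only. A real translation would presumably bind the accessor through indy/condy rather than a hand-written lookup.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class CarrierSketch {
    record Point(int x, int y) {}

    // Stand-in for the translated deconstructor of Point (hypothetical name).
    // Assumption: null means "no match"; on a match the record itself is
    // returned as the carrier, typed only as Object.
    static Object pointDtor(Object target) {
        return (target instanceof Point p) ? p : null;
    }

    // Client side: the binding "x" is read through a method handle, so the
    // carrier's class never appears in the pattern method's descriptor.
    static final MethodHandle X_ACCESSOR;
    static {
        try {
            X_ACCESSOR = MethodHandles.lookup()
                    .findVirtual(Point.class, "x", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Runs the match and extracts the first binding, or returns null when
    // the pattern does not match.
    static Integer matchX(Object target) {
        Object carrier = pointDtor(target);
        if (carrier == null) {
            return null;
        }
        try {
            return (int) X_ACCESSOR.invoke(carrier);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchX(new Point(3, 4)));
        System.out.println(matchX("not a point"));
    }
}
```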

>>> Pattern {
>>> u2 attr_name;
>>> u4 attr_length;
>>> u2 patternFlags; // bitmask
>>> u2 patternName; // index of UTF8 constant
>>> u2 patternDescr; // index of MethodType (or alternately UTF8) constant
>>> u2 attributes_count;
>>> attribute_info attributes[attributes_count];
>>> }

>>> This says that "this method is a pattern", reifies the name of the pattern
>>> (patternName), reifies the pattern descriptor (patternDescr) which encodes the
>>> types of the bindings as a method descriptor or MethodType, and has attributes
>>> which can carry annotations, parameter metadata, and signature metadata for the
>>> bindings. The existing attributes (e.g. Signature, ParameterNames, RVAA) can be
>>> reused as is, with the interpretation that this is the signature (or names, or
>>> annos) of the *bindings*, not the input parameters. Flags can carry things like
>>> "deconstructor pattern" or "partial pattern" as needed.
>> From the classfile POV, a constructor is a method with a funny name in between
>> brackets, I think deconstructor and pattern methods should work the same way.

> Be careful of extrapolating from one data point. Dtor are only one form of
> declared patterns; we also have to accommodate static and instance patterns.
See above; it's about signaling that the name is mangled.
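
For reference, the Pattern attribute layout quoted above could be read along these lines. This is a sketch under assumptions: the flag bit values and all Java names are invented here (the proposal only says flags "can carry" such things), and nested attributes are skipped rather than decoded.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PatternAttributeSketch {
    // Hypothetical flag bits; the values are not part of the proposal.
    static final int FLAG_DECONSTRUCTOR = 0x0001;
    static final int FLAG_PARTIAL = 0x0002;

    record PatternAttribute(int attrNameIndex, long attrLength, int patternFlags,
                            int patternNameIndex, int patternDescrIndex,
                            int attributesCount) {
        boolean isDeconstructor() { return (patternFlags & FLAG_DECONSTRUCTOR) != 0; }
        boolean isPartial() { return (patternFlags & FLAG_PARTIAL) != 0; }
    }

    // Reads the fields in the order of the quoted structure: u2, u4, then
    // four u2 values, then attributes_count nested attribute_info entries.
    static PatternAttribute parse(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int attrName = in.readUnsignedShort();
            long attrLength = Integer.toUnsignedLong(in.readInt());
            int flags = in.readUnsignedShort();
            int patternName = in.readUnsignedShort();   // index of UTF8 constant
            int patternDescr = in.readUnsignedShort();  // index of MethodType or UTF8 constant
            int count = in.readUnsignedShort();
            for (int i = 0; i < count; i++) {
                in.readUnsignedShort();                              // nested attribute name index
                long nestedLength = Integer.toUnsignedLong(in.readInt());
                in.skipNBytes(nestedLength);                         // nested payload is skipped
            }
            return new PatternAttribute(attrName, attrLength, flags,
                    patternName, patternDescr, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {0, 7, 0, 0, 0, 8, 0, 1, 0, 10, 0, 11, 0, 0};
        System.out.println(parse(sample));
    }
}
```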

Rémi