Deconstructor reflection Was: Re: Deconstruction patterns

Tue Mar 7 19:51:59 UTC 2023

Some questions about the reflective design inline below.

>
>
> #### Carriers
>
> Because the matcher method implements the matcher behavior, but a matcher
> may "return" multiple bindings (or failure), we must encode the bindings
> in some way.  For this, we use a _carrier object_.  The choice of carrier
> is largely a footprint/specificity tradeoff.  One could imagine a carrier
> class per matcher, or a carrier class per matcher descriptor, or using
> `Object[]` as a carrier for everything, or caching some number of common
> shapes (e.g., three ints and two refs).  This sort of tuning should be
> separate from the protocol encoded in the bytecode of the pattern method
> and its clients.
>
> We use a small _carrier runtime_ to decouple pattern translation from
> carrier selection.  (This same carrier runtime is used by string templates
> as well.)  This allows tradeoffs in runtime characteristics (e.g., carrier
> per matcher vs. sharing carriers across matchers, dropping carrier identity
> with value types later, etc.) without affecting the translation.  The
> carrier API consists of condy bootstraps like:
>
> ```
> static MethodHandle carrierFactory(MethodType matcherDescriptor) { ... }
> static MethodHandle carrierAccessor(MethodType matcherDescriptor, int bindingNo) { ... }
> ```
>
> The `matcherDescriptor` is a `MethodType` describing the binding types.
> The `carrierFactory` method returns a method handle which takes the
> bindings and produces a carrier object; the `carrierAccessor` method
> returns method handles that take the carrier object and return the
> corresponding binding.  To indicate success, the matcher method invokes the
> carrier factory method handle and returns the result; to indicate failure
> (deconstructors cannot fail, but other matchers can) the matcher method
> returns null.
>
> We would translate the XY deconstructor from `Point` as follows
> (pseudo-code):
>
> ```
> #100: MethodType[(II)V]
> #101: Condy[bsm=Carriers::carrierFactory, args=[#100]]
>
> final synthetic Object Point$MANGLE() {
>      ldc #101          // carrier factory MH; the receiver is pushed first
>      aload_0
>      getfield Point::x
>      aload_0
>      getfield Point::y
>      invokevirtual MethodHandle::invoke(II)Ljava/lang/Object;
>      areturn
> }
> ```
>
> Constant `#100` contains a `MethodType` holding the binding descriptor;
> constant `#101` holds a method handle whose parameters are the parameter
> types of the binding descriptor and which returns `Object`.
>
> At the use site, matching a deconstruction pattern is performed by invoking
> the matcher method on the appropriate target object, and then extracting
> the components with the carrier accessor method handles if the match is
> successful.  (Deconstructors are total, so are always successful, but for
> other patterns, null is returned from the matcher method on failure to
> match.)
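
The use-site protocol can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `UseSiteSketch`, `matchXY`, `matchEven`, `describe`, and `half` are hypothetical names, the matcher methods are hand-written rather than compiler-generated, and the carrier is a bare `Object[]` read directly instead of through carrier accessor method handles.

```java
public class UseSiteSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        // Stand-in for the synthetic matcher method of a total deconstructor:
        // it always returns a non-null carrier.
        Object matchXY() { return new Object[] { x, y }; }
    }

    // Stand-in for a partial matcher: null signals "no match".
    static Object matchEven(int n) {
        return (n % 2 == 0) ? new Object[] { n / 2 } : null;
    }

    // Roughly the shape a use site of a Point deconstruction pattern might take.
    static String describe(Object target) {
        if (!(target instanceof Point)) return "no match";
        Object carrier = ((Point) target).matchXY();
        if (carrier == null) return "no match";   // never taken for a total matcher
        Object[] bindings = (Object[]) carrier;   // real code would use accessor MHs
        int x = (Integer) bindings[0];
        int y = (Integer) bindings[1];
        return "Point(" + x + ", " + y + ")";
    }

    // Use site for the partial matcher: the null check is what decides the match.
    static String half(int n) {
        Object carrier = matchEven(n);
        if (carrier == null) return "odd";
        return "half=" + ((Object[]) carrier)[0];
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(half(4));
        System.out.println(half(3));
    }
}
```
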
>
> #### Method names
>
> The name of the matcher method is mangled to support overloading.  The JVM
> permits overloading on parameter types, but not return types (and
> overloaded matchers are effectively overloaded on return types).  We take
> the approach of encoding the erasure of the matcher descriptor in the name
> of the pattern.  This has several desirable properties: it is stable (the
> name is derived solely from stable aspects of the declaration); matchers
> with override-equivalent signatures (deconstructors can't be overridden,
> but other patterns can be) map to true overrides in the translation; and
> valid overloads of matchers will always have distinct names.
>
> We use the ["Symbolic Freedom"]() encoding of the erasure of the matcher
> descriptor as the mangled disambiguator, which is exactly as stable as any
> other method descriptor derived from source declarations.
>
> #### Attributes
>
> Because patterns are methods, we can take advantage of all the affordances
> of methods.  We can use access bits to control accessibility; we can use
> the attributes that carry annotations, method parameter metadata, and
> generics signatures to carry information about the pattern declaration (and
> its (input) parameters, when we get to those kinds of matchers).  What's
> missing is the fact that this is a pattern implementation and not an
> ordinary method, and a place to put metadata for bindings.  To address the
> first, we can add the following attribute on matcher methods:
>
>      Matcher {
>          u2 name;                            // "Matcher"
>          u4 length;
>          u2 patternFlags;
>          u2 patternName;                     // UTF8
>          u2 patternDescr;                    // MethodType
>          u2 attributes_count;
>          attribute_info attributes[attributes_count];
>      }
>
> This says that "this method is a pattern".  The source name of the pattern
> declaration is reified as `patternName`, and the matcher descriptor, which
> encodes the types of the bindings, is reified as a `MethodType` in
> `patternDescr`.  The `patternFlags` word can carry matcher-specific
> information such as "this matcher is a deconstructor" or "this matcher is
> total".
>
> A matcher method may have the usual variety of method attributes, such as
> `RuntimeInvisibleAnnotations` for annotations on the matcher declaration
> itself.
>
> If we wish to encode information about the matcher _bindings_, we do so
> with attributes inside the `Matcher` attribute itself.  Attributes such as
> `Signature`, `ParameterNames`, `RuntimeVisibleParameterAnnotations`, etc.,
> can appear in a `Matcher` and are interpreted relative to the matcher
> signature or descriptor.  So if we had a matcher:
>
> ```
> matcher Foo(@Bar List<String> list) { ... }
> ```
>
> then the `Matcher` would contain the signature attribute corresponding to
> `(List<String>)` and a `RuntimeXxxParameterAnnotations` attribute
> describing the `@Bar` annotation on the first "parameter".
>
> #### Reflection
>
> Since matchers are a new kind of class member, they will need a new kind of
> reflective object, and a method that is analogous to
> `Class::getConstructors`.  The reflective object should extend
> `Executable`, as all of the existing methods on `Executable` make sense for
> patterns (using `Object` as the return type).  If the pattern is
> reflectively invoked, it returns `null` for no match, or an `Object[]`
> which is the boxing of the values in the carrier.
>
>
This surprised me slightly, and I'm not sure I follow the reasoning for why
the return value would be boxed and collected into an Object[].

The design says matcher methods return Object as an opaque descriptor for
the actual implementation carrier object.  Yet, reflection will take that
carrier object, and replace it with an equivalent Object[] resulting in
more allocations and (potentially) boxing on the return path.

I get that it's a developer-friendly approach, but I wonder if it encourages
the wrong mental model about what matchers return?

Would it make sense to add an extra step in that process so the
java.lang.reflect.Matcher instance has an `Object[] resultToArray(Object)`
method?  It's more ceremony (yuck) but allows avoiding the array creation
and maybe the boxing on the return path for allocation-sensitive callers?

I may be overly concerned with the array creation on the return path due to
historical work with MethodHandles which sought to remove the equivalent
argument boxing / array creation from Method::invoke calls.
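
To make the two-step idea concrete, here is a rough sketch. Everything in it is hypothetical: the nested `Matcher` class is only a stand-in for the proposed `java.lang.reflect.Matcher`, `IntIntCarrier` is an invented carrier shape, and the matcher body is passed in as a `Function` instead of being a real matcher method. The point is just that `invoke` hands back the opaque carrier, and the array is only built if the caller asks for it.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.function.Function;

public class ResultToArraySketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical carrier shape for two int bindings.
    static final class IntIntCarrier {
        final int first, second;
        IntIntCarrier(int first, int second) { this.first = first; this.second = second; }
    }

    // Hypothetical stand-in for the reflective matcher object.
    static final class Matcher {
        private final Function<Object, Object> matcherMethod;
        private final MethodHandle[] accessors;   // one per binding: carrier -> binding

        Matcher(Function<Object, Object> matcherMethod, MethodHandle... accessors) {
            this.matcherMethod = matcherMethod;
            this.accessors = accessors;
        }

        // Step 1: return the opaque carrier as-is (null means no match).
        Object invoke(Object target) { return matcherMethod.apply(target); }

        // Step 2 (optional): box the bindings into an Object[] on request.
        Object[] resultToArray(Object carrier) {
            if (carrier == null) return null;
            Object[] out = new Object[accessors.length];
            try {
                for (int i = 0; i < accessors.length; i++) {
                    out[i] = accessors[i].invoke(carrier);   // boxing happens here
                }
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
            return out;
        }
    }

    static Object[] demo(int x, int y) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Matcher m = new Matcher(
                    t -> { Point p = (Point) t; return new IntIntCarrier(p.x, p.y); },
                    l.findGetter(IntIntCarrier.class, "first", int.class),
                    l.findGetter(IntIntCarrier.class, "second", int.class));
            Object carrier = m.invoke(new Point(x, y));   // no array, no boxing yet
            return m.resultToArray(carrier);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An allocation-sensitive caller would stop after `invoke` and read bindings through whatever accessors the reflective object exposes; only convenience callers would pay for `resultToArray`.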

And speaking of MethodHandles, will there be new
MethodHandles.Lookup.findMatcher, findDeconstructor, etc. methods?  Or do
you see them being looked up with the existing find* methods?
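
For comparison, this is roughly what lookup through the existing `findVirtual` would look like, assuming the matcher method is visible as an ordinary method. The `Point` class and its hand-written `Point$MANGLE` method are stand-ins (the name is copied from the pseudo-code above, not a real mangling), and `matchViaFindVirtual` is a hypothetical helper. Note that the caller has to know the mangled name and gets back a handle returning the opaque carrier.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FindMatcherSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        // Hand-written stand-in for the synthetic matcher method.
        // Assumption: an Object[] plays the role of the carrier.
        final Object Point$MANGLE() { return new Object[] { x, y }; }
    }

    // Looks the matcher method up by its (mangled) name and ()Object type,
    // then invokes it on a fresh Point.
    static Object matchViaFindVirtual(int x, int y) {
        try {
            MethodHandle mh = MethodHandles.lookup().findVirtual(
                    Point.class, "Point$MANGLE", MethodType.methodType(Object.class));
            return mh.invoke(new Point(x, y));
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```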

> We will then need some additional methods to describe the bindings, so the
> subtype of `Executable` has methods like `getBindings`,
> `getAnnotatedBindings`, `getGenericBindings`, `isDeconstructor`,
> `isPartial`, etc.  These methods will decode the `Matcher` attribute and
> its embedded attributes.
>

What does `getBindings` return?  The MethodType describing the bindings?  A
Class[] describing the types of the bindings?  Something else?
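
For what it's worth, the first two options are interconvertible with existing `MethodType` API, as the small sketch below shows. The class and method names are hypothetical and only illustrate the two candidate return shapes; the `void` return type follows the `(II)V` descriptor in the pseudo-code.

```java
import java.lang.invoke.MethodType;
import java.util.Arrays;

public class BindingsShapeSketch {
    // Option 1: expose the matcher descriptor itself.
    static MethodType bindingsAsMethodType() {
        return MethodType.methodType(void.class, int.class, int.class);
    }

    // Option 2: expose only the binding types, in the style of
    // Executable::getParameterTypes.
    static Class<?>[] bindingsAsClassArray(MethodType descriptor) {
        return descriptor.parameterArray();
    }

    // Going back from a Class[] to a descriptor needs a return type to be chosen.
    static MethodType toDescriptor(Class<?>[] bindings) {
        return MethodType.methodType(void.class, bindings);
    }

    public static void main(String[] args) {
        MethodType descriptor = bindingsAsMethodType();
        Class<?>[] types = bindingsAsClassArray(descriptor);
        System.out.println(descriptor + " <-> " + Arrays.toString(types));
    }
}
```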

--Dan

>
> ## Summary
>
> This design borrows from previous rounds, but makes a number of
> simplifications.
>
>   - The bindings of a pattern are captured in a `MethodType`, called the
>     _matcher descriptor_.  The parameters of the matcher descriptor are the
>     types of the bindings; the return type is either `V` or the minimal
>     type that will match (but is not as important as the bindings).
>   - Matchers are translated as methods whose names are derived
>     deterministically from the name of the matcher and the erasure of the
>     pattern descriptor.  These are called _matcher methods_.  Matcher
>     methods take as parameters the input parameters of the pattern (if
>     any), and return `Object`.
>   - The returned object is an opaque carrier.  Null means the pattern
>     didn't match.  A non-null value is the carrier type (from the carrier
>     runtime) which is derived from the pattern descriptor.
>   - Matcher methods are not directly invocable from the source language;
>     they are invoked indirectly through pattern matching or reflection.
>   - Generated code invokes the matcher method and interprets the returned
>     value according to the protocol, using MHs from the carrier runtime to
>     access the bindings.
>   - Matcher methods have a `Matcher` attribute, which captures information
>     about the matcher as a whole (whether it is total or partial, a
>     deconstructor, etc.) and parameter-related attributes which describe
>     the bindings.
>   - Matchers are reflected through a new subtype of `Executable`, which
>     exposes new methods to reflect over bindings.
>   - When invoking a matcher reflectively, the carrier is boxed to an
>     `Object[]`.
>
>
>
>