Revisiting field references

Tue Jun 4 20:50:57 UTC 2019

----- Original Message -----
> From: "Alan Malloy" <amalloy at google.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, June 4, 2019 00:19:10
> Subject: Revisiting field references

Hi Alan,
I'm sorry, but I have more questions than answers.

> Hello, amber-spec-experts. I understand that "field references" is an
> idea that was considered when other member references were being
> implemented, and it seems to have been a "well, maybe someday"
> feature: nothing fundamentally wrong with it, just not worth delaying
> method references for. Google is interested in reopening that
> discussion, and in working on the implementation if a satisfactory
> design can be found.

The problem with field references is not the implementation but the semantics. The question is whether a field reference is a way to expose the getter method reference (and the setter method reference), or whether it is more like a reified Property.

> 
> For the remainder of this message, we will use this class definition
> for examples:
> 
> public class Holder {
>    private int field;
> }
> 
> This class contains only an instance field, but everything in this
> document applies equally in the case of static fields, except of
> course that they can’t be bound and won’t expect a receiver argument.
> 
> Additionally, most of this document will assume that Holder::field is
> the syntax used for creating an unbound reference to that field. This
> feels very natural of course, but there is a section about the
> tradeoffs of reusing the :: token for fields.
> 
> Getter as Supplier
> 
> The most obvious thing you can do with a field is read it.
> Holder::field could be a ToIntFunction<Holder>, and this::field an
> IntSupplier (assuming this is a Holder). I suspect that a feature that
> does this and no more would actually cover a majority of use cases:
> most people who today want a field reference probably just want a
> shorter version of a lambda that reads a field. However, this by
> itself is not really a very compelling addition, simply because it
> doesn’t buy us much: the “workaround” of writing the lambda by hand is
> not very painful or error-prone, so permitting a reference instead,
> while nice, is not transformative. However, there are some other
> things we could do with field references, which may make the feature
> more worthwhile.

We already get that for free with a record.
If you write:
  record Holder(int field) {}

then these statements already compile:
  ToIntFunction<Holder> fun = Holder::field;
  IntSupplier fun2 = new Holder(42)::field;

because the generated getter has the same name as the field.

> 
> Setter as Consumer
> 
> The most obvious difference between fields and methods is that while
> there's only one thing to do with a method (invoke it), you can either
> read a field or write it. So, while we’ve already established that
> this::field could be an IntSupplier, it could, depending on context,
> instead be an IntConsumer, setting the field when invoked.
> Likewise Holder::field could be an ObjIntConsumer<Holder> instead of a
> ToIntFunction<Holder>.
> 
> This seems natural enough, but merits discussion instead of just being
> included in the feature because it is “obvious”. Setters are more
> complicated than getters. The most obvious complication is that they
> should be illegal if the field is final. More subtly, source code may
> become harder to understand when the expression this::field may mean
> two very different things, either a read or a write. The compiler
> should have enough type information in context to disambiguate, or
> give an appropriate diagnostic when a use site is ambiguous, e.g. due
> to overloading, but this information can be difficult for a developer
> to sort out manually, making every use site a debugging puzzle. They
> must figure out the target type of the reference to determine whether
> it is a read or a write.
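
To make the four shapes concrete, here is a minimal sketch of what each one has to be written as today, using hand-written lambdas. The accessor method names (unboundGetter, boundSetter, and so on) are made up for illustration; only Holder and its field come from the message.

```java
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;
import java.util.function.ObjIntConsumer;
import java.util.function.ToIntFunction;

// Holder mirrors the example class from the message. The accessor methods are
// hypothetical helpers: no field-reference syntax exists today, so each one is
// a plain lambda.
class Holder {
    private int field;

    // Unbound read: what Holder::field would mean as a ToIntFunction<Holder>.
    static ToIntFunction<Holder> unboundGetter() {
        return h -> h.field;
    }

    // Unbound write: what Holder::field would mean as an ObjIntConsumer<Holder>.
    static ObjIntConsumer<Holder> unboundSetter() {
        return (h, v) -> h.field = v;
    }

    // Bound read: what this::field would mean as an IntSupplier.
    IntSupplier boundGetter() {
        return () -> this.field;
    }

    // Bound write: what this::field would mean as an IntConsumer.
    IntConsumer boundSetter() {
        return v -> this.field = v;
    }
}

class FieldAccessShapesDemo {
    public static void main(String[] args) {
        Holder h = new Holder();
        Holder.unboundSetter().accept(h, 7);
        System.out.println(Holder.unboundGetter().applyAsInt(h)); // prints 7
        h.boundSetter().accept(9);
        System.out.println(h.boundGetter().getAsInt());           // prints 9
    }
}
```

With lambdas the read and the write look different at the use site; with a single this::field spelling, only the target type would distinguish them.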
> 
> Increased Transparency
> 
> Another appealing thing to do with a field reference is to make it
> more transparent than a simple lambda. We could have some sort of
> FieldReference object describing the class in which the field lives,
> the name and type of the field, and which object (if any) is bound as
> the receiver. This FieldReference object would expose get, and
> possibly set, methods for the referred-to field. Of course this looks
> a lot like java.lang.reflect.Field; but instead of one final class
> using reflection to handle all fields of all classes, we can use the
> lambda meta-factory (or something like it) to generate specialized
> subclasses, which can conveniently be bound to receivers as well as
> being faster.

That is the Property I was talking about above, but don't we already have VarHandle for that?
A VarHandle can be initialized using a lazy static final field (see issue 8209964)
or by compiler intrinsics (JEP 348).
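
As a rough sketch of the VarHandle route with what is available in Java 9 and later: the handle below is initialized eagerly in a static initializer, not through the lazy static final or intrinsic mechanisms mentioned above. The class name HolderWithHandle and the constant FIELD are illustrative only.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical variant of Holder that publishes a VarHandle for its field.
class HolderWithHandle {
    private int field;

    // Looked up inside the class itself, so the private field is accessible.
    static final VarHandle FIELD;
    static {
        try {
            FIELD = MethodHandles.lookup()
                    .findVarHandle(HolderWithHandle.class, "field", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
}

class VarHandleDemo {
    public static void main(String[] args) {
        HolderWithHandle h = new HolderWithHandle();
        // One object covers both the write and the read.
        HolderWithHandle.FIELD.set(h, 42);
        int value = (int) HolderWithHandle.FIELD.get(h);
        System.out.println(value); // prints 42
    }
}
```

The handle is always unbound: the receiver is passed at each call, which is the same shape as the unbound FieldReference described in the quoted text.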

> 
> An advantage of supporting this is that it could enable libraries that
> currently accept lambdas to generate more efficient code. For example,
> consider
> 
> Comparator<Animal> c =
>  Comparator.comparing((Animal a) -> a.name)
>            .thenComparing(a -> a.species)
>            .thenComparingInt(a -> a.mass);
> 
> A perfectly reasonable Comparator, and much more readable than a nest
> of if-conditions written by hand. But if used in a tight loop to
> compare many animals, this is quite expensive compared to the
> hand-written version, because each comparison may dispatch through
> many lambdas, and this is not easy for the JIT to inline. If we really
> wanted to allow Comparator combinators to be used in
> performance-sensitive situations, Comparator could have an optimize()
> method that attempts to generate bytecode for an efficient comparator
> in a way similar to what the lambda meta-factory does.

The issue with that code is that C2 thinks it is a recursive call and stops inlining, but I see this more as a limitation of the way C2 currently works.

Anyway, yes, that is the main difference between the two semantics: either you have method references or you have a full-blown reified object. But it can also be seen as a limitation of the current method references, which cannot be viewed as an expression tree the way they can in C#. In that case, the user can choose which representation they want.
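
For reference, here is a small sketch that puts the combinator chain next to the hand-written comparator it is being measured against. The Animal class is assumed (the message only names its name, species and mass fields), and the first lambda is given an explicit parameter type because inference does not flow backwards through the chained calls.

```java
import java.util.Comparator;

// Assumed shape of Animal; only the three field names come from the message.
class Animal {
    final String name;
    final String species;
    final int mass;

    Animal(String name, String species, int mass) {
        this.name = name;
        this.species = species;
        this.mass = mass;
    }
}

class AnimalComparators {
    // Combinator version: each comparison may dispatch through several lambdas.
    static final Comparator<Animal> COMBINED =
            Comparator.comparing((Animal a) -> a.name)
                      .thenComparing(a -> a.species)
                      .thenComparingInt(a -> a.mass);

    // Hand-written version: three direct field reads and three if-style steps.
    static final Comparator<Animal> HAND_WRITTEN = (a, b) -> {
        int c = a.name.compareTo(b.name);
        if (c != 0) return c;
        c = a.species.compareTo(b.species);
        if (c != 0) return c;
        return Integer.compare(a.mass, b.mass);
    };
}
```

Both comparators define the same ordering; the difference is only in how many call sites the JIT has to see through.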

> 
> Even without field references, that optimize() method could eliminate
> some lambda calls: instead of a chain of lambdas for each
> .thenComparing call, it could be unrolled into 3 if statements. But
> we’d still have 3 lambdas left, to compute the values to compare to
> each other. If we could pass in a field reference, the optimize()
> method could introspect on those, allowing it to emit getfield
> bytecodes directly, saving more indirection, and resulting in the same
> bytecode you could get by writing this comparator by hand.
> 
> I hope it goes without saying that I am not proposing to actually
> implement Comparator.optimize any time soon: it’s just a convenient,
> well-known example of the kind of library that could be gradually
> improved by promoting field references from “sugar for a lambda” to
> reified objects.
> 
> Note that if we reify field references, there will surely be some
> people who ask, “why not method references?” I think it is much more
> difficult to do this, because methods can be overloaded. Which
> overload of String::valueOf did you want to reify as a MethodReference
> object? When we use these as lambdas, context can give us a hint; when
> crystalizing them as a descriptor object we will have no context. So,
> there seems to me to be good reason to push back against this request,
> but it is a choice we should make deliberately.
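
A small illustration of that point: the same String::valueOf text resolves to three different overloads depending only on the target type. The class and constant names are made up for the example.

```java
import java.util.function.Function;
import java.util.function.IntFunction;

class ValueOfOverloads {
    // Target type Function<Object, String> selects String.valueOf(Object).
    static final Function<Object, String> FROM_OBJECT = String::valueOf;

    // Target type IntFunction<String> selects String.valueOf(int).
    static final IntFunction<String> FROM_INT = String::valueOf;

    // Target type Function<char[], String> selects String.valueOf(char[]).
    static final Function<char[], String> FROM_CHARS = String::valueOf;

    public static void main(String[] args) {
        char[] chars = { 'h', 'i' };
        System.out.println(FROM_INT.apply(65));      // prints 65
        System.out.println(FROM_CHARS.apply(chars)); // prints hi
        // Goes through Object.toString(), so this prints the array's identity
        // string rather than its contents.
        System.out.println(FROM_OBJECT.apply(chars));
    }
}
```

Without a target type there is nothing to pick between these, which is the difficulty a context-free MethodReference object would face.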
> 
> Annotation parameters
> 
> Last, if we had such a FieldRef descriptor, we might like to be able
> to use them as annotation parameters, making it possible to be more
> formal about annotations like
> 
> class Stream {
>  private Lock lock;
>  @GuardedBy(Stream::lock) // next() only called while holding lock
>  public int next() {...}
> }
> 
> Probably this would mean having FieldReference implement Constable, so
> that Holder::field could be put in the constant pool, along with other
> annotation parameters. This also suggests that a FieldReference object
> should not directly store the bound receiver, since that could not be
> put in the constant pool; instead we would want a FieldReference to
> always be unbound, and then some sort of decorator or wrapper that
> holds a FieldReference tied to a captured receiver.

We still cannot use method references (even unbound ones) in annotations.

> 
> Open Questions
> 
> The first set of questions is: are these all reasonable, useful
> features? Am I missing any pitfalls that they imply?
> 
> One looming design question is unfortunately syntax: is Foo::x really
> the best syntax? It's very natural, but it will be ambiguous if Foo
> also has a method named x. To preserve backwards compatibility with
> code written before the introduction of field references, we would
> obviously need to resolve this ambiguity in favor of any applicable
> method reference over any applicable field reference. It would surely
> be too extreme to say that it's impossible to get a field reference
> when a method with the same name exists. So if you really want the
> field reference in a context like this, we could introduce some
> alternate syntax to clarify that: Foo:::x, or Foo..x, for example: the
> details don't have to be sorted out at this time, as much as we need
> to decide whether to use any new token at all or just reuse the ::
> token.
> 
> But this tie-breaker strategy has a problem: it solves backwards
> compatibility, while leaving a subtle forward-compatibility pitfall.
> Holder::field currently resolves to a field reference, but suppose in
> the future someone adds a method with the same name. As discussed
> before, we must resolve conflicts in favor of methods, and so
> Holder::field suddenly becomes a method reference next time you
> compile the client code. Now class authors can change which member is
> being accessed by adding a new member, which seems dangerous. But
> maybe it's fine - adding new overloads of an existing method can
> already do that, if clients were relying on autoboxing or other type
> coercions.
> 
> We could avoid the difficulty by having no syntactic overlap between
> field and method references: Holder::toString for methods only,
> Holder:::field for fields only. That's unlikely to be popular, and
> indeed it is a bit ugly. Is it better to accept the small danger of
> ambiguity?
> 
> Finally, if anyone has implementation tips I would be happy to hear
> them. I am pretty new to javac, and while I've thrown together an
> implementation that desugars field references into getter lambdas, it’s
> far from a finished feature, and I’m sure what I’ve already done
> wasn't done the best way. Finding all the places that would need to
> change is no small task.

I will just add that there is also no syntax for an array element reference, which is something you get with a VarHandle.
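
A minimal sketch of the array case, using MethodHandles.arrayElementVarHandle from Java 9 and later; the class and constant names are illustrative only.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class ArrayElementDemo {
    // A handle over the elements of any int[]; the array and the index are
    // supplied at each access.
    static final VarHandle INT_ELEMENT =
            MethodHandles.arrayElementVarHandle(int[].class);

    public static void main(String[] args) {
        int[] data = new int[3];
        INT_ELEMENT.set(data, 1, 42);
        int value = (int) INT_ELEMENT.get(data, 1);
        System.out.println(value); // prints 42
    }
}
```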

Rémi