Records -- current status

Tue Mar 20 13:57:13 UTC 2018

Hi all,

First time posting (though I've been silently following discussions for
quite some time).

As this is my very first message, I might be missing some important points
already discussed (my apologies if this is the case).

Please see my comments inline, thanks for letting the community participate
in these discussions.

Regards,
fps.-

2018-03-16 15:55 GMT-03:00 Brian Goetz <brian.goetz at oracle.com>:
>
> - Extension. The proposal outlines a notion of abstract record, which
> provides a "width subtyped" hierarchy. Some have questioned whether this
> carries its weight, especially given how Scala doesn't support case-to-case
> extension (some see this as a bug, others as an existence proof.) Records
> can implement interfaces.

In the linked document (
http://cr.openjdk.java.net/~briangoetz/amber/datum.html), it says that if a
concrete record extends an abstract record, the state vector of the
abstract record must be a prefix of the state vector of the concrete
record. An example:

    abstract record Base(int x, int y);

    record Sub(int x, int y, int z) extends Base(x, y);

So, my question is: why Base(x, y), and why not, e.g., Base(y, z),
Base(x, z), or even Base(y, x)?

Initially, as fields are referenced by name, I saw no reason why concrete
records should only be able to append new fields to the state vector.
However, when I tried to think of a concrete case where relaxing the
"prefix constraint" would be useful, I must admit that I couldn't find any
practical example... Anyway, even if only from a theoretical point of view,
it would be nice to know the rationale behind this constraint.

>
> - Concrete records are final. Relaxing this adds complexity to the
> equality story; I'm not seeing good reasons to do so.

Absolutely agree.

>
> - Additional constructors. I don't see any reason why additional
> constructors are problematic, especially if they are constrained to
> delegate to the default constructor (which in turn is made far simpler if
> there can be statements ahead of the this() call.) Users may find the lack
> of additional constructors to be an arbitrary limitation (and they'd
> probably be right.)

I also agree here. A typical use case would be additional constructors that
accept fewer fields and supply default values to the principal (default)
constructor.
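
To make the default-values use case concrete, here is a minimal sketch in
plain Java as it exists today, with a final class standing in for a record.
The names Server, host and port, and the default of 8080, are made up for
illustration; only the delegation pattern matters.

```java
// Plain final class emulating a record; Server, host and port are hypothetical names.
final class Server {
    final String host;
    final int port;

    // Principal constructor: the only place where invariants are checked.
    Server(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        this.host = host;
        this.port = port;
    }

    // Additional constructor: takes fewer arguments and supplies a default,
    // then delegates to the principal constructor.
    Server(String host) {
        this(host, 8080);
    }
}
```

Because the additional constructor can only delegate, every instance still
goes through the validation in the principal constructor.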

Besides this, I think that copy constructors are widely used (much more
than Cloneable and .clone()), which makes me think about 2 things:

1. What about automatically providing a copy constructor for records?

2. If providing a built-in copy constructor is not desired for all records
(after all, not every record needs to be copied), it would be very useful
to have a succinct way to request a copy constructor. For
example:

Instead of writing:

    record Point(int x, int y) {
        public Point(Point p) {
            default.this(p.x, p.y);
        }
    }

We could simply write:

    record Point(int x, int y) {
        public Point(Point p);
    }

Here the delegation to the principal constructor could be implemented by
the compiler. Besides, a copy constructor has no need to validate its
arguments, as they come from an already constructed record instance. The
fields of the source record (the one being copied) may still satisfy the
invariants imposed by the overridden default constructor (if one is
actually defined), or they may not, e.g. because some mutable field was
reassigned by a mutator method invoked after construction (more about this
below). Either way, this is irrelevant to the new record instance, and
there's no need to re-validate the fields against the invariants.

I think that the built-in implementation of copy constructors should
perform a shallow copy. If the programmer wants to implement a deep copy
mechanism, the copy constructor implementation could be explicitly provided:

    record Address(String street, String city, String country) { }

    record Person(String name, Address address) {
        public Person(Person p) {
            default.this(p.name, new Address(p.address));
        }
    }
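
The difference between the two copying strategies can be observed with a
rough emulation in plain Java classes. This is only a sketch under some
assumptions: street is made mutable on purpose so that sharing becomes
visible, and shallowCopy is a hypothetical stand-in for what a
compiler-generated copy constructor might do.

```java
// Plain classes emulating the records above.
final class Address {
    String street; // deliberately mutable, so that sharing is observable
    final String city;
    final String country;

    Address(String street, String city, String country) {
        this.street = street;
        this.city = city;
        this.country = country;
    }

    // Copy constructor.
    Address(Address other) {
        this(other.street, other.city, other.country);
    }
}

final class Person {
    final String name;
    final Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    // Deep copy, written by hand: the Address is copied as well.
    Person(Person p) {
        this(p.name, new Address(p.address));
    }

    // Shallow copy (hypothetical helper): the new Person shares the Address instance.
    static Person shallowCopy(Person p) {
        return new Person(p.name, p.address);
    }
}
```

After a shallow copy, a change made through original.address is visible
through the copy too; after a deep copy it is not.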

>
> - Static fields. Static fields seem harmless.

Yes, no problem with static fields.

>
> - Additional instance fields. These are a much bigger concern. While the
> primary arguments against them are of the "slippery slope" variety, I still
> have deep misgivings about supporting unrestricted non-principal instance
> fields, and I also haven't found a reasonable set of restrictions that
> makes this less risky. I'd like to keep looking for a better story here,
> before just caving on this, as I worry doing so will end up biting us in
> the back.

I think that additional fields can be very dangerous for the little benefit
they might bring to records. In fact, I don't see a real value for them.
Suppose you have:

    record Name(String first, String last) {
        public String firstAndLast() { return first + " " + last; }
    }

Why cache the result of first + " " + last in an additional firstAndLast
instance field? For records, it should be enough to expose derived state
via accessor methods. If you want to cache some value (because it's
expensive to calculate), you can always resort to classes, or could maybe
use another record. For the example above, you could have:

    record FullName(Name name, String fullName) {
        @Override
        public FullName(Name name, String fullName) {
            if (!name.firstAndLast().equals(fullName)) {
                throw new IllegalArgumentException();
            }
            default.this(name, name.firstAndLast());
        }

        public FullName(Name name) {
            // delegates to the overridden default constructor
            this(name, name.firstAndLast());
        }
    }

And use it in a decorator fashion:

    FullName someone = new FullName(new Name("John", "Doe"));

    String fullName = someone.fullName(); // John Doe
    String first = someone.name().first(); // John
    String last = someone.name().last(); // Doe

>
> - Mutability and accessibility. I'd like to propose an odd choice here,
> which is: fields are final and package (protected for abstract records) by
> default, but finality can be explicitly opted out of (non-final) and
> accessibility can be explicitly widened (public).

I love final fields by default; I think that finality should be encouraged
by the language. However, I find it very unpleasant to use a keyword (or a
reserved word, reserved type name, etc.) to change a field's default
finality. Maybe I'm asking for the impossible here, but isn't there a
mechanism, similar to the one for effectively-final variables, to detect
whether a final field is being mutated? Certainly, the compiler is able to
detect these cases for local variables, such as when a variable captured by
a lambda is reassigned.

My proposal is to use a similar detection mechanism. Instead of raising a
compilation error when a field is mutated, the compiler could internally
mark that field as non-final. Of course, fields could only be reassigned
from within explicitly declared mutator methods.

So, if this is at all possible, we wouldn't need to declare fields as
either non-final or final: the compiler would automatically figure it
out. In other words, the intention of the programmer to explicitly reassign
a field within a method of the record would make the field implicitly
non-final. (Disclaimer: please bear with me if I'm fantasizing too much
here, as I'm not aware of the internals of the compiler or whether this is
a viable choice. Anyways, just imagining this possibility has been fun).
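
For reference, this is the existing effectively-final analysis, shown for a
local variable in current Java. It is only a sketch of the behavior being
alluded to: the class and method names are made up, and the analysis shown
here is local to one method body, so whether something similar is feasible
for fields remains an open question.

```java
import java.util.function.IntSupplier;

// EffectivelyFinalDemo and capture are hypothetical names.
final class EffectivelyFinalDemo {
    static int capture() {
        int base = 40; // never reassigned, so the compiler treats it as effectively final
        IntSupplier plusTwo = () -> base + 2; // capturing it in a lambda is allowed
        // base = 41; // uncommenting this line turns the capture above into a compile error
        return plusTwo.getAsInt();
    }
}
```

Nothing in the declaration of base says "final"; the compiler infers it from
the absence of reassignments, which is the kind of inference the idea above
would rely on.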

...

With regard to accessibility, I think that all fields should be public.
Yes, public! But only for reading purposes (fields should only be
reassigned from within setter/mutator methods). If I'm getting the idea
correctly, the spirit of records is to be transparent. We don't need any
encapsulation here. So, why make fields package-private or protected, and
provide public, automatically built accessors? Instead, we should let the
information (and just the information) be always exposed. I'm well aware
that the Java world hates public fields, and that is OK in the context of
classes. But these are records, not classes; we're living in a totally
different world. Exposing the fields publicly (they'll be public via
deconstructors, after all) would show the clear intention that records are
all about transparent information, and that they sit on the opposite side
of encapsulation (aka information hiding). Simply put, records with public
fields enforce the concept of information transparency.

But we need to support the uniform access principle... Really? Do we? Why?
Records are a new feature. We will already be providing access to the
fields via deconstructors once pattern matching is available, so why should
we respect that principle here as well? This is just an opinion and I don't
want to stir up a polemic, but while the uniform access principle is one of
those principles that are beautiful in theory, it's worth noting that it
has been the source of nasty bugs in practice. I'm thinking specifically
about OR mapper frameworks. You invoke a simple getter and you get a query
that performs 12 joins in the background. And you'd better pray to the gods
if you were invoking that getter inside a loop... Do we really want to
support a principle that promotes this kind of behavior? Classes are
classes and records are records. They are very different from each other,
so why not make these differences more evident by changing the way fields
are accessed?

If you want to return a defensive copy of some field, you can always write
an explicit accessor that, e.g., uses the copy constructor (see above) to
return a field that is in turn a record of another type, or that returns an
unmodifiable map, etc. This exceptional behavior should be explicit IMO,
particularly if fields cannot be reassigned from outside of their enclosing
record.
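
As a rough illustration of such an explicit accessor, here is a sketch in
plain Java. Inventory and stock are hypothetical names, and the field is
private here only because today's Java has no "public for reading only"
fields; in the scheme described above, the explicit accessor would be the
opt-in exception to direct field access.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Inventory and stock are hypothetical names; a plain class stands in for a record.
final class Inventory {
    private final Map<String, Integer> stock;

    Inventory(Map<String, Integer> stock) {
        // Copy on the way in, so later changes to the caller's map are not visible here.
        this.stock = new HashMap<>(stock);
    }

    // Explicit accessor: returns a read-only view instead of the field itself.
    Map<String, Integer> stock() {
        return Collections.unmodifiableMap(stock);
    }
}
```

Callers can read the map freely, but any attempt to modify it through the
accessor fails with an UnsupportedOperationException.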

Last but not least, I think it's a de-facto pseudo-standard to use
fieldName() to return Optional<TypeOfField> for single instance fields. The
proposed fieldName() built-in accessors would collide with this
pseudo-standard.
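
A small sketch of the convention in question, in plain Java with made-up
names (Contact, name, nickname). Since Java does not allow two methods in
the same class that differ only in their return type, an automatically
generated "String nickname()" could not coexist with the hand-written
accessor shown here.

```java
import java.util.Optional;

// Contact, name and nickname are hypothetical names used only for illustration.
final class Contact {
    private final String name;
    private final String nickname; // may be null

    Contact(String name, String nickname) {
        this.name = name;
        this.nickname = nickname;
    }

    // Mandatory field: plain fieldName() accessor.
    String name() {
        return name;
    }

    // Field that may be absent: fieldName() returns an Optional instead of the raw value.
    Optional<String> nickname() {
        return Optional.ofNullable(nickname);
    }
}
```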

>
> - Accessors. Perhaps the most controversial aspect is that records are
> inherently transparent to read; if something wants to truly encapsulate
> state, it's not a record. Records will eventually have pattern
> deconstructors, which will expose their state, so we should go out of the
> gate with the equivalent. The obvious choice is to expose read accessors
> automatically. (These will not be named getXxx; we are not burning the
> ill-advised Javabean naming conventions into the language, no matter how
> much people think it already is.) The obvious naming choice for these
> accessors is fieldName(). No provision for write accessors; that's
> bring-your-own.

I think that this quite supports my previous arguments, except for the read
accessors. If we don't want to burn Javabean naming conventions into the
language, why should we burn any other convention into it?

Agree with explicit write accessors, with the caveat of my above proposal
to make reassigned fields effectively-non-final.

>
> - Core methods. Records will get equals, hashCode, and toString. There's
> a good argument for making equals/hashCode final (so they can't be
> explicitly redeclared); this gives us stronger preservation of the data
> invariants that allow us to safely and mechanically snapshot / serialize /
> marshal (we'd definitely want this if we ever allowed additional instance
> fields.) No reason to suppress override of toString, though.

Absolutely agree. If you want to provide a custom implementation for
equals/hashCode, just use a class. Besides, this opens the possibility of
hash code randomization for records, e.g. using a seed that depends on the
current run of the JVM as the base for calculating hash codes (something
akin to JDK 9's immutable sets and maps).
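
A minimal sketch of what such randomization could look like, written as a
plain class. SaltedPoint and SALT are hypothetical names, and this is not a
claim about how the JDK collections do it internally; it only shows that a
per-run seed can be mixed in without breaking the equals/hashCode contract
within a single run.

```java
import java.util.concurrent.ThreadLocalRandom;

// SaltedPoint and SALT are hypothetical; this only sketches the idea of a per-run seed.
final class SaltedPoint {
    // Chosen once when the class is initialized, so hash codes vary between JVM runs.
    private static final int SALT = ThreadLocalRandom.current().nextInt();

    final int x;
    final int y;

    SaltedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SaltedPoint)) {
            return false;
        }
        SaltedPoint other = (SaltedPoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // The salt is the same for every instance in this run, so equal
        // objects still produce equal hash codes.
        return 31 * (31 * SALT + x) + y;
    }
}
```

This only works if nobody relies on specific hash values across runs, which
is easier to guarantee when equals/hashCode cannot be redeclared.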

> Records could be safely made cloneable() with automatic support too (like
> arrays), but not clear if this is worth it (its darn useful for arrays,
> though.) I think the auto-generated getters should be final too; this
> leaves arrays as second-class components, but I am not sure that bothers me.

I've never used either the Cloneable interface or the .clone() method in
my life. Whenever I needed to create copies of an object, I just declared a
copy constructor and used it instead. I can only speak from my experience,
but I doubt .clone() and Cloneable are widely used. Besides Cloneable being
(seen as) broken, implementing a copy constructor is safer and easier than
implementing Cloneable and overriding the .clone() method (without
forgetting to make it public). If deep cloning is needed, it should be
done in the copy constructor.
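
To compare the two approaches side by side, here is a small sketch in plain
Java. Samples and values are hypothetical names. Note that the clone() shown
here stays shallow: since values is final, it cannot be replaced with a copy
after super.clone() returns, which is one of the usual complaints about
Cloneable.

```java
// Samples and values are hypothetical names used only for illustration.
final class Samples implements Cloneable {
    final int[] values;

    Samples(int[] values) {
        this.values = values.clone(); // defensive copy of the caller's array
    }

    // Copy constructor: no cast, no checked exception, and the array is
    // copied because it goes through the principal constructor.
    Samples(Samples other) {
        this(other.values);
    }

    // Cloneable route: must be widened to public and must deal with
    // CloneNotSupportedException.
    @Override
    public Samples clone() {
        try {
            // Object.clone() is shallow: the result shares the values array.
            return (Samples) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen, we implement Cloneable
        }
    }
}
```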