value type hygiene

Tue May 15 00:57:01 UTC 2018

On May 11, 2018, at 10:48 AM, Dan Smith <daniel.smith at oracle.com> wrote:
> 
>> On May 10, 2018, at 7:36 PM, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> There could be an interface default method, or some other method,
>> which is simultaneously a member of two trees with two different
>> decisions about scalarization vs. buffering.  This can be handled
>> by having the JVM create multiple adapters.  I'd rather forbid the
>> condition that requires multiple adapters as a CLC violation,
>> because it is potentially complex and buggy, and it's not clear
>> we need this level of service from the JVM.
> 
> Use case:
> 
> 
> ---
> 
> Library code
> 
> interface Event {
>     LocalDateTime timestamp();
> }
> 
> ---
> 
> Client 1 code, compiled with reference class LocalDateTime
> 
> class Client1Event implements Event { ... }
> 
> ---
> 
> Client 2 code, compiled with value class LocalDateTime
> 
> class Client2Event implements Event { ... }
> 
> ---
> 
> Various circumstances can lead to two different clients running on a single JVM, including, say, dependencies between different libraries.
> 
> Am I understanding correctly that you would consider loading/invoking these methods to be a JVM error?

(Do you mean "loading these classes"?)

Let's assume, because it is most likely, that the library interface Event
is updated to know that LocalDateTime (LDT) is a value type (VT).

Client1Event is the legacy class.  Depending on what it does with
the Event type, its code is potentially out of date.  We have grounds
to hope it could still run; if it only uses factory methods to obtain
LDT values, and doesn't introduce nulls on its own, then it might
just work.  OTOH, there are lots of corner cases where it might get
itself into trouble.

We sometimes throw AbstractMethodError (AME) after loading a no-longer-valid
implementation of an interface.  The AME is thrown later on, when an affected
API point is called.  Some gross violations (as when something is no longer an
interface) are rejected when the class is loaded.

I am considering various ways to state that Client1Event, despite its
innocent-looking method descriptor, is no longer a valid implementation
of Event after the upgrade of Event.

There are various ways to detect and report the mismatch:

1. Exclude loading the class, on grounds similar to CLC's.
2. Fail to fill the v-table slot for Event.timestamp, leading to AME.
3. Fill the v-table slot with a null-rejecting version of Client1Event.timestamp.

The earlier ways give more decisive diagnostics, but reject workable
code along with unworkable code.  What you quote above is me
suggesting something like #1.  That would give a fast-fail diagnostic.
I think it's a good first experiment to try, although I suppose we are
likely to find it is too harsh.

A more lenient option is #3, because it penalizes unworkable code only
when that code actually produces a null and tries to mix it into an API
that treats values as non-nullable types.  Doing this requires putting an
adapter of some sort into the v-table entry for the override of the
Event.timestamp method in Client1Event.  The adapter has to agree
to call the legacy method, and then null-check its return value.
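
As a rough illustration of what such an adapter does, here is a plain-Java
model.  It is only a sketch at the source level, not how a JVM would build a
v-table entry.  The class NullRejectingAdapter is hypothetical, and the bodies
of Event and Client1Event are assumed, since the use case above elides them.

```java
import java.time.LocalDateTime;
import java.util.Objects;

// Library interface from the quoted use case.
interface Event {
    LocalDateTime timestamp();
}

// Legacy implementation (body assumed): compiled when LocalDateTime was
// still a nullable reference class, so it may hand back null.
class Client1Event implements Event {
    private final LocalDateTime when;

    Client1Event(LocalDateTime when) {
        this.when = when;
    }

    @Override
    public LocalDateTime timestamp() {
        return when; // may be null
    }
}

// Hypothetical model of the option #3 adapter: call the legacy method,
// then null-check its return value before it reaches upgraded callers.
final class NullRejectingAdapter implements Event {
    private final Event legacy;

    NullRejectingAdapter(Event legacy) {
        this.legacy = legacy;
    }

    @Override
    public LocalDateTime timestamp() {
        return Objects.requireNonNull(
                legacy.timestamp(),
                "legacy implementation returned null for a value type");
    }
}
```

In this model, a legacy class that never produces a null runs unchanged, and
one that does produce a null fails with an NPE only at the call that tries to
pass the null along.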

I think I prefer #3 to #1, FTR, although I'd like to get away with #1.

Tobias points out that we seem to need two entry points for methods with
optimized calling sequences, one for when all callers and callees agree
on flattening, and another for use in contexts where close coupling of calling
sequences is not desired or not possible.  This may include lambda forms,
reflection, and/or the interpreter.  So #3 has an efficient implementation in
that setup, where Client1Event.timestamp has a non-flattened entry point
for legacy code to call, and one with the upgraded calling sequence, for use
in the v-table.  The flattened entry point simply null-checks the return value;
otherwise it is identical to the non-flattened one.

In the above quoted text, I say, "This can be handled by having the JVM
create multiple adapters".  The reason I want to avoid that, even at the
cost of the harshness of #1, is that a function whose descriptor mentions
N different value types has up to 2^N different calling sequences.
I'd rather pick the one that best fits the API that defines
the method which creates the v-table slot, and define at most one more,
which boxes everything and is used in all cases where the preferred
calling sequence won't work (such as legacy code wanting to pass null).
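
To make the 2^N count concrete, the following sketch enumerates every
flattened-vs-boxed choice for a list of value type names.  The class
CallingSequences and its output format are made up for illustration; they
assume only that each value type in a descriptor is independently either
flattened or boxed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists each possible calling sequence for a
// descriptor that mentions the given value types.
final class CallingSequences {
    static List<String> enumerate(List<String> valueTypes) {
        List<String> out = new ArrayList<>();
        int n = valueTypes.size();
        // Each bit of the mask decides one type: 1 = flattened, 0 = boxed.
        for (int mask = 0; mask < (1 << n); mask++) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                boolean flat = (mask & (1 << i)) != 0;
                sb.append(valueTypes.get(i)).append(flat ? ":flat" : ":boxed");
            }
            out.add(sb.toString());
        }
        return out;
    }
}
```

For three value types this yields eight sequences.  The proposal above keeps
only two of them: the preferred one chosen by the API that defines the
v-table slot, and the all-boxed one.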

It's natural to ask, "Why can't we all be friends here?"  Maybe we could
allow legacy code free use of nulls in all APIs which mention value types,
even new APIs.  Then legacy code will never experience a NPE even
when using upgraded value type values.

I'm pretty sure this option would be much more expensive to implement
than the previous options (1/2/3).  All optimized v-table calling sequences
would be speculative and tentative, flattening only until the first null
shows up, and deoptimizing after that, or at least using data-dependent
slow paths.  And the expense would leak through to users, because bad
actors (who refuse to upgrade) would slow down all code using the
new APIs.

(Reminder:  v-table slot refers to an overridable method descriptor
in a particular class which first mentions that descriptor.  It is a JVM level
concept but is portable across JVMs.  Calling sequences are easier to
optimize in other settings, because we know exactly which method we
are calling.  But v-table slots mediate virtual and interface methods,
which must agree on a common calling sequence for any given v-table
slot.)

— John