[lworld] Handling of missing ValueTypes attributes

Tue Jul 10 19:55:09 UTC 2018

Thanks, Karen; I blundered into talking about post-LW1 intentions
when the questions were about LW1.  So, in LW1, our intention is
to cut off, rather than adapt, mismatched accesses.  This means
lying in wait at v-table preparation time and field and method linkage
time, throwing an ICCE if anyone tries to set up a mismatched
access.

Here are a few more comments, mainly to make sure I am saying
the right things, and perhaps to throw some light in dark corners…

The usual form of a mismatched access is a method-to-method call,
but it can also be a field reference or a use of a class from a bytecode
like instanceof or anewarray.

Related to method-to-method calls, there is also v-table slot packing,
which is how method overriding is linked physically in HotSpot.
A prepared v-table slot is really a redirected call site, to execute later.
(A subclass inherits from its super not only a prefix of instance fields,
but also a prefix of v-table slots; the subclass can extend either.)
That's why we do all kinds of method analysis at preparation time,
and why we can reasonably speak of v-table slots, as well as methods,
as sources and targets of calls.  For me it helps to think of a v-table
slot as a physical entity, like an instance field, but for the whole class.

This trick allows me to visualize override relations between methods.
A v-table slot is inherited into a subclass from a superclass, and
thus v-table slots (in their various locations) are entities with their
own relations and properties.  When I read about "method overrides"
the language is about relations between methods, and so it is
hard then to reason about relations between *those* relations
unless I can visualize them somehow.  Anyway, that's why I keep
going on about v-tables, even though they are "just an implementation
trick".
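
Here is a minimal sketch in Java of the picture I have in mind.  It is a
toy model, not HotSpot code; the names ToySlot, ToyVTable, and VTableDemo
are hypothetical, made up for illustration.  It only shows that a subclass
starts with a copy of its super's slots and then either refills an
inherited slot (an override) or appends a new one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; HotSpot's real v-table lives in the class metadata.
final class ToySlot {
    final String signature;    // which method this slot dispatches, e.g. "m1(Point)"
    final String implementor;  // class whose code currently fills the slot

    ToySlot(String signature, String implementor) {
        this.signature = signature;
        this.implementor = implementor;
    }
}

final class ToyVTable {
    final String owner;
    final List<ToySlot> slots;

    private ToyVTable(String owner, List<ToySlot> slots) {
        this.owner = owner;
        this.slots = slots;
    }

    // A class with no interesting superclass creates all of its own slots.
    static ToyVTable root(String owner, String... declared) {
        return new ToyVTable("", new ArrayList<ToySlot>()).subclass(owner, declared);
    }

    // A subclass starts from a copy of the super's slots (the inherited prefix),
    // then either refills an inherited slot (override) or appends a new one.
    ToyVTable subclass(String name, String... declared) {
        List<ToySlot> copy = new ArrayList<ToySlot>(slots);
        for (String sig : declared) {
            int index = -1;
            for (int i = 0; i < copy.size(); i++) {
                if (copy.get(i).signature.equals(sig)) {
                    index = i;
                    break;
                }
            }
            if (index >= 0) {
                copy.set(index, new ToySlot(sig, name));  // override: same index, new contents
            } else {
                copy.add(new ToySlot(sig, name));         // new method: the table grows
            }
        }
        return new ToyVTable(name, copy);
    }
}

final class VTableDemo {
    public static void main(String[] args) {
        ToyVTable zz = ToyVTable.root("ClassZZ", "m1(Point)", "m2()");
        ToyVTable a = zz.subclass("ClassA", "m1(Point)", "m3()");
        for (int i = 0; i < a.slots.size(); i++) {
            ToySlot s = a.slots.get(i);
            System.out.println(i + ": " + s.signature + " -> " + s.implementor);
        }
    }
}
```

In this toy, ClassA ends up with three slots: slot 0 (m1) refilled by
ClassA, slot 1 (m2) still filled by ClassZZ, and a new slot 2 (m3).
ClassZZ's own table is left untouched.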

On Jul 10, 2018, at 11:24 AM, Karen Kinnear <karen.kinnear at oracle.com> wrote:
> 
> Let me clarify the LW1 proposal. This model is designed to NOT take into account migration of value-based classes, which we will revisit
> much farther downstream. 
> 
> For LW1 - the goal is to fail with an ICCE if there are inconsistencies with value type expectations. The question is when do we catch
> the problem and fail, and do we allow field access and method invocation when the information is inaccurate. 
> 
> Ioi - here is the proposal 
> http://cr.openjdk.java.net/~acorn/value-types-consistency-checking-details.pdf
> As John pointed out in another email - I should update it to describe the ValueTypes attribute - for now see the not-yet-out latest JVMS draft:
> 
> http://cr.openjdk.java.net/~fparain/L-world/L-World-JVMS-4e.pdf
> 
> For local methods there are two parts:
> 1) if a type in a descriptor is in the ValueTypes attribute, we will eagerly load it at preparation time. If it is not a value type, we will throw ICCE.
> 2) if a type in a descriptor is NOT in the ValueTypes attribute: see *5 Method Invocation: VT Point NOT in ValueTypes attribute in the Consistency
> Checking link. Extracted here - let me know if this does not make sense:
> 
> Point is a value type.
> ClassA contains method m1(Point point). Point is NOT in ClassA’s ValueTypes attribute. I think this matches Ioi’s and Tobias’ examples.

(Yes, I think so too, plus a call to m1 from some m2 in another class,
which as you say is case 2 below.)

> case 1: method overriding
> ClassA’s method m1 overrides ClassZZ’s method m1, and Point is in ClassZZ’s ValueTypes attribute. Preparation of ClassA will fail with ICCE when creating the vtable.

ClassA inherits a v-table slot from ClassZZ that wants unboxed Points
(passed as fields) but ClassA's method doesn't fit in that slot, since it
uses Point refs (passed as oop).  Does that sound right?

Note that a calling sequence might not physically unbox a Point,
because of some implementation heuristic or limitation.  However,
the JVM must enforce ref/value consistency even in such a case,
because the JVM needs to give a predictable model for error checks.
If the JVM has the *option* to unbox a value, it needs to protect
that option by throwing errors on inconsistent code even in cases
where it doesn't *exercise* the option.

As a proxy for "could I unbox this Point value?", you can also ask
"could I pass a null for this Point value?"  The two conditions are
mutually exclusive, and the middle ground (of passing as oop
but excluding null) is not visible to the user of the JVM; the user
can't see physical calling sequences inside a virtual machine.

This is why nullability has arisen as a way to test whether a variable
is "really a value" or is just passed by reference.  Nullability *also*
has an independent role in managing certain legacy APIs where
null is used as a sentinel value.  When we work out all the details,
the JVMS might end up talking about identity as a property of
the loaded type itself, but nullability as a property of various
containers of that type.  Both identity and nullability need to be
suppressed, via independent means in LW1, in order to fully
flatten heap data and scalarize method arguments and returns.

(As instance fields and v-table slots are extended in parallel
across a class hierarchy, value type optimizations follow along
both avenues.  There is a deep duality between method parameters
and object fields operating here.)

> ClassD extends ClassA, ClassD’s method m1 overrides ClassA’s m1 and Point is in ClassD’s ValueTypes attribute. Preparation of ClassD will fail with ICCE when creating the vtable due to the mismatch.
> Note ClassA keeps going - I don’t know of a way to bubble up the failure since ClassA may be in use.

So:  ClassA *creates* a v-table slot that wants Point refs.  Then
ClassD tries to fill that slot with a method that wants unboxed Points,
and the mismatch kills ClassD's preparation.

(Note that if A extends ZZ, ClassA already died during preparation,
so ClassD will also die during preparation.  But here ZZ is not present,
so A gets to determine the shape of the v-table slot.)

Alternate scenario:  ClassA inherits a v-table slot for m1 from ClassZZ that
wants unboxed Points, and ClassA does *not* override m1.  ClassA commits
no real foul, since it doesn't try to change the contents of the slot.  I think we
let this pass, since we only look at overrides, but the details are always tricky.
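
To make the three override scenarios above concrete, here is a rough
Java sketch of a preparation-time check.  It is a toy under stated
assumptions, not the LW1 implementation: ToyKlass, prepare, and
slotWantsValue are hypothetical names, and a single boolean stands in
for "Point is listed in this class's ValueTypes attribute".  Only
overrides are checked; a slot that is inherited and left alone passes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: one boolean per class stands in for "Point is in my ValueTypes attribute".
final class ToyKlass {
    final String name;
    final boolean listsPointAsValue;
    // signature -> does the slot's calling sequence treat Point as a value?
    final Map<String, Boolean> slotWantsValue = new HashMap<String, Boolean>();

    private ToyKlass(String name, boolean listsPointAsValue) {
        this.name = name;
        this.listsPointAsValue = listsPointAsValue;
    }

    // Stand-in for v-table preparation: inherit the super's slots, then place
    // each declared method, refusing an override whose view of Point differs.
    static ToyKlass prepare(String name, ToyKlass superKlass,
                            boolean listsPointAsValue, String... declared) {
        ToyKlass k = new ToyKlass(name, listsPointAsValue);
        if (superKlass != null) {
            k.slotWantsValue.putAll(superKlass.slotWantsValue);
        }
        for (String sig : declared) {
            Boolean inherited = k.slotWantsValue.get(sig);
            if (inherited != null && inherited.booleanValue() != listsPointAsValue) {
                throw new IncompatibleClassChangeError(
                    name + "." + sig + " does not fit the inherited v-table slot");
            }
            k.slotWantsValue.put(sig, Boolean.valueOf(listsPointAsValue));
        }
        return k;
    }
}

final class PreparationDemo {
    public static void main(String[] args) {
        ToyKlass zz = ToyKlass.prepare("ClassZZ", null, true, "m1(Point)");
        try {
            ToyKlass.prepare("ClassA", zz, false, "m1(Point)");  // case 1: override mismatch
        } catch (IncompatibleClassChangeError e) {
            System.out.println("ICCE: " + e.getMessage());
        }
        // Alternate scenario: ClassA inherits m1 without overriding it, so nothing is checked.
        ToyKlass quietA = ToyKlass.prepare("ClassA", zz, false);
        System.out.println("prepared " + quietA.name + " without overriding m1");
    }
}
```

Note that a failure while preparing a subclass leaves the superclass's
map alone, which matches Karen's remark that ClassA keeps going when
ClassD fails.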

> case 2:
> ClassB has Point in ValueTypes attribute and invokes ClassA.m1.
>   - ClassB can successfully create an instance of Point
>   - at method resolution time, caller-callee consistency check will throw ICCE, so ClassB can not invoke ClassA.m1
> (Tobias - I think this is your last paragraph “Now another compiled method”)

Here ClassB wants to attach to ClassA's v-table slot for m1.  Just like
m1 itself, the slot has a calling sequence.  In LW1 we aren't messing with
on-the-fly adaptation, so when B tries to link to A.m1 (either the v-table
slot or the method itself), the calling sequences don't align, which causes
an ICCE.
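
A minimal sketch of that linkage-time check, again in Java and again
only a toy: ToyLinker and checkCallerCallee are hypothetical names, and
each class's ValueTypes attribute is modeled as a plain set of type
names.  The check compares what the caller and the callee each believe
about every type in the descriptor; it does not consult the loaded
class itself.

```java
import java.util.Collections;
import java.util.Set;

// Toy model only: a ValueTypes attribute is just a set of class names here.
final class ToyLinker {
    // Stand-in for the caller-callee consistency check at method resolution time.
    static void checkCallerCallee(String caller, Set<String> callerValueTypes,
                                  String callee, Set<String> calleeValueTypes,
                                  String... descriptorTypes) {
        for (String type : descriptorTypes) {
            boolean callerSaysValue = callerValueTypes.contains(type);
            boolean calleeSaysValue = calleeValueTypes.contains(type);
            if (callerSaysValue != calleeSaysValue) {
                throw new IncompatibleClassChangeError(
                    caller + " and " + callee + " disagree about " + type);
            }
        }
    }
}

final class LinkageDemo {
    public static void main(String[] args) {
        Set<String> none = Collections.<String>emptySet();
        Set<String> point = Collections.singleton("Point");

        // A caller that agrees with ClassA (neither lists Point) links fine.
        ToyLinker.checkCallerCallee("ClassC", none, "ClassA", none, "Point");
        System.out.println("ClassC -> ClassA.m1 linked");

        try {
            // Case 2: ClassB lists Point, ClassA does not.
            ToyLinker.checkCallerCallee("ClassB", point, "ClassA", none, "Point");
        } catch (IncompatibleClassChangeError e) {
            System.out.println("ICCE: " + e.getMessage());
        }
    }
}
```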

> case 3:
> ClassC does NOT have Point in ValueTypes attribute and invokes ClassA.m1
>   ClassC is able to invoke ClassA - caller/callee check passed consistency (without checking reality)
>   ClassC can pass a null Point

In that case old code just does the old thing.  One might ask how
the old code gets a reference to the new value, if all the calling sequences
are locked into alignment?  The answer is that value instances are very
mobile in an LW* JVM; they can be laundered through java.lang.Object,
or passed as array or field elements, or through JNI, or through untyped
MH internals.  It's hard to close all the holes.

Thus, masking a Point as an Object allows old code to grab it.  But I think
we do make it pretty difficult for the old code to re-type the thing as a non-value
Point.  In order to do that, it must issue a checkcast (or equivalent), and in
LW1 (as described in Karen's very thorough write-up) a checkcast *does*
perform a consistency check, because it must resolve its CONSTANT_Class,
and those are *always* checked against Reality.  That means the LW1 JVM
will throw ICCE on old code that attempts to cast an Object to a Point (if Point
is a value and the old code didn't know that).  The code for detecting such a
mismatch is here in Fred's patch:

  http://cr.openjdk.java.net/~fparain/VTAttributeChecks/webrev.02/src/hotspot/share/oops/constantPool.cpp.udiff.html

(Karen did I get that right?)
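
As a rough illustration of the difference between holding a Point as
an Object and casting it, here is a toy Java sketch.  It is not the
code in Fred's patch; ToyConstantPool, resolveClass, and checkcast are
hypothetical names, and "Reality" is reduced to a boolean passed in by
the caller.

```java
import java.util.Collections;
import java.util.Set;

// Toy model only: "Reality" is a boolean supplied by the caller of resolveClass.
final class ToyConstantPool {
    final String holder;               // class that owns this constant pool
    final Set<String> valueTypesAttr;  // names listed in the holder's ValueTypes attribute

    ToyConstantPool(String holder, Set<String> valueTypesAttr) {
        this.holder = holder;
        this.valueTypesAttr = valueTypesAttr;
    }

    // Stand-in for resolving a CONSTANT_Class entry: the holder's belief
    // about the target is compared against what the loaded class really is.
    void resolveClass(String target, boolean targetIsReallyValue) {
        boolean holderSaysValue = valueTypesAttr.contains(target);
        if (holderSaysValue != targetIsReallyValue) {
            throw new IncompatibleClassChangeError(
                holder + " is wrong about whether " + target + " is a value type");
        }
    }

    // A checkcast has to resolve its class operand before it can test the reference.
    Object checkcast(Object ref, String target, boolean targetIsReallyValue) {
        resolveClass(target, targetIsReallyValue);
        return ref;
    }
}

final class CheckcastDemo {
    public static void main(String[] args) {
        ToyConstantPool oldCode =
            new ToyConstantPool("OldCode", Collections.<String>emptySet());
        Object laundered = "stand-in for a Point seen as Object";

        // Holding or passing the reference as Object resolves nothing, so no check runs.
        Object stillFine = laundered;
        System.out.println("old code holds: " + stillFine);

        try {
            oldCode.checkcast(laundered, "Point", true);  // Point really is a value
        } catch (IncompatibleClassChangeError e) {
            System.out.println("ICCE: " + e.getMessage());
        }
    }
}
```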

> So - from a compiler perspective - only classes that share the same ValueTypes attribute information about Point can call ClassA.m1.
> 
> From an execution perspective, we would like to ensure that neither ClassC nor ClassA can get their hands on an instance of Point, so
> we are only passing null here. Tried to close all of these holes - could use review.

It's hard to close all the holes, as noted above.  And some holes we don't want
to close:  Putting a Point into a List<Object> or Object[] means old code *can*
grab it and operate on it, as long as the old code doesn't mention the type
Point in its constant pool.  That's an important interoperability property we
want to keep, if we can.

> Note to John - consistency checking for array elements
> relative to ValueTypes attribute is part of closing this hole, i.e. ensuring that a class that has the wrong information about whether a type
> is a value type can not get their hands on an instance of that value type.

OK, that makes sense.  Thanks for explaining.  If we want to allow old code
to make a closer acquaintance with the type Point, then we have to implement
a new kind of "view" of Point, a reference to a buffered value, which is nullable.
And we are avoiding that in LW1, pending further analysis.

If we add nullable references to values in the JVM as a feature post-LW1, then
we need to work out the various conversion and adaptation paths, as well as
decide whether those two views of Point should be spelled differently (as JVM
type descriptors), or spelled the same with contextual interpretation.  If they
are different (say R-Point vs. Q-Point), then we need lots of tech for descriptor
shifting adapters, and we also need to choose which spelling (if either) is the
legacy spelling L-Point.  If they are the same, then it's the legacy spelling all
around, and we need to be very careful to manage the side-channel information
which indicates the Q/R distinction.  I hope to discuss this more at the JVMLS.

> Does this make sense for LW1?

Yes; hopefully my comments are correct also.

> thanks,
> Karen
> 
> p.s. Frederic checked in round 1 of the consistency checking - but did not get round 2 in before he left for vacation. He is out this
> week and the next. I attached his patch out for review if you want to use it to test with to see if that helps the compiler. He will be back I
> believe the 23rd. I will check with Harold on Frederic’s verifier question - we may want to push this (without perhaps the new test) before
> he gets back so you can use it.
> 
> Note - John asked for some refactoring - given how tight the time is before EA and the vacation schedule - that will be a postlw1 rfe.

As I think I said to Fred, that's fine with me.

— John