Moving from VVT to the L-world value types (LWVT)

Frederic Parain frederic.parain at oracle.com
Tue Jan 23 21:25:06 UTC 2018


Hi John,

thank you for the detailed feedback.

The Q-descriptor is not a fundamental part of the proposal; it is just an unsatisfying
way for class files to express their expectations regarding types they think are value
class types (to differentiate them from object class types). Q-descriptors provide this
information but have drawbacks like the signature matching issue.
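
To make the signature matching issue concrete, here is a minimal Java sketch of
comparing a looked-up descriptor against both a method's true descriptor and an
L-ified copy of it. It is only an illustration under stated assumptions: the class
SignatureMatch, the methods lify and classify, and the Match enum are hypothetical
names invented for this example (not HotSpot code), and QPoint; is a made-up
value class descriptor.

```java
// Hypothetical model of the "two signatures" lookup; not HotSpot code.
final class SignatureMatch {
    enum Match { EXACT, LEGACY_CALLER, NONE }

    // Rewrite every Q<name>; element of a descriptor as L<name>;.
    // Class names are copied verbatim, so a 'Q' inside a name is untouched.
    static String lify(String descriptor) {
        StringBuilder out = new StringBuilder(descriptor.length());
        int i = 0;
        while (i < descriptor.length()) {
            char c = descriptor.charAt(i);
            if (c == 'L' || c == 'Q') {
                int end = descriptor.indexOf(';', i);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class type: " + descriptor);
                }
                out.append('L').append(descriptor, i + 1, end + 1);
                i = end + 1;
            } else {
                out.append(c); // primitive, array prefix, or parenthesis
                i++;
            }
        }
        return out.toString();
    }

    // Compare the looked-up descriptor with the true and the L-ified one.
    static Match classify(String lookedUp, String trueDescriptor) {
        if (lookedUp.equals(trueDescriptor)) {
            return Match.EXACT;
        }
        if (lookedUp.equals(lify(trueDescriptor))) {
            return Match.LEGACY_CALLER; // legacy code invoking migrated code
        }
        return Match.NONE;
    }

    public static void main(String[] args) {
        String trueDescriptor = "(QPoint;I)QPoint;";
        System.out.println(classify("(QPoint;I)QPoint;", trueDescriptor));
        System.out.println(classify("(LPoint;I)LPoint;", trueDescriptor));
        System.out.println(classify("(LPoint;J)LPoint;", trueDescriptor));
    }
}
```

The LEGACY_CALLER case is where the extra work at invocation time would be needed,
which is the cost that an approach without Q-descriptors avoids.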

Remi’s proposal is appealing because it avoids the signature matching issue.
An attribute is not the most convenient data structure for the JVM, but we can
record the information elsewhere in our meta-data. However, it seems more
brittle because the attribute can easily be omitted, unless we make it mandatory
after a given class file format number, with a slightly different syntax where all
classes named in the class file have to be listed, so it can be verified. For
older class file formats, the attribute would be absent and all classes would be
assumed to be object classes.

We had two brainstorming sessions, yesterday and this morning, trying to figure
out what would be the consequences of having only L-descriptors, with class
files having different assumptions regarding the real nature of a type (object class
or value class), either in the case of VBC migration or simply because of separate
compilation. Some issues are related to the calling/returning conventions for the
JIT compiled code. Some other issues are related to the class loader constraints,
and the fact that a class with the wrong assumption regarding the nature of a class
might prevent the real class from being loaded. The case where a class expects
a Value Based Class (object class type) and the class is in fact a migrated value
class seems to be OK. The case where a class expects a value class, but the
class loader loads an object class seems much more problematic to us.

Regarding the migration of value based classes, trying to prevent null references
from leaking into migrated code seems to be a step too far. We reviewed the issue with
Karen this morning, and it doesn’t seem too dangerous to only check for null
when the reference is stored in a field or array expecting an instance of a value
class.
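
The store-time check described above can be modelled in plain Java. This is a
sketch under assumptions, not JVM code: NullOnStoreModel and ValueSlotArray are
hypothetical names, String merely stands in for a migrated value class, and the
default-value fill is my assumption about how such a slot would start out.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical user-level model of "check for null only on stores".
final class NullOnStoreModel {

    // Stand-in for an array whose elements are expected to be value instances.
    static final class ValueSlotArray<T> {
        private final Object[] slots;

        ValueSlotArray(int length, T defaultValue) {
            slots = new Object[length];
            // Assumption: slots start with a non-null default value.
            Arrays.fill(slots, Objects.requireNonNull(defaultValue, "defaultValue"));
        }

        // The NPE is raised here, at the store, not where the null was created.
        void store(int index, T value) {
            slots[index] = Objects.requireNonNull(value, "null stored into value slot");
        }

        @SuppressWarnings("unchecked")
        T load(int index) {
            return (T) slots[index];
        }
    }

    public static void main(String[] args) {
        ValueSlotArray<String> array = new ValueSlotArray<>(2, "");
        String local = null; // a null in a local variable is tolerated
        try {
            array.store(0, local); // rejected only when it reaches the slot
        } catch (NullPointerException e) {
            System.out.println("NPE at store: " + e.getMessage());
        }
    }
}
```

The point of the model is only the placement of the check: locals and expression
stack slots may carry the null, and the failure is deferred until the store.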

Thank you,

Fred


> On Jan 19, 2018, at 23:22, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Jan 16, 2018, at 12:56 PM, Frederic Parain <frederic.parain at oracle.com> wrote:
>> 
>> Here’s an attempt to bootstrap the L-world exploration, where java.lang.Object
>> is the top type of all value classes (as discussed during the November meetings
>> in Burlington).
> 
> This is excellent work, Frederic; thank you.  I'm really hopeful that we
> are on the right track.
> 
>> ...
>> Here’s a quick summary of the changes with some consequences on the HotSpot code:
>> - all v-bytecodes are removed except vdefault and vwithfield
> 
> At some point we may want to strip the v-prefix from those survivors.  No hurry.
> 
>> - all bytecodes operating on an object receiver are updated to support values as well,
>>   except putfield and new
> 
> Yep.
> 
>> - single carrier type for both instances of object classes and instances of value classes
>> - this carrier type maps to the T_OBJECT BasicType
>> - T_VALUETYPE still exists but its usage is limited (same purpose as T_ARRAY)
> 
> T_ARRAY can be a confusing source of bugs.  I've always wondered if it was worth it.
> 
>> - qtos TosState is removed
>> - JNI: the jobject type can be used to carry either a reference to an object or an
>>          array or a value. The type jvaluetype, sub-type of jobject, is used when only
>>          a value class instance is expected
>> - Q…; remains the way to encode value classes in signature (fields and methods)
> 
> I'd like to move towards an ACC_VALUE bit on both fields and classes.
> Again, no hurry, but (as in my previous message) I'd like to retire Q-descriptors.
> 
>> - In the constant pool, the CONSTANT_CLASS_info entry type is used to store a
>>  symbolic reference to either an object class or a value class
>> - the ;Q escape sequence is not used anymore in value class names
>> 
>> 
>> One important point of this exercise is to ensure that the migration of Value Based Classes
>> into Value Classes is possible, and doable with a reasonable complexity and costs. In addition
>> to the JVMS update (and consistent with the JVMS modifications), here’s a set of proposals
>> on how to deal with the VBC migration. 
> 
> I'm glad you are doing this analysis, not only because VBC migration is
> a wonderful goal, but also because I think the same analysis is necessary
> just to manage separate recompilation, even if we never decided to
> migrate a single class.
> 
> In short, I see you are leaning hard on Q-descriptors, but I don't think
> you are getting enough value out of them, and they cause serious
> problems.  More comments below… 
> 
>> 
>> Migration of Value Based Classes into Value Classes:
>> - challenges:
>>     - signature mismatch
> 
> Goes away when/if we retire Q-descriptors!
> 
>>     - null
> 
> Can be dealt with by assuming non-null and throwing dynamic NPEs
> as needed where Q types are in play.  Also, we tolerate "polluting nulls"
> along paths where the Q/R distinction is not available, even if (at some
> point later on) we realize that it was a Q all along.  Eventually, the
> polluting null will cause an NPE.
> 
> (In my view, the NPE should happen later than one might prefer if it were
> a true coding error rather than a recompilation artifact.  Catching polluting
> nulls early in the presence of recompilation requires too many heroics.)
> 
>>     - change in behavior
> 
> Yes, that's the tricky part.
> 
>> - proposal for signature mismatch:
>>      - with LWVT, value class types in signatures are using the Q…; format
>>      - legacy code is using signature with L…; format (because VBC are object classes)
>>      - methods will have two signatures:
>>        - true signature, which could include Q…; elements 
>>        - a L-ified signature where all Q…; elements are re-written with the L…; format
>>        - method lookup still works by signature string comparisons
>>        - the signature of the method being looked up will be compared against both the
>>          true and the L-ified signatures, if the looked up signature matches the L-ified
>>          signature but not the true signature, it means a situation where legacy code
>>          is trying to invoke migrated code has been detected, and additional work might
>>          be required for the invocation (actions to be taken have to be defined)
>>       - signature mismatch can also occur for fields, this is still being investigated, the
>>         proposal will be updated as soon as we have a solution ready to be published
> 
> This sort of thing is, for me, a rich argument against keeping Q-descriptors.
> 
>> - proposal for null references leaking to migrated code
>>     - having a null reference for a Value Based Class variable or field is valid in legacy code
>>       but it becomes invalid when the Value Based Class has been migrated to a Value Class
>>     - trying to prevent all references with a value class type to get a null value would be very
>>       expensive (it would require to look at the stackmap for each assignment to a local variable)
> 
> Yes.  We have to tolerate polluting nulls where the Q/R distinction is unavailable.
> 
>>    -  the proposed solution is to allow null references for local variable and expression stack slots,
>>       but forbid them for fields or array elements (bytecodes operating on fields and array have to
>>       be updated to throw a NPE whenever a null reference is provided instead of a value class
>>       instance)
> 
> Yes, I think this is on the right track.  On paths where a Q-type is needed
> we do a null check.  That's the Java way.
> 
>>    - null references are likely to be an issue for JIT optimizations like passing values in registers
>>      when a method is invoked. The proposed solution is to only allow null references for value classes
>>      in legacy code, by detecting them and blocking them when leaking to migrated code. The
>>      detection can be done at invocation time, when a mismatch between the signature expected
>>     by the caller and the real signature of the callee is detected (see signature mismatch proposal above)
> 
> At some point, a polluting null might reach code that "knows" there is a Q type
> (and may even "know" that it goes in an xmm register).  That's the point where
> an NPE should be thrown.  In some cases, a deopt might be appropriate, to
> correctly order the NPE by executing interpreter code.
> 
> Note that this combination of techniques does not require Q-descriptors.  The lack
> of Q-descriptors doesn't totally destroy the Q/R distinction; it just means you
> have to execute a little further before you get to code which "knows" that
> the null is illegal.
> 
>>   - the null reference should also be detected and blocked when it is used as a return value and the
>>     type of the value to be returned is a value class type 
> 
> Doing this requires (a) Q-descriptors in method returns, (b) Remi's
> ValueTypes table, or (c) toleration of nulls in the interpreter.  (The JIT
> doesn't have to tolerate nulls:  It can deopt if it hits a surprise null,
> or perhaps throw an early NPE.)  So, I am arguing for (c).
> 
>> In addition to the JVMS update, here’s a chart trying to summarize the new checks that will have to
>> be added to existing bytecodes when moving the vbytecodes semantics into a* bytecodes. The categories
>> in the chart are not very precise, but we can use it as a starting point for our discussions. The chart
>> can also help defining which experiments could be done to estimate the costs of the different additional
>> checks needed to be added to existing bytecodes.
> 
> The chart is really helpful, thanks.  More comments later.
> 
> Onward!
> 
> — John
> 
> 


