RFR (L) JDK-8230199: consolidate signature parsing code in HotSpot sources
Lois Foltan
lois.foltan at oracle.com
Tue Feb 4 18:42:54 UTC 2020
On 2/3/2020 11:07 PM, David Holmes wrote:
> Hi Lois,
>
> On 4/02/2020 7:39 am, Lois Foltan wrote:
>> On 2/3/2020 12:18 AM, David Holmes wrote:
>>> Hi Lois,
>>>
>>> Sorry for the delay in looking at this. This is a nice set of
>>> consolidations and simplifications. I few mostly minor
>>> comments/queries below.
>>
>> Hi David,
>>
>> No problem. Thank you for reviewing! Comments interspersed below.
>
> Responses also below.
>
>> Updated webrev that addresses your concerns below is at:
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.3/webrev/index.html
>
> Thanks. I didn't spot any further issues.
>
>>>
>>> On 29/01/2020 6:50 am, Lois Foltan wrote:
>>>> On 1/27/2020 4:36 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Lois, So many nice cleanups in this patch!
>>>>
>>>> Thanks for your review Coleen! Responses interspersed below. New
>>>> webrev is at:
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.2/webrev/index.html
>>>
>>> In the sharedRuntime_<cpu> changes I'm curious about the change to
>>> the assertion:
>>>
>>> - assert(in_sig_bt[i] == ss.type(), "must match");
>>> + assert(in_sig_bt[i] == ss.type() ||
>>> + in_sig_bt[i] == T_ARRAY, "must match");
>>>
>>> That would seem to be an existing bug - no? Given the code is:
>>>
>>> 2001 if (in_sig_bt[i] == T_ARRAY) {
>>> ...
>>> 2009 } else {
>>> 2010 out_sig_bt[argc++] = in_sig_bt[i];
>>> 2011 in_elem_bt[i] = T_VOID;
>>> 2012 }
>>> 2013 if (in_sig_bt[i] != T_VOID) {
>>> 2014 assert(in_sig_bt[i] == ss.type() ||
>>> 2015 in_sig_bt[i] == T_ARRAY, "must match");
>>>
>>> then we mustn't have any tests that actually exercise this otherwise
>>> we would fail the assert! Or does ss.type() now behave differently
>>> to before?
>>
>> The call to ss.skip_array_prefix(1) ahead of the ss.is_primitive()
>> assert does cause ss.type() to return the underlying type, not the
>> original T_ARRAY. Thus the following assert at line #2014 in the
>> code sample above, in_sig_bt[i] would not equal ss.type() in the
>> array case, so that check had to be added.
>
> Ah I see. Thanks for clarifying.
>
>>>
>>> 2008 // else what is in_elem_bt[i]?
>>>
>>> This comment/question doesn't seem to serve any purpose. If we think
>>> there couldn't possibly be anything other than an array of
>>> primitives then we should probably add some form of guarantee - just
>>> as the original code caught that case with ShouldNotReachHere.
>>
>> I've removed the comment. In a previous webrev I added the
>> ss.is_primitive() assert based on Coleen's feedback.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/classfile/systemDictionary.cpp
>>>
>>> Minor nit:
>>>
>>> 2181 Symbol* obj_class = ss.as_symbol();
>>> 2182 klass = constraints()->find_constrained_klass(obj_class,
>>> class_loader);
>>>
>>> You introduced the local obj_class but it is only used once and so
>>> it not necessary. (In other places you removed use-once locals.)
>>
>> Fixed.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/classfile/verificationType.cpp
>>>
>>> Observation: The switch in get_component seems to indicate a missing
>>> type2name style conversion helper.
>>>
>>> 134 case T_ARRAY:
>>> 135 case T_OBJECT: {
>>> 136 guarantee(ss.is_reference(), "unchecked verifier input?");
>>> 137 Symbol* component = ss.as_symbol();
>>>
>>>
>>> Is it that case as_symbol returns a class name without envelope, but
>>> an array type name with envelop? I'm trying to reconcile the old and
>>> new code where T_ARRAY and T_OBJECT were handled distinctly.
>>
>> That is correct except for your use of terminology. An array type
>> does not have an "envelope". Only a class can have an envelope
>> defined as a leading 'L' and ending ';'. For a class,
>> SignatureStream::as_symbol() returns a Symbol* without envelope (see
>> calls to raw_symbol_begin and raw_symbol_end at the start of
>> SignatureStream::find_symbol()). For an array,
>> SignatureStream::as_symbol() will leave the leading bracket, '['. And
>> this behavior matches the current code in verificationType.cpp which
>> calls create_temporary_symbol with a '1' begin position for an array
>> but a '2' begin position for a class.
>
> Thanks for clarifying - the 1 and 2 offsets confused me in the
> original code because I forgot we were already dealing with an array
> whose component type may then be an array or a class. So given for
> example an initial type:
>
> [[LFoo; // Foo[][]
>
> we produce [LFoo;
>
> but given:
>
> [LFoo; // Foo[]
>
> we produce Foo
>
> In the new code I'd overlooked the effects of ss.skip_array_prefix(1).
>
>>>
>>> ---
>>>
>>> src/hotspot/share/oops/constMethod.cpp
>>>
>>> ! set_result_type((BasicType)0);
>>>
>>> This change seems a little odd. Why 0? What about T_ILLEGAL if not
>>> intended to be a real type?
>>
>> At the point a ConstMethod is constructed, the result type of the
>> method has not yet been determined. So just like setting the result
>> type to T_VOID, T_ILLEGAL would also not be correct since this could
>> in the end be a valid result type. See the assert in
>> ConstMethod::result_type(), we want that to trigger if the result
>> type is either never set or incorrectly set.
>
> Okay. Maybe we should add "T_UNINITIALIZED = 0" to the enum?
I like this idea but am going to open a follow on RFE for it because it
could be applicable to other places in the code as well and we should
cover them all if we go forward with it.
>
>>>
>>> ---
>>>
>>> src/hotspot/share/prims/methodHandles.cpp
>>>
>>> 549 case T_VOID:
>>> 550 case T_INT: case T_LONG: case T_FLOAT: case T_DOUBLE:
>>>
>>> This is an inconsistent and unusual style. One case per line is the
>>> common norm.
>>
>> Fixed.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/runtime/signature.hpp
>>>
>>> 57 // The primitive field types (like 'I') correspond 1-1 with type
>>> codes
>>> 58 // (like T_INT) which part of the specfication of the 'newarray'
>>> 59 // instruction (JVMS 6.5, section on newarray).
>>>
>>> There is a word missing - perhaps "which form part of ..." ?
>>>
>>> 73 // "envelope" syntax which starts with 'L' (or 'Q') and ends
>>> with ';'.
>>>
>>> Can we add 'Q' when/if we get Q-types please :)
>>>
>>> 101 // determine if it (in fact) cannot be a class name.
>>>
>>> The use of "(in fact)" seems a little odd, and elsewhere is used
>>> without parentheses.
>>>
>>> 389 case T_BYTE: case T_SHORT: case T_INT:
>>> 390 case T_BOOLEAN: case T_CHAR:
>>>
>>> Odd case style again.
>>>
>>> 397 #ifdef _LP64
>>> 398 pass_double(); _jni_offset++; _offset += 2;
>>> 399 #else
>>> 400 pass_double(); _jni_offset += 2; _offset += 2;
>>> 401 #endif
>>>
>>> How about:
>>> int jni_offset = LP_64_ONLY(1) NOT_LP64(2)
>>> pass_double(); _jni_offset += jni_offset; _offset += 2;
>>>
>>> Ditto for T_LONG case.
>>
>> All above fixed in signature.hpp.
>>
>>
>>>
>>> 528 Symbol* as_symbol() {
>>> 529 return find_symbol();
>>> 530 }
>>>
>>> Is the new as_symbol equivalent to the old as_symbol_or_null() ? It
>>> doesn't appear to allow NULL return yet there are callsites that
>>> still check the result for NULL (ie nmethod.cpp, method.cpp,
>>> signature.cpp).
>>>
>>> Do we no longer need an allocation constrained version of as_symbol() ?
>>
>> We no longer need an allocation constrained version. In current
>> code, there are only 2 calls that use as_symbol_or_null. In
>> nmethod::print_nmethod_labels() - in this case I would think you
>> would want the type of the parameter to be printed even if it wasn't
>> in the symbol table. And in
>> Method::has_unloaded_classes_in_signature - in this case I think it
>> is more accurate to hinge the determination on whether a class is
>> unloaded or not on an actual call to SystemDictionary::find and not
>> on whether it is in the symbol table.
>
> Sorry I'm still unclear whether as_symbol() can ever return NULL? If
> it can't then the existing NULL checks should be removed. If it can
> then many other uses may need to add NULL checks.
Signature::as_symbol can never return a NULL. I have removed the 2
checks for null in nmethod::print_nmethod_labels() and
Method::has_unloaded_classes_in_signature(). See
http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.4/webrev/index.html
Thanks,
Lois
>
> Thanks,
> David
>
>> Thanks,
>> Lois
>>
>>>
>>> ---
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/cpu/sparc/sharedRuntime_sparc.cpp.frames.html
>>>>>
>>>>>
>>>>> 1910 ss.skip_array_prefix(1); // skip one '['
>>>>> 1911 if (ss.is_primitive())
>>>>> 1912 in_elem_bt[i] = ss.type();
>>>>> 1913 // else what is in_elem_bt[i]?
>>>>>
>>>>> This code is for the not-well kept secret JavaCritical, where the
>>>>> native function can only take an array of primitives (checked
>>>>> elsewhere that I can't find right now). So I believe this code
>>>>> can just assert ss.is_primitive() for all these cases. Only code
>>>>> on solaris sparc uses this feature, so you can run tests on sparc
>>>>> to verify.
>>>>
>>>> Done. Changed all to:
>>>>
>>>> out_sig_bt[argc++] = T_INT; out_sig_bt[argc++] = T_ADDRESS;
>>>> ss.skip_array_prefix(1); // skip one '[' assert(ss.is_primitive(),
>>>> "primitive type expected"); in_elem_bt[i] = ss.type(); // else what
>>>> is in_elem_bt[i]?
>>>>
>>>> Will run a sparc build & test.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/classfile/systemDictionary.cpp.frames.html
>>>>>
>>>>>
>>>>> 2207 if (!Signature::is_array(class_name)) {
>>>>>
>>>>>
>>>>> Did you try adding SignatureStream before this, to avoid the
>>>>> Symbol refcounting madness?
>>>>
>>>> Yes I did try that. However, the issue with that approach is that
>>>> you may have a non signature string in the case where it is not an
>>>> array. This is the condition of the first part of the if statement
>>>> on line #2207. Here you can just simply use the Symbol* as the
>>>> constraint name. If you try to construct a SignatureStream with
>>>> this non-signature Symbol* a crash occurs immediately.
>>>> SignatureStream assumes the Symbol* is a field or method signature
>>>> not just a plain name for example something like "java/lang/String"
>>>> without a leading 'L' or a leading '('.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/jvmci/jvmciCompilerToVM.hpp.frames.html
>>>>>
>>>>>
>>>>> hpp files shouldn't include an inline.hpp file. Can this include
>>>>> oop.hpp instead?
>>>>
>>>> I removed the inclusion of oop.inline.hpp and removed the following
>>>> assert in JavaArgumentUnboxer::do_type().
>>>>
>>>> 161 assert(arg->klass() == SystemDictionary::box_klass(type),
>>>> "already checked");
>>>>
>>>> It turns out that in the statement prior to this assert, a call is
>>>> made to JavaArgumentUnboxer::next_arg(type) which already does the
>>>> check for us. Thus the "already checked" comment. No need for a
>>>> duplicate assert.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/oops/symbol.cpp.frames.html
>>>>>
>>>>>
>>>>> Should SignatureStream arguments be const references? Or should
>>>>> they be written to?
>>>>
>>>> I was able to add a const to the SignatureStream& parameter for the
>>>> static method print_class(). For print_array(), however, I was not
>>>> able to since ss.skip_array_prefix() does write to
>>>> SignatureStream::_type field the type after skipping the array
>>>> brackets.
>>>>
>>>>>
>>>>> This looks really good!
>>>>
>>>> Thanks again!
>>>> Lois
>>>>
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 1/24/20 4:41 PM, Lois Foltan wrote:
>>>>>> Please review the following enhancement to consolidate signature
>>>>>> parsing code in Hotspot sources. This change removes duplicate
>>>>>> blocks of code that parse field or method signatures, provides a
>>>>>> new Signature class to support basic signature queries on Symbol
>>>>>> operands and enhances the SignatureStream class to parse field
>>>>>> signatures in addition to methods.
>>>>>>
>>>>>> open webrev
>>>>>> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/
>>>>>> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/>
>>>>>> bug link: https://bugs.openjdk.java.net/browse/JDK-8230199
>>>>>> contributed-by: Lois Foltan, John Rose
>>>>>>
>>>>>> Testing: hs-tier1-8, jdk-tier1-2
>>>>>>
>>>>>> Thanks,
>>>>>> Lois
>>>>>
>>>>
>>
More information about the hotspot-runtime-dev
mailing list