RFR (L) JDK-8230199: consolidate signature parsing code in HotSpot sources

Lois Foltan lois.foltan at oracle.com
Tue Feb 4 18:42:54 UTC 2020


On 2/3/2020 11:07 PM, David Holmes wrote:
> Hi Lois,
>
> On 4/02/2020 7:39 am, Lois Foltan wrote:
>> On 2/3/2020 12:18 AM, David Holmes wrote:
>>> Hi Lois,
>>>
>>> Sorry for the delay in looking at this. This is a nice set of 
>>> consolidations and simplifications. I few mostly minor 
>>> comments/queries below.
>>
>> Hi David,
>>
>> No problem.  Thank you for reviewing!  Comments interspersed below. 
>
> Responses also below.
>
>> Updated webrev that addresses your concerns below is at: 
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.3/webrev/index.html
>
> Thanks. I didn't spot any further issues.
>
>>>
>>> On 29/01/2020 6:50 am, Lois Foltan wrote:
>>>> On 1/27/2020 4:36 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Lois,  So many nice cleanups in this patch!
>>>>
>>>> Thanks for your review Coleen!  Responses interspersed below. New 
>>>> webrev is at: 
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.2/webrev/index.html
>>>
>>> In the sharedRuntime_<cpu> changes I'm curious about the change to 
>>> the assertion:
>>>
>>> -        assert(in_sig_bt[i] == ss.type(), "must match");
>>> +        assert(in_sig_bt[i] == ss.type() ||
>>> +               in_sig_bt[i] == T_ARRAY, "must match");
>>>
>>> That would seem to be an existing bug - no? Given the code is:
>>>
>>> 2001       if (in_sig_bt[i] == T_ARRAY) {
>>> ...
>>> 2009       } else {
>>> 2010         out_sig_bt[argc++] = in_sig_bt[i];
>>> 2011         in_elem_bt[i] = T_VOID;
>>> 2012       }
>>> 2013       if (in_sig_bt[i] != T_VOID) {
>>> 2014         assert(in_sig_bt[i] == ss.type() ||
>>> 2015                in_sig_bt[i] == T_ARRAY, "must match");
>>>
>>> then we mustn't have any tests that actually exercise this otherwise 
>>> we would fail the assert! Or does ss.type() now behave differently 
>>> to before?
>>
>> The call to ss.skip_array_prefix(1) ahead of the ss.is_primitive() 
>> assert does cause ss.type() to return the underlying  type, not the 
>> original T_ARRAY.  Thus the following assert at line #2014 in the 
>> code sample above, in_sig_bt[i] would not equal ss.type() in the 
>> array case, so that check had to be added.
>
> Ah I see. Thanks for clarifying.
>
>>>
>>> 2008         // else what is in_elem_bt[i]?
>>>
>>> This comment/question doesn't seem to serve any purpose. If we think 
>>> there couldn't possibly be anything other than an array of 
>>> primitives then we should probably add some form of guarantee - just 
>>> as the original code caught that case with ShouldNotReachHere.
>>
>> I've removed the comment.  In a previous webrev I added the 
>> ss.is_primitive() assert based on Coleen's feedback.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/classfile/systemDictionary.cpp
>>>
>>> Minor nit:
>>>
>>> 2181       Symbol* obj_class = ss.as_symbol();
>>> 2182       klass = constraints()->find_constrained_klass(obj_class, 
>>> class_loader);
>>>
>>> You introduced the local obj_class but it is only used once and so 
>>> it not necessary. (In other places you removed use-once locals.)
>>
>> Fixed.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/classfile/verificationType.cpp
>>>
>>> Observation: The switch in get_component seems to indicate a missing 
>>> type2name style conversion helper.
>>>
>>>  134     case T_ARRAY:
>>>  135     case T_OBJECT: {
>>>  136       guarantee(ss.is_reference(), "unchecked verifier input?");
>>>  137       Symbol* component = ss.as_symbol();
>>>
>>>
>>> Is it that case as_symbol returns a class name without envelope, but 
>>> an array type name with envelop? I'm trying to reconcile the old and 
>>> new code where T_ARRAY and T_OBJECT were handled distinctly.
>>
>> That is correct except for your use of terminology.  An array type 
>> does not have an "envelope".  Only a class can have an envelope 
>> defined as a leading 'L' and ending ';'.   For a class, 
>> SignatureStream::as_symbol() returns a Symbol* without envelope (see 
>> calls to raw_symbol_begin and raw_symbol_end at the start of 
>> SignatureStream::find_symbol()).  For an array, 
>> SignatureStream::as_symbol() will leave the leading bracket, '['. And 
>> this behavior matches the current code in verificationType.cpp which 
>> calls create_temporary_symbol with a '1' begin position for an array 
>> but a '2' begin position for a class.
>
> Thanks for clarifying - the 1 and 2 offsets confused me in the 
> original code because I forgot we were already dealing with an array 
> whose component type may then be an array or a class. So given for 
> example an initial type:
>
> [[LFoo;  // Foo[][]
>
> we produce [LFoo;
>
> but given:
>
> [LFoo;  // Foo[]
>
> we produce Foo
>
> In the new code I'd overlooked the effects of ss.skip_array_prefix(1).
>
>>>
>>> ---
>>>
>>> src/hotspot/share/oops/constMethod.cpp
>>>
>>> !   set_result_type((BasicType)0);
>>>
>>> This change seems a little odd. Why 0? What about T_ILLEGAL if not 
>>> intended to be a real type?
>>
>> At the point a ConstMethod is constructed, the result type of the 
>> method has not yet been determined.  So just like setting the result 
>> type to T_VOID, T_ILLEGAL would also not be correct since this could 
>> in the end be a valid result type.  See the assert in 
>> ConstMethod::result_type(), we want that to trigger if the result 
>> type is either never set or incorrectly set.
>
> Okay. Maybe we should add "T_UNINITIALIZED = 0" to the enum?

I like this idea but am going to open a follow on RFE for it because it 
could be applicable to other places in the code as well and we should 
cover them all if we go forward with it.

>
>>>
>>> ---
>>>
>>> src/hotspot/share/prims/methodHandles.cpp
>>>
>>>  549     case T_VOID:
>>>  550     case T_INT: case T_LONG: case T_FLOAT: case T_DOUBLE:
>>>
>>> This is an inconsistent and unusual style. One case per line is the 
>>> common norm.
>>
>> Fixed.
>>
>>>
>>> ---
>>>
>>> src/hotspot/share/runtime/signature.hpp
>>>
>>> 57 // The primitive field types (like 'I') correspond 1-1 with type 
>>> codes
>>>  58 // (like T_INT) which part of the specfication of the 'newarray'
>>>  59 // instruction (JVMS 6.5, section on newarray).
>>>
>>> There is a word missing - perhaps "which form part of ..." ?
>>>
>>>  73 // "envelope" syntax which starts with 'L' (or 'Q') and ends 
>>> with ';'.
>>>
>>> Can we add 'Q' when/if we get Q-types please  :)
>>>
>>> 101   // determine if it (in fact) cannot be a class name.
>>>
>>> The use of "(in fact)" seems a little odd, and elsewhere is used 
>>> without parentheses.
>>>
>>>  389     case T_BYTE: case T_SHORT: case T_INT:
>>>  390     case T_BOOLEAN: case T_CHAR:
>>>
>>> Odd case style again.
>>>
>>>  397 #ifdef _LP64
>>>  398       pass_double(); _jni_offset++; _offset += 2;
>>>  399 #else
>>>  400       pass_double(); _jni_offset += 2; _offset += 2;
>>>  401 #endif
>>>
>>> How about:
>>>   int jni_offset = LP_64_ONLY(1) NOT_LP64(2)
>>>   pass_double(); _jni_offset += jni_offset; _offset += 2;
>>>
>>> Ditto for T_LONG case.
>>
>> All above fixed in signature.hpp.
>>
>>
>>>
>>>  528   Symbol* as_symbol() {
>>>  529     return find_symbol();
>>>  530   }
>>>
>>> Is the new as_symbol equivalent to the old as_symbol_or_null() ? It 
>>> doesn't appear to allow NULL return yet there are callsites that 
>>> still check the result for NULL (ie nmethod.cpp, method.cpp, 
>>> signature.cpp).
>>>
>>> Do we no longer need an allocation constrained version of as_symbol() ?
>>
>> We no longer need an allocation constrained version.  In current 
>> code, there are only 2 calls that use as_symbol_or_null.  In 
>> nmethod::print_nmethod_labels() - in this case I would think you 
>> would want the type of the parameter to be printed even if it wasn't 
>> in the symbol table.  And in 
>> Method::has_unloaded_classes_in_signature - in this case I think it 
>> is more accurate to hinge the determination on whether a class is 
>> unloaded or not on an actual call to SystemDictionary::find and not 
>> on whether it is in the symbol table.
>
> Sorry I'm still unclear whether as_symbol() can ever return NULL? If 
> it can't then the existing NULL checks should be removed. If it can 
> then many other uses may need to add NULL checks.

Signature::as_symbol can never return a NULL.  I have removed the 2 
checks for null in nmethod::print_nmethod_labels() and 
Method::has_unloaded_classes_in_signature().  See 
http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.4/webrev/index.html

Thanks,
Lois

>
> Thanks,
> David
>
>> Thanks,
>> Lois
>>
>>>
>>> ---
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/cpu/sparc/sharedRuntime_sparc.cpp.frames.html 
>>>>>
>>>>>
>>>>> 1910 ss.skip_array_prefix(1); // skip one '['
>>>>> 1911 if (ss.is_primitive())
>>>>> 1912 in_elem_bt[i] = ss.type();
>>>>> 1913 // else what is in_elem_bt[i]?
>>>>>
>>>>> This code is for the not-well kept secret JavaCritical, where the 
>>>>> native function can only take an array of primitives (checked 
>>>>> elsewhere that I can't find right now).  So I believe this code 
>>>>> can just assert ss.is_primitive() for all these cases.  Only code 
>>>>> on solaris sparc uses this feature, so you can run tests on sparc 
>>>>> to verify.
>>>>
>>>> Done. Changed all to:
>>>>
>>>> out_sig_bt[argc++] = T_INT; out_sig_bt[argc++] = T_ADDRESS; 
>>>> ss.skip_array_prefix(1); // skip one '[' assert(ss.is_primitive(), 
>>>> "primitive type expected"); in_elem_bt[i] = ss.type(); // else what 
>>>> is in_elem_bt[i]?
>>>>
>>>> Will run a sparc build & test.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/classfile/systemDictionary.cpp.frames.html 
>>>>>
>>>>>
>>>>> 2207 if (!Signature::is_array(class_name)) {
>>>>>
>>>>>
>>>>> Did you try adding SignatureStream before this, to avoid the 
>>>>> Symbol refcounting madness?
>>>>
>>>> Yes I did try that.  However, the issue with that approach is that 
>>>> you may have a non signature string in the case where it is not an 
>>>> array. This is the condition of the first part of the if statement 
>>>> on line #2207.  Here you can just simply use the Symbol* as the 
>>>> constraint name.  If you try to construct a SignatureStream with 
>>>> this non-signature Symbol* a crash occurs immediately. 
>>>> SignatureStream assumes the Symbol* is a field or method signature 
>>>> not just a plain name for example something like "java/lang/String" 
>>>> without a leading 'L' or a leading '('.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/jvmci/jvmciCompilerToVM.hpp.frames.html 
>>>>>
>>>>>
>>>>> hpp files shouldn't include an inline.hpp file.  Can this include 
>>>>> oop.hpp instead?
>>>>
>>>> I removed the inclusion of oop.inline.hpp and removed the following 
>>>> assert in JavaArgumentUnboxer::do_type().
>>>>
>>>> 161     assert(arg->klass() == SystemDictionary::box_klass(type), 
>>>> "already checked");
>>>>
>>>> It turns out that in the statement prior to this assert, a call is 
>>>> made to JavaArgumentUnboxer::next_arg(type) which already does the 
>>>> check for us.  Thus the "already checked" comment.  No need for a 
>>>> duplicate assert.
>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/oops/symbol.cpp.frames.html 
>>>>>
>>>>>
>>>>> Should SignatureStream arguments be const references?  Or should 
>>>>> they be written to?
>>>>
>>>> I was able to add a const to the SignatureStream& parameter for the 
>>>> static method print_class().  For print_array(), however, I was not 
>>>> able to since ss.skip_array_prefix() does write to 
>>>> SignatureStream::_type field the type after skipping the array 
>>>> brackets.
>>>>
>>>>>
>>>>> This looks really good!
>>>>
>>>> Thanks again!
>>>> Lois
>>>>
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 1/24/20 4:41 PM, Lois Foltan wrote:
>>>>>> Please review the following enhancement to consolidate signature 
>>>>>> parsing code in Hotspot sources.  This change removes duplicate 
>>>>>> blocks of code that parse field or method signatures, provides a 
>>>>>> new Signature class to support basic signature queries on Symbol 
>>>>>> operands and enhances the SignatureStream class to parse field 
>>>>>> signatures in addition to methods.
>>>>>>
>>>>>> open webrev 
>>>>>> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/ 
>>>>>> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/>
>>>>>> bug link: https://bugs.openjdk.java.net/browse/JDK-8230199
>>>>>> contributed-by: Lois Foltan, John Rose
>>>>>>
>>>>>> Testing: hs-tier1-8, jdk-tier1-2
>>>>>>
>>>>>> Thanks,
>>>>>> Lois
>>>>>
>>>>
>>



More information about the hotspot-runtime-dev mailing list