RFR (L) JDK-8230199: consolidate signature parsing code in HotSpot sources

David Holmes david.holmes at oracle.com
Tue Feb 4 04:07:57 UTC 2020


Hi Lois,

On 4/02/2020 7:39 am, Lois Foltan wrote:
> On 2/3/2020 12:18 AM, David Holmes wrote:
>> Hi Lois,
>>
>> Sorry for the delay in looking at this. This is a nice set of 
>> consolidations and simplifications. I few mostly minor 
>> comments/queries below.
> 
> Hi David,
> 
> No problem.  Thank you for reviewing!  Comments interspersed below. 

Responses also below.

> Updated webrev that addresses your concerns below is at: 
> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.3/webrev/index.html

Thanks. I didn't spot any further issues.

>>
>> On 29/01/2020 6:50 am, Lois Foltan wrote:
>>> On 1/27/2020 4:36 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Lois,  So many nice cleanups in this patch!
>>>
>>> Thanks for your review Coleen!  Responses interspersed below. New 
>>> webrev is at: 
>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.2/webrev/index.html
>>
>> In the sharedRuntime_<cpu> changes I'm curious about the change to the 
>> assertion:
>>
>> -        assert(in_sig_bt[i] == ss.type(), "must match");
>> +        assert(in_sig_bt[i] == ss.type() ||
>> +               in_sig_bt[i] == T_ARRAY, "must match");
>>
>> That would seem to be an existing bug - no? Given the code is:
>>
>> 2001       if (in_sig_bt[i] == T_ARRAY) {
>> ...
>> 2009       } else {
>> 2010         out_sig_bt[argc++] = in_sig_bt[i];
>> 2011         in_elem_bt[i] = T_VOID;
>> 2012       }
>> 2013       if (in_sig_bt[i] != T_VOID) {
>> 2014         assert(in_sig_bt[i] == ss.type() ||
>> 2015                in_sig_bt[i] == T_ARRAY, "must match");
>>
>> then we mustn't have any tests that actually exercise this otherwise 
>> we would fail the assert! Or does ss.type() now behave differently to 
>> before?
> 
> The call to ss.skip_array_prefix(1) ahead of the ss.is_primitive() 
> assert does cause ss.type() to return the underlying  type, not the 
> original T_ARRAY.  Thus the following assert at line #2014 in the code 
> sample above, in_sig_bt[i] would not equal ss.type() in the array case, 
> so that check had to be added.

Ah I see. Thanks for clarifying.

>>
>> 2008         // else what is in_elem_bt[i]?
>>
>> This comment/question doesn't seem to serve any purpose. If we think 
>> there couldn't possibly be anything other than an array of primitives 
>> then we should probably add some form of guarantee - just as the 
>> original code caught that case with ShouldNotReachHere.
> 
> I've removed the comment.  In a previous webrev I added the 
> ss.is_primitive() assert based on Coleen's feedback.
> 
>>
>> ---
>>
>> src/hotspot/share/classfile/systemDictionary.cpp
>>
>> Minor nit:
>>
>> 2181       Symbol* obj_class = ss.as_symbol();
>> 2182       klass = constraints()->find_constrained_klass(obj_class, 
>> class_loader);
>>
>> You introduced the local obj_class but it is only used once and so it 
>> not necessary. (In other places you removed use-once locals.)
> 
> Fixed.
> 
>>
>> ---
>>
>> src/hotspot/share/classfile/verificationType.cpp
>>
>> Observation: The switch in get_component seems to indicate a missing 
>> type2name style conversion helper.
>>
>>  134     case T_ARRAY:
>>  135     case T_OBJECT: {
>>  136       guarantee(ss.is_reference(), "unchecked verifier input?");
>>  137       Symbol* component = ss.as_symbol();
>>
>>
>> Is it that case as_symbol returns a class name without envelope, but 
>> an array type name with envelop? I'm trying to reconcile the old and 
>> new code where T_ARRAY and T_OBJECT were handled distinctly.
> 
> That is correct except for your use of terminology.  An array type does 
> not have an "envelope".  Only a class can have an envelope defined as a 
> leading 'L' and ending ';'.   For a class, SignatureStream::as_symbol() 
> returns a Symbol* without envelope (see calls to raw_symbol_begin and 
> raw_symbol_end at the start of SignatureStream::find_symbol()).  For an 
> array, SignatureStream::as_symbol() will leave the leading bracket, '['. 
> And this behavior matches the current code in verificationType.cpp which 
> calls create_temporary_symbol with a '1' begin position for an array but 
> a '2' begin position for a class.

Thanks for clarifying - the 1 and 2 offsets confused me in the original 
code because I forgot we were already dealing with an array whose 
component type may then be an array or a class. So given for example an 
initial type:

[[LFoo;  // Foo[][]

we produce [LFoo;

but given:

[LFoo;  // Foo[]

we produce Foo

In the new code I'd overlooked the effects of ss.skip_array_prefix(1).

>>
>> ---
>>
>> src/hotspot/share/oops/constMethod.cpp
>>
>> !   set_result_type((BasicType)0);
>>
>> This change seems a little odd. Why 0? What about T_ILLEGAL if not 
>> intended to be a real type?
> 
> At the point a ConstMethod is constructed, the result type of the method 
> has not yet been determined.  So just like setting the result type to 
> T_VOID, T_ILLEGAL would also not be correct since this could in the end 
> be a valid result type.  See the assert in ConstMethod::result_type(), 
> we want that to trigger if the result type is either never set or 
> incorrectly set.

Okay. Maybe we should add "T_UNINITIALIZED = 0" to the enum?

>>
>> ---
>>
>> src/hotspot/share/prims/methodHandles.cpp
>>
>>  549     case T_VOID:
>>  550     case T_INT: case T_LONG: case T_FLOAT: case T_DOUBLE:
>>
>> This is an inconsistent and unusual style. One case per line is the 
>> common norm.
> 
> Fixed.
> 
>>
>> ---
>>
>> src/hotspot/share/runtime/signature.hpp
>>
>> 57 // The primitive field types (like 'I') correspond 1-1 with type codes
>>  58 // (like T_INT) which part of the specfication of the 'newarray'
>>  59 // instruction (JVMS 6.5, section on newarray).
>>
>> There is a word missing - perhaps "which form part of ..." ?
>>
>>  73 // "envelope" syntax which starts with 'L' (or 'Q') and ends with 
>> ';'.
>>
>> Can we add 'Q' when/if we get Q-types please  :)
>>
>> 101   // determine if it (in fact) cannot be a class name.
>>
>> The use of "(in fact)" seems a little odd, and elsewhere is used 
>> without parentheses.
>>
>>  389     case T_BYTE: case T_SHORT: case T_INT:
>>  390     case T_BOOLEAN: case T_CHAR:
>>
>> Odd case style again.
>>
>>  397 #ifdef _LP64
>>  398       pass_double(); _jni_offset++; _offset += 2;
>>  399 #else
>>  400       pass_double(); _jni_offset += 2; _offset += 2;
>>  401 #endif
>>
>> How about:
>>   int jni_offset = LP_64_ONLY(1) NOT_LP64(2)
>>   pass_double(); _jni_offset += jni_offset; _offset += 2;
>>
>> Ditto for T_LONG case.
> 
> All above fixed in signature.hpp.
> 
> 
>>
>>  528   Symbol* as_symbol() {
>>  529     return find_symbol();
>>  530   }
>>
>> Is the new as_symbol equivalent to the old as_symbol_or_null() ? It 
>> doesn't appear to allow NULL return yet there are callsites that still 
>> check the result for NULL (ie nmethod.cpp, method.cpp, signature.cpp).
>>
>> Do we no longer need an allocation constrained version of as_symbol() ?
> 
> We no longer need an allocation constrained version.  In current code, 
> there are only 2 calls that use as_symbol_or_null.  In 
> nmethod::print_nmethod_labels() - in this case I would think you would 
> want the type of the parameter to be printed even if it wasn't in the 
> symbol table.  And in Method::has_unloaded_classes_in_signature - in 
> this case I think it is more accurate to hinge the determination on 
> whether a class is unloaded or not on an actual call to 
> SystemDictionary::find and not on whether it is in the symbol table.

Sorry I'm still unclear whether as_symbol() can ever return NULL? If it 
can't then the existing NULL checks should be removed. If it can then 
many other uses may need to add NULL checks.

Thanks,
David

> Thanks,
> Lois
> 
>>
>> ---
>>
>> Thanks,
>> David
>> -----
>>
>>
>>>>
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/cpu/sparc/sharedRuntime_sparc.cpp.frames.html 
>>>>
>>>>
>>>> 1910 ss.skip_array_prefix(1); // skip one '['
>>>> 1911 if (ss.is_primitive())
>>>> 1912 in_elem_bt[i] = ss.type();
>>>> 1913 // else what is in_elem_bt[i]?
>>>>
>>>> This code is for the not-well kept secret JavaCritical, where the 
>>>> native function can only take an array of primitives (checked 
>>>> elsewhere that I can't find right now).  So I believe this code can 
>>>> just assert ss.is_primitive() for all these cases.  Only code on 
>>>> solaris sparc uses this feature, so you can run tests on sparc to 
>>>> verify.
>>>
>>> Done. Changed all to:
>>>
>>> out_sig_bt[argc++] = T_INT; out_sig_bt[argc++] = T_ADDRESS; 
>>> ss.skip_array_prefix(1); // skip one '[' assert(ss.is_primitive(), 
>>> "primitive type expected"); in_elem_bt[i] = ss.type(); // else what 
>>> is in_elem_bt[i]?
>>>
>>> Will run a sparc build & test.
>>>
>>>>
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/classfile/systemDictionary.cpp.frames.html 
>>>>
>>>>
>>>> 2207 if (!Signature::is_array(class_name)) {
>>>>
>>>>
>>>> Did you try adding SignatureStream before this, to avoid the Symbol 
>>>> refcounting madness?
>>>
>>> Yes I did try that.  However, the issue with that approach is that 
>>> you may have a non signature string in the case where it is not an 
>>> array. This is the condition of the first part of the if statement on 
>>> line #2207.  Here you can just simply use the Symbol* as the 
>>> constraint name.  If you try to construct a SignatureStream with this 
>>> non-signature Symbol* a crash occurs immediately. SignatureStream 
>>> assumes the Symbol* is a field or method signature not just a plain 
>>> name for example something like "java/lang/String" without a leading 
>>> 'L' or a leading '('.
>>>
>>>>
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/jvmci/jvmciCompilerToVM.hpp.frames.html 
>>>>
>>>>
>>>> hpp files shouldn't include an inline.hpp file.  Can this include 
>>>> oop.hpp instead?
>>>
>>> I removed the inclusion of oop.inline.hpp and removed the following 
>>> assert in JavaArgumentUnboxer::do_type().
>>>
>>> 161     assert(arg->klass() == SystemDictionary::box_klass(type), 
>>> "already checked");
>>>
>>> It turns out that in the statement prior to this assert, a call is 
>>> made to JavaArgumentUnboxer::next_arg(type) which already does the 
>>> check for us.  Thus the "already checked" comment.  No need for a 
>>> duplicate assert.
>>>
>>>>
>>>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/src/hotspot/share/oops/symbol.cpp.frames.html 
>>>>
>>>>
>>>> Should SignatureStream arguments be const references?  Or should 
>>>> they be written to?
>>>
>>> I was able to add a const to the SignatureStream& parameter for the 
>>> static method print_class().  For print_array(), however, I was not 
>>> able to since ss.skip_array_prefix() does write to 
>>> SignatureStream::_type field the type after skipping the array brackets.
>>>
>>>>
>>>> This looks really good!
>>>
>>> Thanks again!
>>> Lois
>>>
>>>> Coleen
>>>>
>>>>
>>>> On 1/24/20 4:41 PM, Lois Foltan wrote:
>>>>> Please review the following enhancement to consolidate signature 
>>>>> parsing code in Hotspot sources.  This change removes duplicate 
>>>>> blocks of code that parse field or method signatures, provides a 
>>>>> new Signature class to support basic signature queries on Symbol 
>>>>> operands and enhances the SignatureStream class to parse field 
>>>>> signatures in addition to methods.
>>>>>
>>>>> open webrev 
>>>>> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/ 
>>>>> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8230199.0/webrev/>
>>>>> bug link: https://bugs.openjdk.java.net/browse/JDK-8230199
>>>>> contributed-by: Lois Foltan, John Rose
>>>>>
>>>>> Testing: hs-tier1-8, jdk-tier1-2
>>>>>
>>>>> Thanks,
>>>>> Lois
>>>>
>>>
> 


More information about the hotspot-runtime-dev mailing list