RFR 8209138: Symbol constructor uses u1 as the element type of its name argument

Harold David Seigel harold.seigel at oracle.com
Wed Oct 3 12:35:05 UTC 2018


Thanks Coleen!

Harold


On 10/3/2018 8:38 AM, coleen.phillimore at oracle.com wrote:
>
> Harold, this is nicely done!  Thank you for perservering in this 
> cleanup and following through on the issues raised.  I agree not to 
> change more in the SA if it works.
> Thanks,
> Coleen
>
> On 10/2/18 4:51 PM, Harold David Seigel wrote:
>> Hi David,
>>
>> Thanks for looking at this again.
>>
>> Please see this latest webrev and also in-line comments below:
>>
>>    http://cr.openjdk.java.net/~hseigel/bug_8209138.4/webrev/
>>
>>
>> On 10/1/2018 9:45 PM, David Holmes wrote:
>>> Hi Harold,
>>>
>>> Thanks for persevering with this!
>> Thanks!
>>>
>>> On 2/10/2018 6:23 AM, Harold David Seigel wrote:
>>>> Hi Coleen,
>>>>
>>>> Please review this updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~hseigel/bug_8209138.3/webrev/index.html
>>>>
>>>> It uses u1 for Symbol instead of char.  I also added a "char 
>>>> char_at(int index)" function to class Symbol.  It's for callers 
>>>> that want a char from a Symbol instead of a u1.
>>>
>>> I was in two minds about this as we really are comparing the i'th 
>>> byte with a given char value. But I suppose if I read char_at(i) as 
>>> byte_as_char_at(i) then it's just a type-conversion operation. 
>>> Though I am curious what byte_at usages remain after this change?
>> There were a few uses of byte_at() in heapDumper.cpp, 
>> jvmciCompilerToVM.cpp, and jvmtiTagMap.cpp.  I changed these calls to 
>> char_at() because the function results were being treated as chars.  
>> Since there were no more uses of Symbol::byte_at() and 
>> ciSymbol::byte_at(), I removed them.
>>>
>>> src/hotspot/share/ci/ciSymbol.hpp
>>>
>>> Can you humour me and please change this comment:
>>>
>>>  // Return the i-th char, where i < utf8_length
>>>
>>> to:
>>>
>>>  // Return the i-th utf byte as a char, where i < utf8_length
>> Done
>>>
>>> ---
>>>
>>> src/hotspot/share/oops/symbol.hpp
>>>
>>>    char char_at(int index) const {
>>>      assert(index >=0 && index < length(), "symbol index overflow");
>>>      return base()[index];
>>>    }
>>>
>>> I'm surprised this doesn't need a (char) cast to keep the compiler 
>>> happy. Some compilers may need it I think and it doesn't hurt to add 
>>> it.
>> I added the cast.
>>>
>>> ---
>>>
>>> src/hotspot/share/runtime/vmStructs.cpp
>>>
>>> There's a comment in:
>>>
>>> ./jdk.hotspot.agent/share/classes/sun/jvm/hotspot/types/Field.java
>>>
>>> that needs updating:
>>>
>>>  FIXME: among other things, this interface is not sufficient to
>>>  describe fields which are themselves arrays (like Symbol's
>>>  jbyte _body[1]).
>>>
>>> s/jbyte/u1/
>> Done
>>>
>>> Also:
>>>
>>> ./jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Symbol.java
>>>
>>> accesses the array as jbytes - that may need adjusting.
>> The serviceability agent builds okay and passes all its tests. So, 
>> I'd prefer not to open this potential can of worms.
>>>
>>> ----
>>>
>>> Otherwise all seems fine.
>> Thanks!
>> Harold
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>> Thanks! Harold
>>>>
>>>>
>>>> On 9/27/2018 10:03 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Hi Harold,  I think we've decided to make u1 the type we carry 
>>>>> around in Symbol instead of char, since that's more accurate to 
>>>>> represent utf8.  Since the goal is to make the types consistent, 
>>>>> can you try to change it all to u1 instead. Keeping the name 
>>>>> get_byte() and see how bad the casting situation is?
>>>>> thank you!
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 9/27/18 9:17 AM, Harold David Seigel wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Please review this updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8209138.2/webrev/index.html
>>>>>>
>>>>>> It contains additional changes to use char for class Symbol.
>>>>>>
>>>>>> Hopefully, your concerns w.r.t. signed and unsigned types were 
>>>>>> resolved by our offline discussions.
>>>>>>
>>>>>> Thanks, Harold
>>>>>>
>>>>>>
>>>>>> On 8/21/2018 2:24 AM, David Holmes wrote:
>>>>>>> On 21/08/2018 3:19 PM, Kim Barrett wrote:
>>>>>>>>> On Aug 20, 2018, at 8:53 PM, David Holmes 
>>>>>>>>> <david.holmes at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Harold,
>>>>>>>>>
>>>>>>>>> On 21/08/2018 12:43 AM, Harold David Seigel wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> Please review this change for bug JDK-8209138. The fix 
>>>>>>>>>> changes class Symbol in symbol.hpp to use type char instead 
>>>>>>>>>> of types u1 and jbyte and renames relevant functions by 
>>>>>>>>>> replacing 'byte' with 'char'. For example, 
>>>>>>>>>> 'Symbol::byte_at_put()' is now 'Symbol::char_at_put()'.
>>>>>>>>>
>>>>>>>>> It's not clear to me exactly what all these things should be. 
>>>>>>>>> In a lot of places we seem to be dealing with UTF8 character 
>>>>>>>>> representations, which to me should be u1 (afterall that is 
>>>>>>>>> how they are specified in the classfile format!) which is just 
>>>>>>>>> an unsigned "byte". But char is signed (in our build).
>>>>>>>>
>>>>>>>> Not sure what’s meant by “in our build” but char is unsigned 
>>>>>>>> when building HotSpot for aarch64 / arm. See, for example, 
>>>>>>>> JDK-8209586.
>>>>>>>
>>>>>>> Ah! We have always set -fsigned-char for the Oracle ARM and 
>>>>>>> ARM64 builds (and previously PPC32), but now I see that, as you 
>>>>>>> say, Aarch64 does not set this (nor PPC64). Interesting and 
>>>>>>> unfortunate. I'm somewhat surprised that we haven't encountered 
>>>>>>> more issues due to this.
>>>>>>>
>>>>>>> In any case we are still converting from an unsigned type to a 
>>>>>>> signed type in many cases with this change.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>>
>>>>>>>>> The current change just seems to push out the boundary where 
>>>>>>>>> we convert between things. For example ciSymbol now has a char 
>>>>>>>>> base()** but still has a byte_at() method - should that not 
>>>>>>>>> know be char_at() for consistency? Especially given byte_at() 
>>>>>>>>> returns get_symbol()->char-at()
>>>>>>>>>
>>>>>>>>> ** Granted the existing jbyte seems the wrong choice too.
>>>>>>>>>
>>>>>>>>> I think we have a lot of type confusion in our code, but it 
>>>>>>>>> isn't clear to me that this particular change is necessarily 
>>>>>>>>> changing things in the right direction.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Open Webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8209138/webrev/index.html 
>>>>>>>>>>
>>>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8209138
>>>>>>>>>> The change was tested by running Mach5 tiers 1 and 2 tests 
>>>>>>>>>> and builds on Linux-x64, Windows, and Mac OS X, running tiers 
>>>>>>>>>> 3-5 tests on Linux-x64, and by running JCK-11 API, Lang and 
>>>>>>>>>> VM tests on Linux-x64.
>>>>>>>>>> Thanks, Harold
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>



More information about the hotspot-runtime-dev mailing list