RFR 8209138: Symbol constructor uses u1 as the element type of its name argument

Harold David Seigel harold.seigel at oracle.com
Mon Oct 1 20:23:55 UTC 2018


Hi Coleen,

Please review this updated webrev:

    http://cr.openjdk.java.net/~hseigel/bug_8209138.3/webrev/index.html

It uses u1 for Symbol instead of char.  I also added a "char char_at(int 
index)" function to class Symbol.  It's for callers that want a char 
from a Symbol instead of a u1.

Thanks! Harold


On 9/27/2018 10:03 AM, coleen.phillimore at oracle.com wrote:
>
> Hi Harold,  I think we've decided to make u1 the type we carry around 
> in Symbol instead of char, since that's more accurate to represent 
> utf8.  Since the goal is to make the types consistent, can you try to 
> change it all to u1 instead.  Keeping the name get_byte() and see how 
> bad the casting situation is?
> thank you!
> Coleen
>
>
> On 9/27/18 9:17 AM, Harold David Seigel wrote:
>> Hi David,
>>
>> Please review this updated webrev:
>>
>> http://cr.openjdk.java.net/~hseigel/bug_8209138.2/webrev/index.html
>>
>> It contains additional changes to use char for class Symbol.
>>
>> Hopefully, your concerns w.r.t. signed and unsigned types were 
>> resolved by our offline discussions.
>>
>> Thanks, Harold
>>
>>
>> On 8/21/2018 2:24 AM, David Holmes wrote:
>>> On 21/08/2018 3:19 PM, Kim Barrett wrote:
>>>>> On Aug 20, 2018, at 8:53 PM, David Holmes 
>>>>> <david.holmes at oracle.com> wrote:
>>>>>
>>>>> Hi Harold,
>>>>>
>>>>> On 21/08/2018 12:43 AM, Harold David Seigel wrote:
>>>>>> Hi,
>>>>>> Please review this change for bug JDK-8209138.  The fix changes 
>>>>>> class Symbol in symbol.hpp to use type char instead of types u1 
>>>>>> and jbyte and renames relevant functions by replacing 'byte' with 
>>>>>> 'char'.  For example, 'Symbol::byte_at_put()' is now 
>>>>>> 'Symbol::char_at_put()'.
>>>>>
>>>>> It's not clear to me exactly what all these things should be. In a 
>>>>> lot of places we seem to be dealing with UTF8 character 
>>>>> representations, which to me should be u1 (afterall that is how 
>>>>> they are specified in the classfile format!) which is just an 
>>>>> unsigned "byte". But char is signed (in our build).
>>>>
>>>> Not sure what’s meant by “in our build” but char is unsigned when 
>>>> building HotSpot for aarch64 / arm.  See, for example, JDK-8209586.
>>>
>>> Ah! We have always set -fsigned-char for the Oracle ARM and ARM64 
>>> builds (and previously PPC32), but now I see that, as you say, 
>>> Aarch64 does not set this (nor PPC64). Interesting and unfortunate. 
>>> I'm somewhat surprised that we haven't encountered more issues due 
>>> to this.
>>>
>>> In any case we are still converting from an unsigned type to a 
>>> signed type in many cases with this change.
>>>
>>> David
>>> -----
>>>
>>>>
>>>>> The current change just seems to push out the boundary where we 
>>>>> convert between things. For example ciSymbol now has a char 
>>>>> base()** but still has a byte_at() method - should that not know 
>>>>> be char_at() for consistency? Especially given byte_at() returns 
>>>>> get_symbol()->char-at()
>>>>>
>>>>> ** Granted the existing jbyte seems the wrong choice too.
>>>>>
>>>>> I think we have a lot of type confusion in our code, but it isn't 
>>>>> clear to me that this particular change is necessarily changing 
>>>>> things in the right direction.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>
>>>>>> Open Webrev: 
>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8209138/webrev/index.html
>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8209138
>>>>>> The change was tested by running Mach5 tiers 1 and 2 tests and 
>>>>>> builds on Linux-x64, Windows, and Mac OS X, running tiers 3-5 
>>>>>> tests on Linux-x64, and by running JCK-11 API, Lang and VM tests 
>>>>>> on Linux-x64.
>>>>>> Thanks, Harold
>>>>
>>>>
>>
>



More information about the hotspot-runtime-dev mailing list