RFR(XXS) 8249087 Symbol constructor unnecessarily initializes _body[0]

Ioi Lam ioi.lam at oracle.com
Tue Jul 21 03:26:51 UTC 2020



On 7/20/20 7:57 PM, David Holmes wrote:
> Hi Ioi,
>
> On 21/07/2020 12:50 pm, Ioi Lam wrote:
>>
>>
>> On 7/20/20 7:36 PM, David Holmes wrote:
>>> Hi Ioi,
>>>
>>> On 21/07/2020 10:12 am, Ioi Lam wrote:
>>>> Hi please review this very simple fix:
>>>>
>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 
>>>> -0700
>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 
>>>> -0700
>>>> @@ -51,8 +51,11 @@
>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>     _hash_and_refcount = 
>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>     _length = length;
>>>> -  _body[0] = 0;  // in case length == 0
>>>>     memcpy(_body, name, length);
>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>>>> uninitialized and may
>>>
>>> Can we ever have a Symbol of length zero?
>>>
>>> If the Symbol name is length 1 then surely _body[0] is initialized 
>>> to the single character of that name?
>>>
>>> The change seems harmless given a zero length symbol is meaningless, 
>>> but the commentary just confuses things to me.
>>>
>>
>> Hi David,
>>
>> We can have a valid Symbol of length 0. All UTF8 constants in 
>> classfiles, including the literal string "", are represented as Symbols.
>
> Ah I see.
>> How about
>>
>> // Random, uninitialized values may appear in _body[0] and _body[1]
>> // for Symbols of length 0 and 1. These random values never change 
>> during
>> // the lifetime of the Symbol, and are read only by 
>> Symbol::identity_hash(),
>
> What if as_quoted_ascii() were called on the zero-length symbol? That 
> reads _body[0] unconditionally. The same for use of base(). ??

Hi David,

as_quoted_ascii() and base() do not read the contents of _body[0]. They 
take the address of _body[0]:

char* Symbol::as_quoted_ascii() const {
   const char *ptr = (const char *)&_body[0];
   int quoted_length = UTF8::quoted_ascii_length(ptr, utf8_length());
   char* result = NEW_RESOURCE_ARRAY(char, quoted_length + 1);
   UTF8::as_quoted_ascii(ptr, utf8_length(), result, quoted_length + 1);
   return result;
}

   const u1* base() const { return &_body[0]; }

Thanks
- Ioi



>
> Thanks,
> David
> -----
>
>> // which would tolerate such randomness.
>>
>>
>> Thanks
>> - Ioi
>>> Thanks,
>>> David
>>> -----
>>>
>>>> +  // contain random values, which will only be read by 
>>>> Symbol::identity_hash(),
>>>> +  // which would tolerate such randomness. These values never 
>>>> change during the lifetime
>>>> +  // of the Symbol.
>>>>   }
>>>>
>>>>
>>>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>>>
>>>> Thanks
>>>> - Ioi
>>



More information about the hotspot-runtime-dev mailing list