RFR(XXS) 8249087 Symbol constructor unnecessarily initializes _body[0]

Lois Foltan lois.foltan at oracle.com
Wed Jul 22 18:54:52 UTC 2020


On 7/22/2020 1:42 PM, Ioi Lam wrote:
>
>
> On 7/22/20 10:12 AM, Lois Foltan wrote:
>> On 7/22/2020 12:17 AM, David Holmes wrote:
>>> Hi Lois,
>>>
>>> On 22/07/2020 1:06 am, Lois Foltan wrote:
>>>> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>>>>> * Ioi Lam:
>>>>>>
>>>>>>> Hi please review this very simple fix:
>>>>>>>
>>>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 
>>>>>>> 2020 -0700
>>>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 
>>>>>>> 2020 -0700
>>>>>>> @@ -51,8 +51,11 @@
>>>>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>>>     _hash_and_refcount = 
>>>>>>> pack_hash_and_refcount((short)os::random(),
>>>>>>> refcount);
>>>>>>>     _length = length;
>>>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>>>     memcpy(_body, name, length);
>>>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are
>>>>>>> uninitialized and may
>>>>>>> +  // contain random values, which will only be read by
>>>>>>> Symbol::identity_hash(),
>>>>>>> +  // which would tolerate such randomness. These values never 
>>>>>>> change
>>>>>>> during the lifetime
>>>>>>> +  // of the Symbol.
>>>>>>>   }
>>>>>> Won't this still trip memory debuggers? Symbol::identity_hash() 
>>>>>> implies
>>>>>> that the result is eventually used in a conditional operation (a 
>>>>>> hash
>>>>>> comparison perhaps).  If it's possible one day to run Hotspot under
>>>>>> valgrind, this would result in false positives.
>>>>>
>>>>> Are you saying that valgrind will modify uninitialized memory 
>>>>> periodically after the constructor has returned, and thus will 
>>>>> cause Symbol::identity_hash() to return a different value?
>>>>>
>>>>> Without my patch, _body[1] is uninitialized for Symbols whose 
>>>>> length is 0 or 1. We have not heard of any issues related to 
>>>>> valgrind and Symbol::identity_hash().
>>>>>
>>>>> In fact, looking at the code history, the setting of "_body[0] = 
>>>>> 0" in Symbol::Symbol was introduced only recently (Feb 2020):
>>>>>
>>>>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>>>>
>>>>>
>>>>> I'll check with Lois who added the code to see the reason for 
>>>>> doing it.
>>>>
>>>> Hi Ioi,
>>>>
>>>> Reviewing this JBS issue, I have concerns over leaving both 
>>>> _body[0] and now even _body[1] uninitialized.  The signature 
>>>> processing frequently checks the first character of a Symbol via 
>>>> Symbol::char_at(0) to determine what type it is dealing with.  Is 
>>>> there a danger that the uninitialized memory actually has a valid 
>>>> type indicator in it like an '[' character for example?  The 
>>>> signature processing could potentially make wrong assumptions about 
>>>> the type it is trying to process.
>>>
>>> Aren't all the signature related symbols already guaranteed to not 
>>> have zero-length, or else the length is being pre-tested for zero?
>>
>> Hi David,
>>
>> I believe you are correct in that signature related symbols are 
>> already guaranteed to not have zero-length.  I tried several 
>> different jasm files to introduce an empty string either for a class 
>> name or for a type of a field reference and did not get past class 
>> file parsing without a ClassFormatError.
>>
>> Error: LinkageError occurred while loading main class HelloEmptyString
>>         java.lang.ClassFormatError: Class name is empty or contains 
>> illegal character in descriptor in class file HelloEmptyString
>>
>>
>> However, after reading the original issue in JDK-8249087, the 
>> objection wasn't that there was a needless initialization of _body[0] 
>> but instead it was around the inconsistency of initializing _body[0] 
>> and not _body[1].  I think it should be the responsibility of the 
>> Symbol class API to ensure initialization so that as a consumer I 
>> don't have to worry about _body[0] or _body[1]'s validity.  So I 
>> prefer the change to actually add the initialization of _body[1].  
>> Something like, "_body[0] = _body[1] = 0".
>>
> Hi Lois,
>
> The fact that _body[0..1] is in the Symbol header is just by 
> coincidence -- we need only 6 bytes of meta-info about the Symbol, so 
> we have 2 bytes left over. We use these two left-over bytes for 
> Symbol::identity_hash(). However, no one else should unconditionally 
> read these 2 bytes --  if we change the Symbol header in the future, 
> these two bytes may not be allocated anymore
>
> It looks like leaving _body[0..1] uninitialized is confusing, and will 
> possibly lead to problems with valgrind as pointed out by Florian. How 
> about this:
>
> Symbol::Symbol(const u1* name, int length, int refcount) {
>   _hash_and_refcount = pack_hash_and_refcount((short)os::random(), 
> refcount);
>   _length = length;
>   // _body[0..1] are allocated in the header just by coincidence in 
> the current
>   // implementation of Symbol. They are read by identity_hash(), so 
> make sure they
>   // are initialized.
>   // No other code should assume that _body[0..1] are always 
> allocated. E.g., do
>   // not unconditionally read base()[0] as that will be invalid for an 
> empty Symbol.
>   _body[0] = _body[1] = 0;
>   memcpy(_body, name, length);
> }
>
> I'll also change the bug title to "Always initialize _body[0..1] in 
> Symbol constructor"

This looks good to me.  Thanks Ioi for changing it.

>
> ----
>
> As I discussed with Lois off-line:
>
> There's Signature code that unconditionally reads _body[0], which 
> would assert (but class loading checks for invalid signatures that 
> prevents this from happening)
>
> BasicType Signature::basic_type(const Symbol* signature) {
>   return basic_type(signature->char_at(0));
> }
>
> char Symbol::char_at(int index) const {
>   assert(index >=0 && index < length(), "symbol index overflow");
>   return (char)base()[index];
> }
>
> Signature::basic_type() should be fixed to either check for length, 
> and/or assert that signature is a valid signature.

I just created an RFE for this, 
https://bugs.openjdk.java.net/browse/JDK-8249931

Thanks,
Lois

>
> Thanks
> - Ioi
>
>
>
>



More information about the hotspot-runtime-dev mailing list