Avoid some GCC 10.X warnings in HotSpot

Ioi Lam ioi.lam at oracle.com
Thu Jun 18 05:51:54 UTC 2020



On 6/17/20 10:25 PM, David Holmes wrote:
> On 18/06/2020 3:04 pm, Ioi Lam wrote:
>>
>>
>> On 6/17/20 6:56 PM, David Holmes wrote:
>>> Hi Koichi,
>>>
>>> This is in relation to the hotspot part as these issues need to be 
>>> handled separately. I have filed:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8247818
>>>
>>> On 18/06/2020 8:46 am, Koichi Sakata wrote:
>>>> Hi all,
>>>>
>>>> I tried to build OpenJDK fastdebug with GCC 10.1 on Ubuntu 18.04, 
>>>> but I saw some compiler warnings as follows:
>>>>
>>>> In file included from 
>>>> /home/jyukutyo/code/jdk/src/hotspot/share/classfile/systemDictionary.hpp:31, 
>>>>
>>>>                   from 
>>>> /home/jyukutyo/code/jdk/src/hotspot/share/classfile/javaClasses.hpp:28, 
>>>>
>>>>                   from 
>>>> /home/jyukutyo/code/jdk/src/hotspot/share/precompiled/precompiled.hpp:35: 
>>>>
>>>> In member function 'void Symbol::byte_at_put(int, u1)',
>>>>      inlined from 'Symbol::Symbol(const u1*, int, int)' at 
>>>> /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.cpp:55:16:
>>>> /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.hpp:130:18: 
>>>> error: writing 1 byte into a region of size 0 
>>>> [-Werror=stringop-overflow=]
>>>>    130 |     _body[index] = value;
>>>>        |     ~~~~~~~~~~~~~^~~~~~~
>>>
>>> I'm not really clear on the warning here but this is an area where 
>>> we trick the compiler somewhat. The _body[] is declared with a size 
>>> of 2, but when we allocate Symbols we allocate sufficient memory for 
>>> _body to contain the entire symbol.
>>>
>>> That said I'm struggling to see how we allocate the additional space 
>>> needed for the _hash_and_refcount and _length fields ???
>>>
>>>> I can resolve them with the following patch. I believe it fixes 
>>>> those potential bugs, so I'd like to contribute it.
>>>> (Our company has signed OCA.)
>>>>
>>>> Thanks,
>>>> Koichi
>>>>
>>>> ===== PATCH =====
>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.cpp
>>>> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jun 16 03:16:41 2020 
>>>> +0000
>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Thu Jun 18 07:08:50 2020 
>>>> +0900
>>>> @@ -50,9 +50,10 @@
>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>     _hash_and_refcount = 
>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>     _length = length;
>>>> -  _body[0] = 0;  // in case length == 0
>>>> -  for (int i = 0; i < length; i++) {
>>>> -    byte_at_put(i, name[i]);
>>>> +  if (length == 0) {
>>>> +    _body[0] = 0;
>>>
>>> The check for length==0 introduces more overhead than just always 
>>> setting _body[0]=0, so there is no need to add it.
>>>
>>>> +  } else {
>>>> +    memcpy(_body, name, length);
>>>>     }
>>>>   }
>>>
>>> So you are replacing byte_at_put with a memcpy call. On the surface 
>>> that seems reasonable, but I have to wonder why we were using the 
>>> loop in the first place. It may just be historical or it may relate 
>>> to an alignment issue, or something else. Hopefully someone else 
>>> (e.g. Coleen :) ) can shed more light here.
>>>
>>
>> Actually I don't know why we are setting _body[0] to zero. The 
>> content of _body is not supposed to be 0-terminated so we don't need 
>> a zero in there.
>
> It may not be required but it is cleaner/safer to make it NUL. Though 
> arguably we should just be rejecting a length of zero to begin with.
>
zero-length symbols are valid. They are used for UTF8 strings in 
classfiles that represent the literal string "".

>> E.g., if the length is 1, _body[1] will have garbage anyway.
>
> If the length is 1 then _body[1] is "out of bounds", so I'm not sure 
> what your point is. If the actual length is one then _body[0] has the 
> true symbol content (non NUL-terminated) and _body[1] is junk we 
> shouldn't touch.
>

By the same token if length is 0, _body[0] is out of bounds, so we can 
also leave it as garbage. My point is, there's no advantage of making 
zero length a special case, where all other cases the string is not null 
terminated.

> But the definition and allocation of Symbols confuses me as I 
> mentioned because I can't see why we define _body with size 2 rather 
> than just 1. 
We define it as size 2 so the whole header is 8 bytes long. It doesn't 
really matter, but feels nicer,

   volatile uint32_t _hash_and_refcount;
   u2 _length;
   u1 _body[2];

... especially you see the same "2" in here:

   static int byte_size(int length) {
     // minimum number of natural words needed to hold these bits (no 
non-heap version)
     return (int)(sizeof(Symbol) + (length > 2 ? length - 2 : 0));
   }

> Nor can I see how the length we pass to placement new accounts for the 
> additional fields that a Symbol has. ??
>

The size of the Symbol, including additional space for the whole string, 
is calculated in here:

void* Symbol::operator new(size_t sz, int len, Arena* arena) throw() {
   int alloc_size = size(len)*wordSize; <<<<<<<<< HERE
   address res = (address)arena->Amalloc_4(alloc_size);
   return res;
}

Thanks
- Ioi

> Cheers,
> David
>
>> "java" hit Breakpoint 2, Symbol::Symbol (this=0x7fffa1621e38, 
>> name=0x7ffff6dceb9c "*", length=1, refcount=65535)
>> (step until the loop finishes)
>>
>> (gdb) p *this
>> $1 = {<MetaspaceObj> = {}, _hash_and_refcount = 2027159551, _length = 
>> 1, _body = <incomplete sequence \361>}
>> (gdb) p _body[0]
>> $2 = 42 '*'
>> (gdb) p _body[1]
>> $3 = 241 '\361'
>>
>> Thanks
>> - Ioi
>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.hpp
>>>> --- a/src/hotspot/share/oops/symbol.hpp    Tue Jun 16 03:16:41 2020 
>>>> +0000
>>>> +++ b/src/hotspot/share/oops/symbol.hpp    Thu Jun 18 07:08:50 2020 
>>>> +0900
>>>> @@ -125,11 +125,6 @@
>>>>       return (int)heap_word_size(byte_size(length));
>>>>     }
>>>>
>>>> -  void byte_at_put(int index, u1 value) {
>>>> -    assert(index >=0 && index < length(), "symbol index overflow");
>>>> -    _body[index] = value;
>>>> -  }
>>>> -
>>>>     Symbol(const u1* name, int length, int refcount);
>>>>     void* operator new(size_t size, int len) throw();
>>>>     void* operator new(size_t size, int len, Arena* arena) throw();
>>>
>>>
>>



More information about the hotspot-runtime-dev mailing list