Avoid some GCC 10.X warnings in HotSpot

David Holmes david.holmes at oracle.com
Thu Jun 18 08:38:17 UTC 2020


On 18/06/2020 3:51 pm, Ioi Lam wrote:
> On 6/17/20 10:25 PM, David Holmes wrote:
>> On 18/06/2020 3:04 pm, Ioi Lam wrote:
>>>
>>>
>>> On 6/17/20 6:56 PM, David Holmes wrote:
>>>> Hi Koichi,
>>>>
>>>> This is in relation to the hotspot part as these issues need to be 
>>>> handled separately. I have filed:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8247818
>>>>
>>>> On 18/06/2020 8:46 am, Koichi Sakata wrote:
>>>>> Hi all,
>>>>>
>>>>> I tried to build OpenJDK fastdebug with GCC 10.1 on Ubuntu 18.04, 
>>>>> but I saw some compiler warnings as follows:
>>>>>
>>>>> In file included from 
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/classfile/systemDictionary.hpp:31, 
>>>>>
>>>>>                   from 
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/classfile/javaClasses.hpp:28, 
>>>>>
>>>>>                   from 
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/precompiled/precompiled.hpp:35: 
>>>>>
>>>>> In member function 'void Symbol::byte_at_put(int, u1)',
>>>>>      inlined from 'Symbol::Symbol(const u1*, int, int)' at 
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.cpp:55:16:
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.hpp:130:18: 
>>>>> error: writing 1 byte into a region of size 0 
>>>>> [-Werror=stringop-overflow=]
>>>>>    130 |     _body[index] = value;
>>>>>        |     ~~~~~~~~~~~~~^~~~~~~
>>>>
>>>> I'm not really clear on the warning here but this is an area where 
>>>> we trick the compiler somewhat. The _body[] is declared with a size 
>>>> of 2, but when we allocate Symbols we allocate sufficient memory for 
>>>> _body to contain the entire symbol.
>>>>
>>>> That said I'm struggling to see how we allocate the additional space 
>>>> needed for the _hash_and_refcount and _length fields ???
>>>>
>>>>> I can resolve them with the following patch. I believe it fixes 
>>>>> those potential bugs, so I'd like to contribute it.
>>>>> (Our company has signed OCA.)
>>>>>
>>>>> Thanks,
>>>>> Koichi
>>>>>
>>>>> ===== PATCH =====
>>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.cpp
>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jun 16 03:16:41 2020 
>>>>> +0000
>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Thu Jun 18 07:08:50 2020 
>>>>> +0900
>>>>> @@ -50,9 +50,10 @@
>>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>     _hash_and_refcount = 
>>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>>     _length = length;
>>>>> -  _body[0] = 0;  // in case length == 0
>>>>> -  for (int i = 0; i < length; i++) {
>>>>> -    byte_at_put(i, name[i]);
>>>>> +  if (length == 0) {
>>>>> +    _body[0] = 0;
>>>>
>>>> The check for length==0 introduces more overhead than just always 
>>>> setting _body[0]=0, so there is no need to add it.
>>>>
>>>>> +  } else {
>>>>> +    memcpy(_body, name, length);
>>>>>     }
>>>>>   }
>>>>
>>>> So you are replacing byte_at_put with a memcpy call. On the surface 
>>>> that seems reasonable, but I have to wonder why we were using the 
>>>> loop in the first place. It may just be historical or it may relate 
>>>> to an alignment issue, or something else. Hopefully someone else 
>>>> (e.g. Coleen :) ) can shed more light here.
>>>>
>>>
>>> Actually I don't know why we are setting _body[0] to zero. The 
>>> content of _body is not supposed to be 0-terminated so we don't need 
>>> a zero in there.
>>
>> It may not be required but it is cleaner/safer to make it NUL. Though 
>> arguably we should just be rejecting a length of zero to begin with.
>>
> zero-length symbols are valid. They are used for UTF8 strings in 
> classfiles that represent the literal string "".
> 
>>> E.g., if the length is 1, _body[1] will have garbage anyway.
>>
>> If the length is 1 then _body[1] is "out of bounds", so I'm not sure 
>> what your point is. If the actual length is one then _body[0] has the 
>> true symbol content (non NUL-terminated) and _body[1] is junk we 
>> shouldn't touch.
>>
> 
> By the same token if length is 0, _body[0] is out of bounds, so we can 
> also leave it as garbage. My point is, there's no advantage of making 
> zero length a special case, where all other cases the string is not null 
> terminated.

Okay.

>> But the definition and allocation of Symbols confuses me as I 
>> mentioned because I can't see why we define _body with size 2 rather 
>> than just 1. 
> We define it as size 2 so the whole header is 8 bytes long. It doesn't 
> really matter, but feels nicer,
> 
>    volatile uint32_t _hash_and_refcount;
>    u2 _length;
>    u1 _body[2];
> 
> ... especially you see the same "2" in here:
> 
>    static int byte_size(int length) {
>      // minimum number of natural words needed to hold these bits (no 
> non-heap version)
>      return (int)(sizeof(Symbol) + (length > 2 ? length - 2 : 0));
>    }
> 
>> Nor can I see how the length we pass to placement new accounts for the 
>> additional fields that a Symbol has. ??
>>
> 
> The size of the Symbol, including additional space for the whole string, 
> is calculated in here:
> 
> void* Symbol::operator new(size_t sz, int len, Arena* arena) throw() {
>    int alloc_size = size(len)*wordSize; <<<<<<<<< HERE

Ah! I misread that as a cast.

Thanks,
David

>    address res = (address)arena->Amalloc_4(alloc_size);
>    return res;
> }
> 
> Thanks
> - Ioi
> 
>> Cheers,
>> David
>>
>>> "java" hit Breakpoint 2, Symbol::Symbol (this=0x7fffa1621e38, 
>>> name=0x7ffff6dceb9c "*", length=1, refcount=65535)
>>> (step until the loop finishes)
>>>
>>> (gdb) p *this
>>> $1 = {<MetaspaceObj> = {}, _hash_and_refcount = 2027159551, _length = 
>>> 1, _body = <incomplete sequence \361>}
>>> (gdb) p _body[0]
>>> $2 = 42 '*'
>>> (gdb) p _body[1]
>>> $3 = 241 '\361'
>>>
>>> Thanks
>>> - Ioi
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.hpp
>>>>> --- a/src/hotspot/share/oops/symbol.hpp    Tue Jun 16 03:16:41 2020 
>>>>> +0000
>>>>> +++ b/src/hotspot/share/oops/symbol.hpp    Thu Jun 18 07:08:50 2020 
>>>>> +0900
>>>>> @@ -125,11 +125,6 @@
>>>>>       return (int)heap_word_size(byte_size(length));
>>>>>     }
>>>>>
>>>>> -  void byte_at_put(int index, u1 value) {
>>>>> -    assert(index >=0 && index < length(), "symbol index overflow");
>>>>> -    _body[index] = value;
>>>>> -  }
>>>>> -
>>>>>     Symbol(const u1* name, int length, int refcount);
>>>>>     void* operator new(size_t size, int len) throw();
>>>>>     void* operator new(size_t size, int len, Arena* arena) throw();
>>>>
>>>>
>>>
> 


More information about the hotspot-runtime-dev mailing list