RFR: 8263718: unused-result warning happens at os_linux.cpp

David Holmes david.holmes at oracle.com
Fri Mar 19 22:24:08 UTC 2021


On 19/03/2021 7:06 pm, Yasumasa Suenaga wrote:
> On Fri, 19 Mar 2021 08:47:39 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:
> 
>>> I checked the `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`.
>>> (This is a fastdebug build. A slowdebug build might differ, but a product build is probably the same.)
>>>
>>> 659       thread->record_stack_base_and_size();
>>>     0x00007ffff7154d44 <+20>:    call   0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv>
>>>
>>> 660
>>> 661       // Try to randomize the cache line index of hot stack frames.
>>> 662       // This helps when threads of the same stack traces evict each other's
>>> 663       // cache lines. The threads can be either from the same JVM instance, or
>>> 664       // from different JVM instances. The benefit is especially true for
>>> 665       // processors with hyperthreading technology.
>>> 666       static int counter = 0;
>>>
>>> 667       int pid = os::current_process_id();
>>>
>>> 668       alloca(((pid ^ counter++) & 7) * 128);
>>>     0x00007ffff7154d51 <+33>:    addl   $0x1,0xc1db10(%rip)        # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>>>
>>> 669
>>> 670       thread->initialize_thread_current();
>>> => 0x00007ffff7154d4e <+30>:    mov    %r13,%rdi
>>>     0x00007ffff7154d58 <+40>:    call   0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv>
>>>
>>>> I prefer it rather than (void)!.
>>>>
>>>> Does that work in release builds too?
>>>
>>> It will not work, as David said :) We need to use `(void)!` if we are to leave the call in.
>>
>>> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128);
>>> + if (_stack_pad != 0) {
>>> + ((char*)_stack_pad)[0] = 1;
>>> + }
>>
>> I guess `_stack_pad` will be overwritten in each `thread_native_entry()` call, so it might be elided.
>> I modified the code as follows, and it seems to work: we cannot see a call to `alloca()`, but the stack is expanded.
>>
>> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
>> index 5af63befb58..bdb2dc89615 100644
>> --- a/src/hotspot/os/linux/os_linux.cpp
>> +++ b/src/hotspot/os/linux/os_linux.cpp
>> @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) {
>>     // processors with hyperthreading technology.
>>     static int counter = 0;
>>     int pid = os::current_process_id();
>> -  alloca(((pid ^ counter++) & 7) * 128);
>> +  void *ptr = alloca(((pid ^ counter++) & 7) * 128);
>> +  ((char *)ptr)[0] = 1;
>>
>>     thread->initialize_thread_current();
>>
>> 659       thread->record_stack_base_and_size();
>>     0x00007ffff7154d53 <+35>:    call   0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv>
>>
>> 660
>> 661       // Try to randomize the cache line index of hot stack frames.
>> 662       // This helps when threads of the same stack traces evict each other's
>> 663       // cache lines. The threads can be either from the same JVM instance, or
>> 664       // from different JVM instances. The benefit is especially true for
>> 665       // processors with hyperthreading technology.
>> 666       static int counter = 0;
>>
>> 667       int pid = os::current_process_id();
>>
>> 668       void *ptr = alloca(((pid ^ counter++) & 7) * 128);
>>     0x00007ffff7154d63 <+51>:    mov    0xc1daff(%rip),%eax        # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>>     0x00007ffff7154d69 <+57>:    lea    0x1(%rax),%edx
>>     0x00007ffff7154d6c <+60>:    xor    %r8d,%eax
>>     0x00007ffff7154d6f <+63>:    shl    $0x7,%rax
>>     0x00007ffff7154d73 <+67>:    mov    %edx,0xc1daef(%rip)        # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>>     0x00007ffff7154d79 <+73>:    and    $0x380,%eax
>>     0x00007ffff7154d7e <+78>:    add    $0x17,%rax
>>     0x00007ffff7154d82 <+82>:    and    $0x7f0,%eax
>>     0x00007ffff7154d87 <+87>:    sub    %rax,%rsp
>>     0x00007ffff7154d8a <+90>:    lea    0xf(%rsp),%rax
>>     0x00007ffff7154d8f <+95>:    and    $0xfffffffffffffff0,%rax
>>
>> 669       ((char *)ptr)[0] = 1;
>>     0x00007ffff7154d93 <+99>:    movb   $0x1,(%rax)
> 
>>> I modified the code as follows, and it seems to work: we cannot see a call to `alloca()`, but the stack is expanded.
>>
>> Sorry but I'm not seeing where the stack actually gets expanded?
> 
> 0x00007ffff7154d87 <+87>:    sub    %rax,%rsp

Doh! Thanks.

I'm re-running some benchmarks on Linux.

It would be good to confirm that the alloca is also being elided with 
clang and VS.
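
A minimal standalone sketch for that kind of check, assuming a Linux toolchain with `<alloca.h>`. The names `pad_size` and `padded_block_address` are hypothetical stand-ins and not HotSpot code, and the zero-size guard is an addition for this sketch only. Compiling it with each compiler and inspecting the disassembly of `padded_block_address` should show whether the stack adjustment survives.

```cpp
#include <alloca.h>   // glibc/musl; MSVC spells this _alloca in <malloc.h>
#include <cstddef>
#include <cstdint>

// Same expression as in the patch: one of 0, 128, ..., 896 bytes.
static std::size_t pad_size(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical stand-in for thread_native_entry(): pads the frame, touches
// the block so the allocation has an observable use, and reports its address.
// noinline (GCC/clang syntax) keeps every call in its own frame.
__attribute__((noinline))
static std::uintptr_t padded_block_address(int pid, int counter) {
  std::size_t size = pad_size(pid, counter);
  volatile char* ptr = static_cast<volatile char*>(alloca(size));
  if (size != 0) {  // guard added for this sketch: never write into a zero-byte block
    ptr[0] = 1;
  }
  return reinterpret_cast<std::uintptr_t>(ptr);
}
```

On a target where the stack grows downwards, calling `padded_block_address(1, 0)` and then `padded_block_address(7, 0)` from the same function should return a lower address for the second call, since it requests 896 bytes rather than 128.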

David
-----

> I guess `%rax` holds the result of `((pid ^ counter++) & 7) * 128`, and the `alloca()` call is then replaced with a `sub` on `%rsp`.
> The warning for this issue reported the function as `void* __builtin_alloca(long unsigned int)`, so that is probably why: to allocate a buffer on the stack, a builtin only has to adjust `%rsp`.
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/3042
> 
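
The arithmetic in the quoted instruction sequence can be traced with a small sketch. It assumes that `%r8d` holds the pid at that point, which the listing does not show, and the function names are hypothetical. Under that assumption, the value subtracted from `%rsp` is the requested padding plus 16 bytes, kept to a multiple of 16.

```cpp
#include <cstdint>

// Padding requested from alloca(): one of 0, 128, ..., 896 bytes.
constexpr std::uint32_t requested_pad(std::uint32_t pid, std::uint32_t counter) {
  return ((pid ^ counter) & 7u) * 128u;
}

// Mirrors the quoted instructions step by step (assumes %r8d == pid).
constexpr std::uint32_t rsp_adjustment(std::uint32_t pid, std::uint32_t counter) {
  std::uint32_t v = (pid ^ counter) << 7;  // xor %r8d,%eax ; shl $0x7,%rax
  v &= 0x380u;                             // and $0x380,%eax  (same as (x & 7) * 128)
  v += 0x17u;                              // add $0x17,%rax
  v &= 0x7f0u;                             // and $0x7f0,%eax  (round down to 16)
  return v;                                // sub %rax,%rsp
}

// Spot checks: the adjustment is always the requested padding plus 16.
static_assert(requested_pad(1234, 0) == 256, "low three bits of 1234 are 2");
static_assert(rsp_adjustment(1234, 0) == 272, "256 + 16");
static_assert(rsp_adjustment(8, 0) == 16, "zero padding still moves %rsp by 16");
```

The following `lea 0xf(%rsp),%rax` and `and $0xfffffffffffffff0,%rax` then round the new stack pointer up to a 16-byte boundary, and that aligned address is what `ptr` receives.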

