RFR: 8263718: unused-result warning happens at os_linux.cpp
David Holmes
david.holmes at oracle.com
Fri Mar 19 22:24:08 UTC 2021
On 19/03/2021 7:06 pm, Yasumasa Suenaga wrote:
> On Fri, 19 Mar 2021 08:47:39 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:
>
>>> I checked the `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`.
>>> (This is a fastdebug build; a slowdebug build might differ, but a production build is probably the same.)
>>>
>>> 659 thread->record_stack_base_and_size();
>>> 0x00007ffff7154d44 <+20>: call 0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv>
>>>
>>> 660
>>> 661 // Try to randomize the cache line index of hot stack frames.
>>> 662 // This helps when threads of the same stack traces evict each other's
>>> 663 // cache lines. The threads can be either from the same JVM instance, or
>>> 664 // from different JVM instances. The benefit is especially true for
>>> 665 // processors with hyperthreading technology.
>>> 666 static int counter = 0;
>>>
>>> 667 int pid = os::current_process_id();
>>>
>>> 668 alloca(((pid ^ counter++) & 7) * 128);
>>> 0x00007ffff7154d51 <+33>: addl $0x1,0xc1db10(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>>>
>>> 669
>>> 670 thread->initialize_thread_current();
>>> => 0x00007ffff7154d4e <+30>: mov %r13,%rdi
>>> 0x00007ffff7154d58 <+40>: call 0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv>
>>>
>>>> I prefer it rather than (void)!.
>>>>
>>>> Does that work in release builds too?
>>>
>>> It will not work, as David said :) We need to use `(void)!` if we are going to leave it in.
>>
>>> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128);
>>> + if (_stack_pad != 0) {
>>> + ((char*)_stack_pad)[0] = 1;
>>> + }
>>
>> I guess `_stack_pad` will be overwritten in each `thread_native_entry()` call, so it might be elided.
>> I modified the code as follows, and it seems to work: we cannot see a call to `alloca()`, but the stack is expanded.
>>
>> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
>> index 5af63befb58..bdb2dc89615 100644
>> --- a/src/hotspot/os/linux/os_linux.cpp
>> +++ b/src/hotspot/os/linux/os_linux.cpp
>> @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) {
>> // processors with hyperthreading technology.
>> static int counter = 0;
>> int pid = os::current_process_id();
>> - alloca(((pid ^ counter++) & 7) * 128);
>> + void *ptr = alloca(((pid ^ counter++) & 7) * 128);
>> + ((char *)ptr)[0] = 1;
>>
>> thread->initialize_thread_current();
>>
>> 659 thread->record_stack_base_and_size();
>> 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv>
>>
>> 660
>> 661 // Try to randomize the cache line index of hot stack frames.
>> 662 // This helps when threads of the same stack traces evict each other's
>> 663 // cache lines. The threads can be either from the same JVM instance, or
>> 664 // from different JVM instances. The benefit is especially true for
>> 665 // processors with hyperthreading technology.
>> 666 static int counter = 0;
>>
>> 667 int pid = os::current_process_id();
>>
>> 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128);
>> 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>> 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx
>> 0x00007ffff7154d6c <+60>: xor %r8d,%eax
>> 0x00007ffff7154d6f <+63>: shl $0x7,%rax
>> 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>
>> 0x00007ffff7154d79 <+73>: and $0x380,%eax
>> 0x00007ffff7154d7e <+78>: add $0x17,%rax
>> 0x00007ffff7154d82 <+82>: and $0x7f0,%eax
>> 0x00007ffff7154d87 <+87>: sub %rax,%rsp
>> 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax
>> 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax
>>
>> 669 ((char *)ptr)[0] = 1;
>> 0x00007ffff7154d93 <+99>: movb $0x1,(%rax)
>
>>> I modified the code as follows, and it seems to work: we cannot see a call to `alloca()`, but the stack is expanded.
>>
>> Sorry but I'm not seeing where the stack actually gets expanded?
>
> 0x00007ffff7154d87 <+87>: sub %rax,%rsp
Doh! Thanks.
I'm re-running some benchmarks on Linux.
It would be good to confirm that the alloca is also being elided with
clang and VS.
David
-----
> I guess `%rax` contains the result of `((pid ^ counter++) & 7) * 128`, and `alloca()` is then replaced by a `sub` on `%rsp`.
> The warning for this issue refers to `void* __builtin_alloca(long unsigned int)`, so that is probably the builtin involved here. The compiler can simply adjust `%rsp` when we want to allocate a buffer on the stack.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3042
>
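For reference, the idea discussed above can be sketched outside HotSpot roughly as follows. This is only an illustration, not the actual patch: `stack_pad_size` and `touch_stack_pad` are hypothetical names, and the `volatile` pointer and the extra byte are assumptions of the sketch. It shows the padding arithmetic (a multiple of 128 between 0 and 896) and the pattern of keeping the `alloca()` result and writing to it, so that the result is used and the allocation is not dead code.

```cpp
#include <alloca.h>   // alloca() is non-standard; this header exists on glibc and musl
#include <cstddef>

// Same arithmetic as in thread_native_entry(): (pid ^ counter) & 7 selects
// one of eight slots, each 128 bytes apart, i.e. 0, 128, ..., 896.
std::size_t stack_pad_size(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical helper: allocate the pad, write one byte, read it back.
// Note that alloca() space is released when the calling function returns,
// so in real code the call has to sit directly in the frame to be shifted
// (as it does in thread_native_entry()), not in a helper like this one.
int touch_stack_pad(int pid, int counter) {
  // "+ 1" is an assumption of this sketch so that a zero-sized pad still
  // has a byte to write to; the patch in this thread does not do this.
  volatile char* pad =
      static_cast<volatile char*>(alloca(stack_pad_size(pid, counter) + 1));
  pad[0] = 1;       // the result is used, so no unused-result warning
  return pad[0];    // volatile access keeps the store and the load
}
```

Whether a given compiler keeps or drops the allocation still has to be checked in the generated code, as was done with the disassembly earlier in this thread.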