RFR: 8224851: AArch64: fix warnings and errors with Clang and GCC 8.3

Fri May 31 04:11:44 UTC 2019

Hi Andrew,

>
> Oh dear. Experience has shown me (and probably you, too) that this is
> one of the most error-prone parts of software development. Many
> very serious failures have been caused by trying to shut up
> compilers. These failures have even included major security breaches.
>

Fair enough. But I think several of these are worth fixing.

>
>> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp:43:39: error: cannot initialize a parameter of type 'char *' with an lvalue of type 'unsigned long'
>>     return __sync_add_and_fetch(dest, add_value);
>>                                       ^~~~~~~~~
>>
>> If the add_and_fetch template is instantiated with I=integer and
>> D=pointer then we need an explicit cast here.
>
> Please don't. Please have a look at where this happens and fix the
> types there, as appropriate. What do other ports do?

It's instantiated here:

/home/nicgas01/jdk/src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp:259:34: 
note: in instantiation of function template specialization 
'Atomic::add<unsigned long, char *>' requested here
       char* touch_addr = Atomic::add(actual_chunk_size, &_cur_addr) - 
actual_chunk_size;
                                  ^

We're adding an unsigned long to a char* which seems legitimate.

The other ports except Arm32 and Zero use inline assembly so don't have
problems with mismatched types. Zero uses __sync_fetch_and_add so would
hit this too if it used G1.

I'm not sure how to resolve this. We could add a specialisation of
PlatformAdd for byte_size=8 and cast to (u)intptr_t?

template<>
struct Atomic::PlatformAdd<8>
   : Atomic::AddAndFetch<Atomic::PlatformAdd<8> >
{
   template<typename I, typename D>
   D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order 
order) const {
     STATIC_ASSERT(8 == sizeof(I));
     STATIC_ASSERT(8 == sizeof(D));

     typedef typename Conditional<IsSigned<I>::value, intptr_t, 
uintptr_t>::type PI;

     PI res = __sync_add_and_fetch(reinterpret_cast<volatile PI *>(dest),
                                   static_cast<PI>(add_value));
     return reinterpret_cast<D>(res);
   }
}

I think this is safe.

>
>> Rewrote as ~((1<<12) - 1) which matches the masks used in the rest of
>> the function.
>
> Let's not. All we have to do to create a mask is is use -1u<<12. I
> would challenge anyone to figure out without using a pencil what the
> effect of  &= ~(((1<<12)-1)<<12)  might be.

OK, I don't mind too much either way.

>
>> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp:103:20: error: variable 'esp' is uninitialized when used here [-Werror,-Wuninitialized]
>>   return (address) esp;
>>                    ^~~
>> /home/nicgas01/jdk/src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp:101:21: note: initialize the variable 'esp' to silence this warning
>>   register void *esp __asm__ (SPELL_REG_SP);
>>                     ^
>>                      = nullptr
>>
>> We could silence the -Wuninitialized warning for this function, but that
>> doesn't help much because Clang then compiles
>> os::current_stack_pointer() into:
>>
>> 0000000000000000 <_ZN2os21current_stack_pointerEv>:
>>        0:	d65f03c0 	ret
>>
>> Whereas with GCC we get:
>>
>> 0000000000000250 <_ZN2os21current_stack_pointerEv>:
>>      250:	910003e0 	mov	x0, sp
>>      254:	d65f03c0 	ret
>>
>> Added some inline assembly to explicitly move sp into the result #ifdef
>> __clang__. Same with _get_previous_fp().
>
> This is all looking extremely fragile. We'd need to look at what the
> usages of this code are and trace through.

Yes, I agree this function seems quite fragile. What is the "current"
stack pointer supposed to be? Is it the SP at the call site? That's what
you get with current code on GCC -00 and -02, and with this patch on
clang -O2. But clang -O0 gives SP - 16 because it creates a stack frame
for os::current_stack_pointer().

On x86_64 with gcc -00 and clang -00 we get this:

0000000000000000 <_ZN2os21current_stack_pointerEv>:
     0:       55                      push   %rbp
     1:       48 89 e5                mov    %rsp,%rbp
     4:       48 89 e0                mov    %rsp,%rax
     7:       5d                      pop    %rbp
     8:       c3                      retq

The result is RSP before the CALL instruction - 16. On -02 we get
caller's RSP - 8 instead:

0000000000000000 <_ZN2os21current_stack_pointerEv>:
    0:        48 89 e0                mov    %rsp,%rax
    3:        c3                      retq

I'm also not sure our existing use of `__asm__ (SPELL_REG_SP)' is
correct. The GCC manual [1] says "The only supported use for this
feature is to specify registers for input and output operands when
calling Extended asm" which we are not doing here.

Excluding assertions and tests, the usages are in os::current_frame()
and as the input to os::stack_shadow_pages_available(). In the first
case the SP seems to already be inaccurate as it points into the frame
of os::current_frame() but the FP used is for the caller. In the second
case I think there is only a problem if the value returned by
os::current_stack_pointer() is *above* the actual SP, as then we would
believe there was sufficient stack space when there really wasn't.

So if it's OK to return a value slightly below the caller's SP maybe we
can use __builtin_frame_address(0)? Or perhaps
__builtin_frame_address(0) + 16. We could also replace the single usage
of _get_previous_fp() with __builtin_frame_address(1) and avoid trying
to guess whether it will be inlined or not.

Thanks,
Nick

[1] https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html