RFR: 8223089: Stack alignment for x86-32

Andrew Haley aph at redhat.com
Tue Apr 30 14:42:47 UTC 2019


We've been seeing segfaults on 32-bit Linux x86.

Recent Linux distributions' runtime libraries are compiled with SSE
enabled; this means that the stack must be aligned on a 16-byte
boundary when a function is called. GCC has defaulted to
16-byte-aligned code for many years but HotSpot does not, calling
runtime routines with a misaligned stack.
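
To make the alignment requirement concrete, here is a minimal sketch of
the arithmetic involved. The helper names (is_aligned, align_down,
kStackAlignment) are hypothetical and chosen for illustration; they are
not taken from HotSpot or from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not HotSpot code.
constexpr std::uintptr_t kStackAlignment = 16;  // bytes, as SSE code expects

// True if sp is a multiple of `alignment` bytes (alignment must be a power of two).
inline bool is_aligned(std::uintptr_t sp,
                       std::uintptr_t alignment = kStackAlignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (sp & (alignment - 1)) == 0;
}

// Round sp down to the previous `alignment`-byte boundary.
// Down, because the x86 stack grows towards lower addresses.
inline std::uintptr_t align_down(std::uintptr_t sp,
                                 std::uintptr_t alignment = kStackAlignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return sp & ~(alignment - 1);
}
```

A stack pointer such as 0x1008 is word-aligned but not 16-byte
aligned, which is the kind of value that can fault once SSE
instructions use aligned loads and stores on stack slots.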

There is some code in HotSpot to work around specific instances of
this problem, but it is not applied consistently. If runtime code
calls out to C library functions, the stack remains misaligned and a
segfault can result. We can work around this by compiling the HotSpot
runtime with -mstackrealign, but this causes all code generated by GCC
to realign the stack, which is not efficient. It also prevents us from
compiling HotSpot with SSE enabled.

I tried a variety of solutions, including rewriting the code which
does runtime calls. Unfortunately, there isn't a common point from
which all runtime calls are made. Instead, there are many places, with
different ways of passing arguments.

At this stage in the lifetime of 32-bit x86 I don't think we can
justify either the initial cost or the maintenance cost of rewriting
all of the runtime call code. Instead, at the point of a call from
HotSpot-generated code to native code, I create a new (aligned) stack
frame, copy the outgoing args into it, and call the native code.

Old Style:

  __ call(RuntimeAddress(target));
  OopMapSet* oop_maps = new OopMapSet();
  oop_maps->add_gc_map(__ offset(), oop_map);

New Style:

  {
    // Build an aligned frame and copy the num_rt_args outgoing args into it.
    AlignStackWithArgs aligned(sasm, num_rt_args, target);
    __ call(RuntimeAddress(target));
    OopMapSet* oop_maps = new OopMapSet();
    oop_maps->add_gc_map(__ offset(), oop_map);
  }
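
The aligned-frame step described above can be modelled outside the
assembler. The following is only a sketch under stated assumptions: it
treats a byte buffer as the stack, treats offsets into it as addresses,
and assumes the outgoing args already sit at the current stack pointer.
The names aligned_frame_sp and build_aligned_frame are hypothetical and
do not come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical model for illustration; not the HotSpot implementation.
constexpr std::size_t kWordSize = 4;         // bytes per stack slot on x86-32
constexpr std::size_t kStackAlignment = 16;  // bytes

// Given the current stack pointer (as an offset into a simulated stack)
// and the number of outgoing argument words, return the stack pointer of
// a new frame that has room for the args and is 16-byte aligned.
inline std::size_t aligned_frame_sp(std::size_t sp, std::size_t num_args) {
  assert(sp >= num_args * kWordSize);
  std::size_t new_sp = sp - num_args * kWordSize;  // reserve room for the args
  return new_sp & ~(kStackAlignment - 1);          // round down to a boundary
}

// Copy num_args words from the old stack top into the new aligned frame
// and return the new stack pointer. The caller keeps the old value so it
// can restore it after the native call returns.
inline std::size_t build_aligned_frame(unsigned char* stack, std::size_t sp,
                                       std::size_t num_args) {
  std::size_t new_sp = aligned_frame_sp(sp, num_args);
  std::memmove(stack + new_sp, stack + sp, num_args * kWordSize);
  return new_sp;
}
```

With sp at offset 44 and two argument words, the model reserves 8
bytes (offset 36) and rounds down to offset 32, so the callee sees its
arguments at an aligned stack pointer. The cost is the copy itself,
which grows with the number of args.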

C2 isn't affected: it gets everything right already.
C1 is affected, and so is the interpreter.

Re-alignment only occurs when we know we are calling external code: we
can tell that because the target is outside the code cache. So, we do
take a performance hit from copying the args, but only at the
transition.
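
The "outside the code cache" test amounts to an address range check. A
minimal sketch, assuming the code cache can be described by a single
[low, high) address range; CodeRange and needs_realignment are
hypothetical names for illustration and are not the API used by the
patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the code cache bounds; illustration only.
struct CodeRange {
  std::uintptr_t low;   // inclusive
  std::uintptr_t high;  // exclusive

  bool contains(std::uintptr_t addr) const {
    return addr >= low && addr < high;
  }
};

// A call target outside the code cache is treated as external (native)
// code, so the stack is realigned before the call. Targets inside the
// code cache are generated code and skip the extra copy.
inline bool needs_realignment(const CodeRange& code_cache,
                              std::uintptr_t target) {
  assert(code_cache.low <= code_cache.high);
  return !code_cache.contains(target);
}
```

Because the target address is known when the call is emitted, a check
of this kind can be made at code-generation time rather than on every
call.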

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

