[Bug 3533] [IcedTea8] HotSpot generates code with unaligned stack, crashes on SSE operations
bugzilla-daemon at icedtea.classpath.org
bugzilla-daemon at icedtea.classpath.org
Thu May 17 20:56:31 UTC 2018
https://icedtea.classpath.org/bugzilla/show_bug.cgi?id=3533
--- Comment #23 from Maciej S. Szmigiero <mail at maciej.szmigiero.name> ---
(In reply to Andrew John Hughes from comment #22)
> (In reply to Maciej S. Szmigiero from comment #21)
> > I've tried "-mstackrealign" but it didn't add the required
> > stack pointer realignment prologue to every entry function.
> >
>
> I've been able to do a full bootstrap with -mstackrealign on a setup that
> would reliably crash without that option. The fact that Oracle added it for
> the same issue on Mac OS X suggests that it's the right workaround and
> that's also the opinion of other engineers I've spoken to.
>
> What issues do you still see with this enabled?
Let's consider a simple test file (test.c):
> int external(int p1, int p2, int p3, int p4);
>
> int internal(int p1, int p2, int p3, int p4)
> {
> return external(p1, p2, p3, p4);
> }
When compiled with "gcc -m32 -O2 -mstackrealign -fno-stack-protector -fno-pic"
the following assembler output is generated:
> .globl internal
> .type internal, @function
> internal:
> .LFB0:
> .cfi_startproc
> jmp external
> .cfi_endproc
> .LFE0:
> .size internal, .-internal
You can see that no stack realigning code is present.
Now think that the "internal" function is a part of the JVM and it
gets called by HotSpot-generated code with a misaligned stack.
If the "external" function was located in an external library (like glibc or
zlib) that relied on stack being properly aligned there would be a crash inside
it (as it was previously observed).
However, when the same code is compiled with
"gcc -m32 -mincoming-stack-boundary=2 -O2 -Wall -fno-stack-protector -fno-pic"
the following assembler output is generated instead:
> .globl internal
> .type internal, @function
> internal:
> .LFB0:
> .cfi_startproc
> leal 4(%esp), %ecx
> .cfi_def_cfa 1, 0
> andl $-16, %esp
> pushl -4(%ecx)
> pushl %ebp
> .cfi_escape 0x10,0x5,0x2,0x75,0
> movl %esp, %ebp
> pushl %ecx
> .cfi_escape 0xf,0x3,0x75,0x7c,0x6
> subl $4116, %esp
> orl $0, (%esp)
> addl $4112, %esp
> pushl 12(%ecx)
> pushl 8(%ecx)
> pushl 4(%ecx)
> pushl (%ecx)
> call external
> movl -4(%ebp), %ecx
> .cfi_def_cfa 1, 0
> addl $16, %esp
> leave
> .cfi_restore 5
> leal -4(%ecx), %esp
> .cfi_def_cfa 4, 4
> ret
> .cfi_endproc
> .LFE0:
> .size internal, .-internal
You can see that now the code includes a stack realignment instruction
("andl $-16, %esp").
The "external" function won't crash in this case since the stack it receives
is guaranteed to be nicely aligned.
The tests above were done on gcc version 7.3.0 ("Gentoo Hardened 7.3.0-r1
p1.1").
Removing "-fno-stack-protector -fno-pic" doesn't really change the test result,
only clutters the generated code with indirect references via GOT and a stack
canary check.
>
> > From the Gentoo bug:
> > > That's why an other solution is necessary in form of
> > > "-mincoming-stack-boundary=2" {C,CXX}FLAG for icedtea (not
> > > "-mstackrealign" since it does not work in our case as this flag seems
> > > to be for leaf - and not intermediate - functions).
> > > It is a big hammer but it allows building of a working icedtea (with
> > > glibc compiled by gcc-7.3.0) without any patching of the HotSpot code.
> >
> > Gentoo now adds "-mincoming-stack-boundary=2" when building IcedTea for
> > 32-bit x86 target.
>
> I'm dubious about this, especially as 2 means 4-byte alignment, which is a
> reduction from the default 16-byte alignment and could cause issues with
> some values.
>
> "
> -mincoming-stack-boundary=num
> Assume the incoming stack is aligned to a 2 raised to num byte
> boundary. If -mincoming-stack-boundary is not specified, the one specified
> by -mpreferred-stack-boundary is used. On Pentium and Pentium Pro, "double"
> and "long double" values should be aligned to an 8-byte boundary (see
> -malign-double) or suffer significant run time performance penalties. On
> Pentium III, the Streaming SIMD Extension (SSE) data type "__m128" may not
> work properly if it is not 16-byte aligned."
>
In our case the problem is that the incoming stack to gcc-compiled C/C++ code
might be aligned to 4 bytes instead of ABI-required 16 bytes so the
"-mincoming-stack-boundary=2" parameter is appropriate.
The gcc documentation might suggest that one needs both
"-mincoming-stack-boundary=2" to tell gcc about the incoming stack alignement
and "-mstackrealign" to actually realign it.
But as the tests above show using "-mincoming-stack-boundary=2" alone works but
using "-mstackrealign" alone does not.
BTW, the result of using both these parameters at the same time seems to be
exactly the same as using only "-mincoming-stack-boundary=2".
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/distro-pkg-dev/attachments/20180517/43e910b1/attachment.html>
More information about the distro-pkg-dev
mailing list