RFD: HotSpot CONSTANT_REGISTER_DECLARATION and extern const
Andrew Haley
aph at redhat.com
Mon Jun 21 17:01:10 UTC 2021
I've been looking at the quality of the code GCC generates for
HotSpot's assembler, and the code now is way suboptimal. There are a
number of reasons for that, but one of the most important is the way
that Registers are defined.
TL;DR: register definitions in HotSpot are declared as "extern const"
for ancient-historical reasons. We should stop doing that: it would
make the assembler significantly faster and smaller, improving both
bootstrap time and compilation speed.
register.hpp contains these definitions:
#define CONSTANT_REGISTER_DECLARATION(type, name, value) \
extern const type name; \
enum { name##_##type##EnumValue = (value) };
#define REGISTER_DEFINITION(type, name) \
const type name = ((type)name##_##type##EnumValue)
A register declaration like this
CONSTANT_REGISTER_DECLARATION(Register, r0, (0));
expands to this in the header file:
extern const Register r0;
enum { r0_RegisterEnumValue = ((0)) };;
and this in the register_aarch64.cpp file:
const Register r0 = ((Register)r0_RegisterEnumValue);
So, the constants which define each register appear only in one
compilation unit (register_aarch64.o) and every user loads the
constants from there.
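As a rough illustration of the scheme described above, here is a single-file sketch of the two macros in use. The RegisterImpl forward declaration, the intptr_t cast and the encoding_of helper are assumptions added so the example compiles on its own; they are not taken from HotSpot. In the real build the REGISTER_DEFINITION lines live in register_aarch64.cpp, so every other compilation unit sees only the "extern const" declaration and not the value.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-in for HotSpot's register type; only the pointer type matters.
class RegisterImpl;
typedef RegisterImpl* Register;

// The macros as quoted from register.hpp. Assumption: the intptr_t cast is
// added only to keep the integer-to-pointer conversion free of warnings.
#define CONSTANT_REGISTER_DECLARATION(type, name, value) \
  extern const type name;                                \
  enum { name##_##type##EnumValue = (value) };

#define REGISTER_DEFINITION(type, name) \
  const type name = ((type)(std::intptr_t)name##_##type##EnumValue)

// "Header" part: visible to every user, but carries no value for r0/r1.
CONSTANT_REGISTER_DECLARATION(Register, r0, (0));
CONSTANT_REGISTER_DECLARATION(Register, r1, (1));

// ".cpp" part: the one place where the constants are actually defined.
REGISTER_DEFINITION(Register, r0);
REGISTER_DEFINITION(Register, r1);

// Hypothetical helper: recover the register number from the pointer value.
inline int encoding_of(Register r) {
  return (int)(std::intptr_t)r;
}
```

Because everything here sits in one file, the compiler can still see the definitions; the cost described in this message only appears once the definitions are moved into their own object file.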
The result (at least on AArch64) is tragic. Every reference to r0 has
to do a load from the slot in memory that contains the constant r0,
and what's worse all of those loads have to go via the GOT.
So the simple constant r0 requires two loads from memory, one to get
the GOT entry and one to load the constant itself. This results in GCC
generating awful code, 142 instructions (!) to generate
add(r0, r0, zr);
(A lot of this generated code is because on AArch64 we check that every
operand is in range, even in product builds. This has saved our
backsides several times now, so it's not something I want to change.)
If I change the way register definitions are done to the obvious
#define CONSTANT_REGISTER_DECLARATION(type, name, value) \
const type name = ((type)value)
#define REGISTER_DEFINITION(type, name)
I get this for add(r0, r0, zr) :
ldr x1, [x0, #8]
mov w2, #0x8b1f0000
ldr x0, [x1, #16]
str w2, [x0], #4
str x0, [x1, #16]
ret
which is near-enough optimal. GCC can do constant- and copy-
propagation to remove all the range checks for the operands.
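To make the constant-propagation point concrete, here is a minimal sketch, not HotSpot code. It assumes Register is a plain unsigned integer (HotSpot's real type is a pointer), uses a simplified numbering in which zr is 31, and introduces hypothetical reg_field and encode_add helpers. With the header-only definition the compiler knows each operand's value, so the range checks fold away and the whole call reduces to the constant 0x8b1f0000 seen in the listing above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Assumption: Register is modelled as a plain integer for this sketch.
typedef unsigned Register;

// The proposed macros, as given in the message.
#define CONSTANT_REGISTER_DECLARATION(type, name, value) \
  const type name = ((type)value)
#define REGISTER_DEFINITION(type, name)

CONSTANT_REGISTER_DECLARATION(Register, r0, (0));
CONSTANT_REGISTER_DECLARATION(Register, zr, (31));  // simplified numbering

// Hypothetical range check: a register field is 5 bits wide.
constexpr std::uint32_t reg_field(Register r) {
  return r < 32 ? r : throw std::out_of_range("register out of range");
}

// Hypothetical encoder for AArch64 ADD (shifted register, 64-bit, no shift):
// base opcode 0x8b000000, Rm in bits 20..16, Rn in bits 9..5, Rd in bits 4..0.
constexpr std::uint32_t encode_add(Register rd, Register rn, Register rm) {
  return 0x8b000000u | (reg_field(rm) << 16) | (reg_field(rn) << 5) |
         reg_field(rd);
}

// The operands are compile-time constants, so the checks vanish entirely.
static_assert(encode_add(r0, r0, zr) == 0x8b1f0000u,
              "add(r0, r0, zr) folds to a single instruction word");
```

The static_assert only compiles because r0 and zr are defined, with their values, in the same translation unit; with an "extern const" declaration alone the compiler would have nothing to fold.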
A comment in register.hpp explains why all this "extern const" stuff
was done:
// We'd like to be able to simply define const instances of the
// RegisterImpl* for each of the registers needed on a system in a
// header file. However many compilers don't handle this very well
// and end up producing a private definition in every file which
// includes the header file. Along with the static constructors
// necessary for initialization it can consume a significant amount of
// space in the result library.
I believe that this is probably ancient history, way back when in
early HotSpot days, and we probably don't need to do it any more. I
could simply fix this in the AArch64 back end after checking that it
works on Windows, which it almost certainly does: it's fine on Linux
and macOS, and we don't care about ancient compilers any more because
we've moved to C++14.
Getting rid of "extern const" is better on x86 too, although the
improvement isn't as dramatic as on AArch64: it cuts the 20
instructions needed to generate
addl(rdx, 0);
to 8:
mov 0x8(%rdi),%rdx
mov $0xffffc283,%ecx
mov 0x10(%rdx),%rax
mov %cx,(%rax)
add $0x3,%rax
movb $0x0,-0x1(%rax)
mov %rax,0x10(%rdx)
retq
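For readers decoding the listing above: the low 16 bits of the constant 0xffffc283 are stored little-endian as the bytes 83 c2, and the movb supplies the trailing 00. Together these are the three bytes of the x86 "add r/m32, imm8" form applied to edx. A small sketch of that encoding follows; encode_addl_imm8 is a hypothetical helper for illustration, not HotSpot's Assembler::addl, and it covers only the low eight registers with an 8-bit immediate.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode "add r32, imm8" (opcode 83 /0 ib).
// reg is the 3-bit register number (edx is 2); no REX prefix is handled.
inline std::array<std::uint8_t, 3> encode_addl_imm8(unsigned reg,
                                                    std::int8_t imm) {
  return {{0x83,                                          // opcode
           static_cast<std::uint8_t>(0xC0 | (reg & 7)),   // ModRM: mod=11, /0
           static_cast<std::uint8_t>(imm)}};              // 8-bit immediate
}
```

Calling encode_addl_imm8(2, 0) yields the bytes 83 c2 00, matching what the optimized code writes into the code buffer.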
So, can we just stop using "extern const"? At least on Linux?
Thanks for reading this far,
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671