RFR: 8287373: remove unnecessary paddings in generated code
Boris Ulasevich
bulasevich at openjdk.java.net
Tue Jun 7 20:44:32 UTC 2022
On Tue, 7 Jun 2022 16:11:59 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> The goal is to remove unnecessary paddings in generated code. The alignment of the [Stub Code] section is determined by the same value as the alignment of the [Entry Point] section: the CodeEntryAlignment parameter with default values 64B on AARCH, and 32B on AMD.
>>
>> Large entry alignment values are questionable for the entry section. For example, the Arm Neoverse N1 Software Optimization Guide recommends aligning subroutines to 32B, while static compilers use an even smaller value of 16B. However, with this change, I suggest applying different (and smaller) values for the [Constants] and [Stub Code] section alignments. This makes the overall code 2% smaller on AARCH.
>>
>> The correctness of the changes is checked by jtreg. Performance was tested with the Renaissance and SpecJBB benchmarks on AARCH and AMD.
>>
>> Example. Dummy method disassembly on AARCH, before vs after:
>>
>> [Verified Entry Point] | [Verified Entry Point]
>> 78c63b80: nop | 7437e480: nop
>> 78c63b84: sub x9, sp, #0x20, lsl #12 | 7437e484: sub x9, sp, #0x20, lsl #12
>> 78c63b88: str xzr, [x9] | 7437e488: str xzr, [x9]
>> 78c63b8c: sub sp, sp, #0x20 | 7437e48c: sub sp, sp, #0x20
>> 78c63b90: stp x29, x30, [sp, #16] | 7437e490: stp x29, x30, [sp, #16]
>> 78c63b94: orr w1, wzr, #0x10 | 7437e494: orr w1, wzr, #0x10
>> 78c63b98: bl 78343e00 | 7437e498: bl 73a61980
>> 78c63b9c: .inst 0x00000000 ; undefined | 7437e49c: .inst 0x00000000 ; undefined
>> 78c63ba0: .inst 0x00000000 ; undefined |
>> 78c63ba4: .inst 0x00000000 ; undefined |
>> 78c63ba8: .inst 0x00000000 ; undefined |
>> 78c63bac: .inst 0x00000000 ; undefined |
>> 78c63bb0: .inst 0x00000000 ; undefined |
>> 78c63bb4: .inst 0x00000000 ; undefined |
>> 78c63bb8: .inst 0x00000000 ; undefined |
>> 78c63bbc: .inst 0x00000000 ; undefined |
>> [Stub Code] | [Stub Code]
>> 78c63bc0: ldr x8, 78c63bc8 | 7437e4a0: ldr x8, 7437e4a8
>> 78c63bc4: br x8 | 7437e4a4: br x8
>> 78c63bc8: .inst 0x78343e00 ; undefined | 7437e4a8: .inst 0x73a61980 ; undefined
>> 78c63bcc: .inst ; undefined | 7437e4ac: .inst ; undefined
>> [Exception Handler] | [Exception Handler]
>> 78c63bd0: b 783ee080 | 7437e4b0: b 73b0c100
>> [Deopt Handler Code] | [Deopt Handler Code]
>> 78c63bd4: adr x30, 78c63bd4 | 7437e4b4: adr x30, 7437e4b4
>> 78c63bd8: b 78343ac0 | 7437e4b8: b 73a61620
>> 78c63bdc: .inst 0x00000000 ; undefined | 7437e4bc: .inst 0x00000000 ; undefined
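
The padding in the listing above can be reproduced with a small align-up calculation. This is only a sketch: align_up, padding_bytes and checked_padding_bytes are hypothetical helper names, not HotSpot functions. The 64B value is the AARCH default quoted above; the 16B value for the "after" column is an assumption that happens to match the addresses, since the new stub alignment is not stated in this message.

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of alignment.
// alignment must be a nonzero power of two.
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Bytes of padding emitted between the end of one section and the
// aligned start of the next one.
constexpr std::uint64_t padding_bytes(std::uint64_t end, std::uint64_t alignment) {
    return align_up(end, alignment) - end;
}

// Runtime variant that also checks the power-of-two precondition.
inline std::uint64_t checked_padding_bytes(std::uint64_t end, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return padding_bytes(end, alignment);
}

// "Before" column: code ends at 78c63b9c, [Stub Code] is aligned to 64B.
static_assert(align_up(0x78c63b9c, 64) == 0x78c63bc0, "stub start, before");
static_assert(padding_bytes(0x78c63b9c, 64) == 36, "nine 4-byte padding words");

// "After" column: code ends at 7437e49c; 16B alignment is an ASSUMPTION.
static_assert(align_up(0x7437e49c, 16) == 0x7437e4a0, "stub start, after");
static_assert(padding_bytes(0x7437e49c, 16) == 4, "one 4-byte padding word");
```

The 36 bytes correspond to the nine `.inst 0x00000000` words in the left column, and the 4 bytes to the single padding word in the right column.
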
>
> src/hotspot/cpu/x86/sharedRuntime_x86_32.cpp line 2635:
>
>> 2633: // allocate space for the code
>> 2634: // setup code generation tools
>> 2635: CodeBuffer buffer("handler_blob", 2048, 1024);
>
> Why do you need to double the size?
I did it to fix a CodeBuffer overflow assert on the x86 build. SharedRuntime::generate_handler_blob asks for a 1KB buffer, but CodeBuffer::initialize actually reserves the given size plus some extra bytes. With the new formula (code_size + align + slop * SECT_LIMIT) the number of extra bytes is reduced, which causes the handler_blob generator buffer to overflow. My solution is to double the code_size estimate for the handler_blob generator buffer. The same 2048/1024 numbers are already used on other platforms: aarch, ppc, amd, riscv.
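
To make the effect concrete, here is a rough sketch of the reserved-size arithmetic. new_reserved follows the formula quoted above. old_reserved is my assumption of the shape of the previous formula and is not quoted in this message. The values align = 32, slop = 32 and SECT_LIMIT = 3 are placeholders chosen for illustration only, and both function names are hypothetical.

```cpp
#include <cstddef>

// Reserved size with the formula from this message:
//   code_size + align + slop * SECT_LIMIT
constexpr std::size_t new_reserved(std::size_t code_size, std::size_t align,
                                   std::size_t slop, std::size_t sect_limit) {
    return code_size + align + slop * sect_limit;
}

// ASSUMPTION: shape of the previous formula, with align and slop both
// counted once per section plus one extra time.
constexpr std::size_t old_reserved(std::size_t code_size, std::size_t align,
                                   std::size_t slop, std::size_t sect_limit) {
    return code_size + (align + slop) * (sect_limit + 1);
}

// Illustrative values only (not taken from the HotSpot sources).
static_assert(old_reserved(1024, 32, 32, 3) == 1280, "1KB request, old formula");
static_assert(new_reserved(1024, 32, 32, 3) == 1152, "1KB request, new formula");
static_assert(new_reserved(2048, 32, 32, 3) == 2176, "2KB request, new formula");
```

Under these assumed values a 1KB request loses 128 bytes of implicit headroom with the new formula, which is the kind of reduction that can turn a tight estimate into an overflow. Doubling the request restores ample room.
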
-------------
PR: https://git.openjdk.java.net/jdk/pull/8453
More information about the hotspot-dev mailing list