RFR: 8372285: G1: Micro-optimize x86 barrier code [v3]

Albert Mingkun Yang ayang at openjdk.org
Fri Nov 21 14:23:15 UTC 2025


On Fri, 21 Nov 2025 10:07:35 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> We know from [JDK-8372284](https://bugs.openjdk.org/browse/JDK-8372284) that G1 C2 stubs can take ~10% of total instructions. So minor optimizations in hand-written assembly pay off for code density. This PR does a little x86-specific polishing: `testptr` where possible, short forward branches where possible. I rewired some code to make it abundantly clear the branches in question are short. It also makes clear that lots of the affected methods are essentially fall-through.
>> 
>> The patch is deliberately on the simpler side, so we can backport it to 25u if the need arises.
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `tier1`
>>  - [ ] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Make some backward branches explicitly short

Marked as reviewed by ayang (Reviewer).

src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 208:

> 206: 
> 207:   // Jump out if done, or fall-through to runtime.
> 208:   // "Done" is far away, so jump cannot be short.

I believe "Done" refers to `L_done`, so I wonder if we could use that name directly in the comment.
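
For readers less familiar with why the short/far distinction matters for code density, here is a rough sketch of the size arithmetic. It assumes the standard x86-64 encodings; the helper names (`fits_rel8`, `jcc_size`) and constants are hypothetical and are not part of HotSpot's assembler.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: hypothetical helpers, not HotSpot code.
// Sizes below are the standard x86-64 instruction encodings.

// A short branch carries a signed 8-bit displacement, measured from the
// end of the branch instruction.
constexpr bool fits_rel8(int64_t disp) {
  return disp >= -128 && disp <= 127;
}

// Conditional jump: 2 bytes in short form (opcode + rel8),
// 6 bytes in near form (0F 8x + rel32).
constexpr int jcc_size(int64_t disp) {
  return fits_rel8(disp) ? 2 : 6;
}

// Zero check of a 64-bit register:
//   test reg, reg  -> REX.W 85 /r     = 3 bytes
//   cmp  reg, 0    -> REX.W 83 /7 ib  = 4 bytes
constexpr int test_reg_reg_size = 3;
constexpr int cmp_reg_zero_size = 4;

// A nearby target saves 4 bytes per conditional branch.
static_assert(jcc_size(16) == 2, "nearby target uses the short form");
static_assert(jcc_size(4096) == 6, "distant target needs the near form");
```

Under these assumptions, a jump to a distant label such as `L_done` cannot use the 2-byte form, which is what the comment at line 208 is pointing out.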

-------------

PR Review: https://git.openjdk.org/jdk/pull/28446#pullrequestreview-3492959775
PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2549912395

