RFR: 8372285: G1: Micro-optimize x86 barrier code [v4]

Aleksey Shipilev shade at openjdk.org
Mon Nov 24 09:45:17 UTC 2025


On Fri, 21 Nov 2025 18:27:14 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:
>> 
>>  - Adjust label name
>>  - Merge branch 'master' into JDK-8372285-g1-barrier-micro
>>  - Make some backward branches explicitly short
>>  - Comment
>>  - Shorten a few more branches
>>  - Also reflow generate_pre_barrier_slow_path, so it is obvious the branches are short
>>  - More touchups
>>  - Also optimize queue insertion
>>  - Touchups
>>  - WIP
>
> src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 193:
> 
>> 191:   // Is the previous value null?
>> 192:   __ testptr(pre_val, pre_val);
>> 193:   __ jccb(Assembler::equal, L_null);
> 
> I know that this short jump will be fused into one instruction with testptr on modern x86. But you will have a jump-to-jump sequence. So you may win size-wise, but "throughput" could be worse, especially if it is the "fast" path.
> 
> Can you check performance of these changes vs using `jcc(Assembler::equal, L_done);` here.

Well, this is technically a slow path, and I have not been able to measure any performance impact on targeted write barrier microbenchmarks. This place contributes about 0.14 pp to code size, though, so it might be a wash in the grand scheme of things:


# baseline
  nmethod code size         :  5744336 bytes
  nmethod code size         :  5744304 bytes
  nmethod code size         :  5738864 bytes

# short (-1.65%)
  nmethod code size         :  5650688 bytes
  nmethod code size         :  5650656 bytes
  nmethod code size         :  5650688 bytes

# long (-1.51%)
  nmethod code size         :  5658856 bytes
  nmethod code size         :  5658856 bytes
  nmethod code size         :  5658856 bytes


I reverted to `jcc(..., L_done)` to avoid any perf regression risk.
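
For context on where the size difference between the two branch forms comes from: on x86, a conditional jump with an 8-bit displacement encodes in 2 bytes, while the form with a 32-bit displacement takes 6 bytes. The sketch below is not HotSpot code; the constants and helper names are illustrative only, and it simply models the encoding sizes and the reach of the short form.

```cpp
#include <cstdint>

// Encoded sizes of x86 conditional jumps, in bytes.
constexpr int kShortJccSize = 2;  // opcode 0x70+cc, followed by rel8
constexpr int kNearJccSize  = 6;  // 0x0F, 0x80+cc, followed by rel32

// Whether a short conditional jump placed at branch_addr can reach
// target_addr. The 8-bit displacement is signed and is measured from
// the end of the 2-byte instruction (hypothetical helper).
constexpr bool fits_short(std::int64_t branch_addr, std::int64_t target_addr) {
    const std::int64_t rel = target_addr - (branch_addr + kShortJccSize);
    return rel >= -128 && rel <= 127;
}

// Bytes saved at one branch site by using the short form.
constexpr int bytes_saved_by_short() {
    return kNearJccSize - kShortJccSize;
}

static_assert(fits_short(0, 129), "forward reach ends at +127 past the instruction");
static_assert(!fits_short(0, 130), "one byte further needs the 32-bit form");
static_assert(fits_short(126, 0), "backward reach ends at -128");
static_assert(!fits_short(127, 0), "one byte further back needs the 32-bit form");
static_assert(bytes_saved_by_short() == 4, "short form saves 4 bytes per site");
```

This only accounts for the encoding of a single branch; the measured difference above also reflects whatever else changes around the branch target.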

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28446#discussion_r2555449065

