RFR: 8334060: Implementation of Late Barrier Expansion for G1 [v2]
Martin Doerr
mdoerr at openjdk.org
Fri Aug 23 13:31:06 UTC 2024
On Mon, 19 Aug 2024 14:25:13 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> In the case of heap base != null, a branch already exists, which makes the other null check redundant. So we have a null check, a region crossing check, and another null check. Maybe this compressed oop mode is not important enough.
>>
>> For the other compressed oop modes, yes, this means moving the null check above the region crossing check. On PPC64, the null check can be combined with the shift instruction, so we save one compare instruction. Technically, it would even be possible to use only one branch instruction for both checks, but I'm not sure if it's worth the complexity. I'll think about it.
>
> OK, thanks. I just ran some benchmarks with zero-based OOP compression ([prototype here](https://github.com/robcasloz/jdk/tree/JDK-8334060-g1-late-barrier-expansion-x64-optimizations)) and could not observe any significant performance effect on three different x64 implementations. I think I will keep the `g1StoreN` implementation as-is in the x64 and aarch64 backends, for simplicity. Again, we can revisit this in follow-up work if need be.
I have an experimental implementation for PPC64. I have moved the oop decoding into `G1BarrierSetAssembler::g1_write_barrier_post_c2`:
https://github.com/TheRealMDoerr/jdk/blob/a48598075862f17e7b1cfbec29af4c2431809257/src/hotspot/cpu/ppc/gc/g1/g1BarrierSetAssembler_ppc.cpp#L476
This has two advantages:
- It reduces replicated code in the .ad file.
- It makes the discussed optimization easy to implement.

Please take a look.
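To make the ordering of the checks concrete, here is a minimal, standalone C++ sketch of the filtering logic discussed above. It is not the HotSpot code: the function names, the region size, and the oop shift are assumptions chosen for illustration only. It shows the null check being done once on the narrow value before decoding, followed by the region crossing check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration; not taken from HotSpot.
constexpr unsigned kRegionShift = 20;  // assumes 1 MiB heap regions
constexpr unsigned kOopShift    = 3;   // assumes 8-byte object alignment

// Decode a narrow oop that is already known to be non-null.
// With heap_base == 0 this models zero-based compression.
inline uint64_t decode_not_null(uint32_t narrow, uint64_t heap_base) {
  return heap_base + (static_cast<uint64_t>(narrow) << kOopShift);
}

// Returns true if the store would have to continue to the card-marking
// part of the post barrier. The null check comes first and is done on the
// narrow value, so no second null check is needed after decoding.
inline bool needs_post_barrier(uint64_t store_addr, uint32_t narrow_new_val,
                               uint64_t heap_base) {
  if (narrow_new_val == 0) {
    return false;  // storing null never creates a cross-region reference
  }
  uint64_t new_val = decode_not_null(narrow_new_val, heap_base);
  // Region crossing check: the two addresses differ above the region bits.
  return ((store_addr ^ new_val) >> kRegionShift) != 0;
}

// Small self-check exercising both compression modes.
inline void self_check() {
  const uint64_t base = 0x800000000ULL;  // hypothetical non-null heap base
  assert(!needs_post_barrier(0x10000000ULL, 0, 0));          // null store
  assert(!needs_post_barrier(0x10000000ULL, 0x2000000u, 0)); // same region
  assert(needs_post_barrier(0x10000000ULL, 0x2020000u, 0));  // next region
  assert(!needs_post_barrier(base + 0x100, 0, base));        // null, based mode
  assert(!needs_post_barrier(base + 0x100, 0x20u, base));    // same region
}
```

In the sketch, the single test of the narrow value serves both the decode and the barrier filter, which is the redundancy the earlier comment points at for the heap base != null mode.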
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19746#discussion_r1728978594
More information about the hotspot-compiler-dev mailing list