RFR: 8334060: Implementation of Late Barrier Expansion for G1 [v2]

Martin Doerr mdoerr at openjdk.org
Tue Aug 27 17:41:13 UTC 2024


On Tue, 27 Aug 2024 12:36:39 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:

>>> Sure, I agree that g1GetAndSetP and g1GetAndSetN are more consistent with the corresponding Ideal operation names, will rename the instructions in x64 and aarch64 and update the test expectations.
>> 
>> Done (commit daf38d3).
>> 
>> @offamitkumar @feilongjiang @snazarkin please note that the ADL instructions `g1XChgP` and `g1XChgN` have been renamed to `g1GetAndSetP` and `g1GetAndSetN`, and the same naming is expected across all platforms by the test `compiler/gcbarriers/TestG1BarrierGeneration.java` included in this changeset.
>
>> Regardless if you implement the compressed oops optimization or not, I'd reconsider moving the oop decoding into G1BarrierSetAssembler::g1_write_barrier_post_c2 because it makes the .ad file shorter because you can get rid of the replicated decode_heap_oop.
> 
> I tried this refactoring [here](https://github.com/openjdk/jdk/commit/d4e83fd7d77c5415700b33556752e8c8da811dea), thanks again Martin for the suggestion. In my opinion, the result is similar in terms of readability/maintainability because the benefit of removing explicit `decode_heap_oop` operations in the ADL file is somewhat negated by the increased complexity of the `write_barrier_post` and `g1_write_barrier_post_c2` signatures. For aarch64, moving non-destructive `decode_heap_oop` operations would probably require passing both source and destination registers to these functions explicitly, which would make them more complex. I feel that this refactoring is only amortized when the new-value decoding operations are more complex, as in your PPC implementation.

Thanks for trying! I like your refactored version. I prefer moving stuff out of the .ad files. The complexity of the barrier code is not significantly higher. Note that you could use `decode_heap_oop_not_null(Register dst, Register src)` with `tmp2` as dst (`generate_post_barrier_fast_path` can deal with new_val == tmp2). That would save a move instruction in some cases.
I haven't looked into the aarch64 code. I leave you free to decide.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19746#discussion_r1733283320


More information about the hotspot-compiler-dev mailing list