RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v30]
Thomas Schatzl
tschatzl at openjdk.org
Wed Apr 9 12:50:42 UTC 2025
On Wed, 9 Apr 2025 11:34:09 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 39 commits:
>>
>> - * missing file from merge
>> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>> - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>> - Merge branch 'master' into submit/8342382-card-table-instead-of-dcq
>> - * make young gen length revising independent of refinement thread
>> * use a service task
>> * both refinement control thread and young gen length revising use the same infrastructure to get the number of available bytes and determine the time to the next update
>> - * fix IR code generation tests that change due to barrier cost changes
>> - * factor out card table and refinement table merging into a single
>> method
>> - Merge branch 'master' into 8342382-card-table-instead-of-dcq3
>> - * obsolete G1UpdateBufferSize
>>
>> G1UpdateBufferSize has previously been used to size the refinement
>> buffers and impose a minimum limit on the number of cards per thread
>> that need to be pending before refinement starts.
>>
>> The former function is now obsolete with the removal of the dirty
>> card queues; the latter functionality has been taken over by the new
>> diagnostic option `G1PerThreadPendingCardThreshold`.
>>
>> I prefer making this a diagnostic option rather than a product option
>> because it is something that is only necessary for some test cases to
>> produce some otherwise unwanted behavior (continuous refinement).
>>
>> CSR is pending.
>> - ... and 29 more: https://git.openjdk.org/jdk/compare/41d4a0d7...1c5a669f
>
> src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 101:
>
>> 99: }
>> 100:
>> 101: void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
>
> Have you measured the performance impact of inlining this assembly code instead of resorting to a runtime call as done before? Is it worth the maintenance cost (for every platform), risk of introducing bugs, etc.?
I remember a significant impact in some microbenchmark. It's also inlined in Parallel GC. I do not consider it a big issue with respect to maintenance - these things never really change, and the method is small and contained.
I will try to redo the numbers.
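
For readers unfamiliar with what such an array post-barrier computes, here is a rough C++ sketch of card marking over a reference-array range. It is not the assembly from this PR and not HotSpot's actual code: `ToyCardTable`, `dirty_array_range`, the 512-byte card size and the clean/dirty byte values are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative constants (assumptions, not taken from the PR).
constexpr unsigned kCardShift = 9;      // one card covers 2^9 = 512 heap bytes
constexpr uint8_t  kCleanCard = 0xff;
constexpr uint8_t  kDirtyCard = 0x00;

// Hypothetical toy card table: one byte per card of a contiguous heap.
struct ToyCardTable {
  uintptr_t heap_base;
  std::vector<uint8_t> cards;

  ToyCardTable(uintptr_t base, size_t heap_bytes)
    : heap_base(base),
      cards((heap_bytes + (size_t{1} << kCardShift) - 1) >> kCardShift, kCleanCard) {}

  // Index of the card that covers the given heap address.
  size_t card_index(uintptr_t addr) const {
    return (addr - heap_base) >> kCardShift;
  }

  // Dirty every card overlapping [start, start + count * elem_size).
  // This is the small, contained loop that can be emitted inline
  // instead of being reached through a runtime call.
  void dirty_array_range(uintptr_t start, size_t count, size_t elem_size) {
    if (count == 0) return;             // nothing was written
    size_t first = card_index(start);
    size_t last  = card_index(start + count * elem_size - 1);
    assert(last < cards.size());
    std::memset(cards.data() + first, kDirtyCard, last - first + 1);
  }

  // Number of cards currently marked dirty.
  size_t dirty_count() const {
    size_t n = 0;
    for (uint8_t c : cards) n += (c == kDirtyCard);
    return n;
  }
};
```

The work per call is proportional to the number of cards the copied range spans, so for short arrays the fixed overhead of a runtime call can plausibly dominate; whether that shows up in practice is what the measurements would have to confirm.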
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2035298557