RFR: 8302122: Parallelize TLAB retirement in prologue in G1
Kim Barrett
kbarrett at openjdk.org
Sun Feb 12 03:04:28 UTC 2023
On Thu, 9 Feb 2023 15:58:35 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
> Hi all,
>
> can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests?
>
> * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway.
> * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and
corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works.
>
> Testing: tier1-5
>
> Thanks,
> Thomas
A few comments, but overall looks quite nice.
src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp line 74:
> 72: Thread* thread = Thread::current();
> 73: assert(thread->is_Java_thread(), "must be");
> 74: invalidate((JavaThread*)thread, mr);
Use `JavaThread::current()`
src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 82:
> 80: ~JavaThreadRetireTLABAndFlushLogs() {
> 81: FREE_C_HEAP_ARRAY(G1ConcurrentRefineStats, _local_refinement_stats);
> 82: FREE_C_HEAP_ARRAY(ThreadLocalAllocStats, _local_tlab_stats);
Consider static-asserting that the destructors for these two types are trivial.
src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 104:
> 102: for (uint i = 0; i < _num_workers; i++) {
> 103: _local_tlab_stats[i].reset();
> 104: _local_refinement_stats[i].reset();
Since the elements haven't been constructed yet, they aren't technically objects of the appropriate type.
Correct is to construct them using placement-new, i.e.
::new (&_local_tlab_stats[i]) ThreadLocalAllocStats();
And yeah, I know we have tons of code not doing that.
src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 118:
> 116: G1ConcurrentRefineStats refinement_stats() const {
> 117: G1ConcurrentRefineStats result;
> 118: for (uint i = 0; i < _num_workers; i++) {
indentation
src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 185:
> 183: total_refinement_stats += _non_java_stats->refinement_stats();
> 184: qset.update_refinement_stats(total_refinement_stats);
> 185:
Shouldn't there be something here to delete `_java_stats` and `_non_java_stats`?
-------------
Changes requested by kbarrett (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12497
More information about the hotspot-gc-dev
mailing list