RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v3]
Aleksey Shipilev
shade at openjdk.org
Wed May 3 12:06:15 UTC 2023
On Wed, 3 May 2023 12:01:13 GMT, olivergillespie <duke at openjdk.org> wrote:
>> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event.
>>
>> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy:
>>
>>
>> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms )
>> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms )
>>
>>
>> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)?
>>
>> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate?
>
> olivergillespie has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix compile error
>
> ```
> === Output from failing command(s) repeated here ===
> * For target hotspot_variant-server_libjvm_objs_gcTrace.o:
> /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)':
> /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap'
> 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers();
> | ^~
> In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35:
> /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap'
> 42 | class CollectedHeap;
> | ^~~~~~~~~~~~~
>
> * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs.
> === End of repeated output ===
> ```
Why not just `hi.populate_table(&cit, is_alive_cl, ParallelGCThreads);`, and let the `populate_table` deal with the rest? I think we have a convention that `ParallelGCThreads` is roughly the proxy for the number of GC threads at paused operation.
(It is weird that `HeapInspection::populate_table` uses `safepoint_workers` -- maybe that's for additional isolation from the GC threads -- let's not proliferate it here. `populate_table` also caps the worker count at `max_workers`, which answers one of your questions)
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1532905321
More information about the hotspot-gc-dev
mailing list