RFR: 8322484: 22-b26 Regression in J2dBench-bimg_misc-G1 (and more) on Windows-x64 and macOS-x64
Albert Mingkun Yang
ayang at openjdk.org
Thu Jan 25 14:27:33 UTC 2024
On Wed, 24 Jan 2024 12:38:09 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
> Hi all,
>
> please review this improvement to managing region pin counts in G1.
>
> Some applications perform millions of `Get/ReleasePrimitiveArrayCritical` operations per second on some platforms, in particular some of the j2dbench benchmarks (e.g. vimg_copyarea at 10M/s). Every pin/unpin results in an atomic operation, and these atomic operations are the cause of the slowdowns.
>
> Only Windows seems to be significantly affected.
>
> This suggested change implements a per-region thread-local cache that stores the current pin/unpin refcount difference and writes it back only when a thread pins/unpins an object in a different region.
>
> For these benchmarks this often reduces the number of atomic operations to none or a handful; in the worst case I have seen, the effective atomic operations were still reduced to 1/10th. Overall, all the j2dbench benchmark scores improve.
>
> There is a remaining issue with the `vimg_shapes_gradient` J2dbench subbenchmark: comparing the original results (before integration of region pinning) with latest jdk23 results, there is a regression of about 5%; this is caused by the backout of a bad compiler change (https://bugs.openjdk.org/browse/JDK-8322985). This will be fixed by its redo CR https://bugs.openjdk.org/browse/JDK-8323116.
>
> Testing: tier1-3, j2dbench, dacapo, specj*, renaissance benchmarks
>
> Thanks,
> Thomas
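The caching scheme described in the quoted message can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `RegionPinCounts`, `PinCache`, and the `atomic_ops` counter are hypothetical names invented for illustration, and the sketch only assumes what the description states, namely that a thread accumulates the pin/unpin difference for one region and writes it back when it touches a different region or when the cache is flushed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shared, per-region pin counts.
struct RegionPinCounts {
  std::vector<std::atomic<std::intptr_t>> counts;
  std::size_t atomic_ops = 0;  // instrumentation for this sketch only

  explicit RegionPinCounts(std::size_t num_regions) : counts(num_regions) {
    for (auto& c : counts) c.store(0, std::memory_order_relaxed);
  }

  // The expensive path: one atomic read-modify-write per call.
  void add(std::uint32_t region, std::intptr_t delta) {
    assert(region < counts.size());
    counts[region].fetch_add(delta, std::memory_order_relaxed);
    ++atomic_ops;
  }

  std::intptr_t load(std::uint32_t region) const {
    return counts[region].load(std::memory_order_relaxed);
  }
};

// Hypothetical per-thread cache: remembers one region and the net
// pin/unpin difference accumulated for it since the last write-back.
class PinCache {
public:
  static constexpr std::uint32_t NoRegion = 0xFFFFFFFFu;

  explicit PinCache(RegionPinCounts& shared) : _shared(shared) {}
  ~PinCache() { flush(); }  // write back whatever is still cached

  void pin(std::uint32_t region)   { adjust(region, +1); }
  void unpin(std::uint32_t region) { adjust(region, -1); }

  // Publish the cached difference, if any, with a single atomic add.
  void flush() {
    if (_delta != 0) {
      _shared.add(_region, _delta);
      _delta = 0;
    }
  }

  std::intptr_t count() const { return _delta; }

private:
  void adjust(std::uint32_t region, std::intptr_t delta) {
    if (region != _region) {
      flush();           // only a region switch reaches the shared counter
      _region = region;
    }
    _delta += delta;     // plain, non-atomic update on the owning thread
  }

  RegionPinCounts& _shared;
  std::uint32_t _region = NoRegion;
  std::intptr_t _delta = 0;
};
```

In this sketch a loop of balanced pin/unpin calls on the same region never touches the shared counter at all, which matches the "reduced to none" case mentioned above. A cache like this has to be flushed before anything reads the shared counts, which is presumably why the PR asserts that the cached count is zero at collection time.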
src/hotspot/share/gc/g1/g1CollectedHeap.inline.hpp line 271:
> 269: assert(obj->is_typeArray(), "must be typeArray");
> 270: HeapRegion* r = heap_region_containing(obj);
> 271: uint obj_region_idx = r->hrm_index();
The two can probably be merged to avoid creating `r`.
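Roughly what I have in mind, shown on stand-in types so that it compiles on its own. `sketch::HeapRegion`, `sketch::Heap`, the address-based lookup, and both `region_index_*` functions are hypothetical and only mirror the shape of the quoted lines, not the actual HotSpot declarations.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical stand-in for a heap region that knows its own index.
struct HeapRegion {
  std::uint32_t index;
  std::uint32_t hrm_index() const { return index; }
};

// Hypothetical stand-in for the heap: four fixed-size regions.
struct Heap {
  static constexpr std::uintptr_t RegionBytes = 1024;
  HeapRegion regions[4] = {{0}, {1}, {2}, {3}};

  HeapRegion* heap_region_containing(std::uintptr_t addr) {
    assert(addr / RegionBytes < 4);
    return &regions[addr / RegionBytes];
  }

  // Shape of the quoted lines: a named temporary used exactly once.
  std::uint32_t region_index_separate(std::uintptr_t addr) {
    HeapRegion* r = heap_region_containing(addr);
    std::uint32_t obj_region_idx = r->hrm_index();
    return obj_region_idx;
  }

  // Merged form: chain the calls and drop the temporary.
  std::uint32_t region_index_merged(std::uintptr_t addr) {
    return heap_region_containing(addr)->hrm_index();
  }
};

}  // namespace sketch
```

Both forms return the same index; the merged one just avoids introducing `r` when it is not used again.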
src/hotspot/share/gc/g1/g1RegionPinCache.cpp line 33:
> 31: G1RegionPinCache::~G1RegionPinCache() {
> 32: flush();
> 33: }
This file is almost empty; I wonder if this method can be moved to hpp or inline.hpp.
src/hotspot/share/gc/g1/g1YoungCollector.cpp line 544:
> 542: assert(G1ThreadLocalData::pin_count_cache(thread).count() == 0, "must be flushed");
> 543: }
> 544: #endif
It's not obvious why this assert belongs in this context. I wonder if it would be cleaner (or more natural) to move it to `JavaThreadRetireTLABAndFlushLogs`, where we flush the cache.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466451482
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466448927
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466450541