RFR: 8322484: 22-b26 Regression in J2dBench-bimg_misc-G1 (and more) on Windows-x64 and macOS-x64
Kim Barrett
kbarrett at openjdk.org
Thu Jan 25 15:17:37 UTC 2024
On Wed, 24 Jan 2024 12:38:09 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
> Hi all,
>
> please review this improvement to managing region pin counts in g1.
>
> Some applications do millions of `Get/ReleasePrimitiveArrayCritical` operations per second, in particular some of the j2dbench benchmarks (e.g. vimg_copyarea 10M/s), on some platforms. Every pin/unpin results in an atomic operation that is the cause for these slowdown.
>
> Only Windows seems to be significantly affected.
>
> This suggested change implements a per-region thread local cache storing the current pin/unpin refcount difference, writing it back only when a thread pins/unpin an object in a different region.
>
> For these benchmarks this often reduces the amount of atomic operations to none, or a few handful; the worst improvement I have seen is that effective atomic operations were reduced to 1/10th. Overall all the j2dbench benchmark scores improve.
>
> There is a remaining issue with the `vimg_shapes_gradient` J2dbench subbenchmark: comparing the original results (before integration of region pinning) with latest jdk23 results, there is a regression of about 5%; this is caused by the backout of a bad compiler change (https://bugs.openjdk.org/browse/JDK-8322985). This will be fixed by its redo CR https://bugs.openjdk.org/browse/JDK-8323116.
>
> Testing: tier1-3, j2dbench, dacapo, specj*, renaissance benchmarks
>
> Thanks,
> Thomas
Changes requested by kbarrett (Reviewer).
src/hotspot/share/gc/g1/g1RegionPinCache.hpp line 31:
> 29: #include "utilities/globalDefinitions.hpp"
> 30:
> 31: // Holds the pinned object count increment for the given region for a Java thread.
s/increment/adjustment/ ? Also perhaps s/the pinned/the pending pinned/.
src/hotspot/share/gc/g1/g1RegionPinCache.hpp line 34:
> 32: // I.e. the _count value may actually be negative temporarily if pinning operations
> 33: // were interleaved between two regions.
> 34: class G1RegionPinCache : public StackObj {
A bit more discussion of the rationale would probably make this easier to understand. That is, this avoids
updating the global pinned count for a region (an atomic operation) in the very common case that a thread
increments and then decrements pinning for an object (so its containing region) without any other intervening
pinning operations or garbage collections.
src/hotspot/share/gc/g1/g1RegionPinCache.inline.hpp line 42:
> 40: inline void G1RegionPinCache::dec_count(uint region_idx) {
> 41: if (region_idx == _region_idx) {
> 42: --_count;
I think _count should never underflow here. Consider adding an assertion.
-------------
PR Review: https://git.openjdk.org/jdk/pull/17552#pullrequestreview-1843949540
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466514630
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466523130
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466512776
More information about the hotspot-gc-dev
mailing list