RFR: 8322484: 22-b26 Regression in J2dBench-bimg_misc-G1 (and more) on Windows-x64 and macOS-x64

Albert Mingkun Yang ayang at openjdk.org
Thu Jan 25 14:27:33 UTC 2024


On Wed, 24 Jan 2024 12:38:09 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

> Hi all,
> 
>   please review this improvement to managing region pin counts in G1.
> 
> Some applications do millions of `Get/ReleasePrimitiveArrayCritical` operations per second, in particular some of the j2dbench benchmarks (e.g. vimg_copyarea 10M/s), on some platforms. Every pin/unpin results in an atomic operation, which is the cause of these slowdowns.
> 
> Only Windows seems to be significantly affected.
> 
> This suggested change implements a per-region thread-local cache storing the current pin/unpin refcount difference, writing it back only when a thread pins/unpins an object in a different region.
> 
> For these benchmarks this often reduces the number of atomic operations to none, or a handful; in the worst case I have seen, the effective number of atomic operations was reduced to 1/10th. Overall, all the j2dbench benchmark scores improve.
> 
> There is a remaining issue with the `vimg_shapes_gradient` J2dbench subbenchmark: comparing the original results (before integration of region pinning) with latest jdk23 results, there is a regression of about 5%; this is caused by the backout of a bad compiler change (https://bugs.openjdk.org/browse/JDK-8322985). This will be fixed by its redo CR https://bugs.openjdk.org/browse/JDK-8323116.
> 
> Testing: tier1-3, j2dbench, dacapo, specj*, renaissance benchmarks
> 
> Thanks,
>   Thomas
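
For illustration, the caching scheme described above can be sketched roughly as follows. This is not the code from the PR: `RegionPinTable`, `ThreadPinCache`, `kNoRegion` and `demo_atomic_updates` are hypothetical names, and the sketch assumes the cache tracks a single region at a time and writes its accumulated difference back with one atomic add when the thread moves to another region or is flushed.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shared per-region pin counts.
struct RegionPinTable {
  std::vector<std::atomic<std::ptrdiff_t>> counts;
  std::atomic<std::size_t> atomic_updates{0};  // instrumentation for the demo only

  explicit RegionPinTable(std::size_t num_regions) : counts(num_regions) {
    for (auto& c : counts) c.store(0);
  }

  // The expensive operation the cache tries to avoid.
  void add(std::uint32_t region, std::ptrdiff_t delta) {
    counts[region].fetch_add(delta);
    atomic_updates.fetch_add(1);
  }
};

constexpr std::uint32_t kNoRegion = UINT32_MAX;  // assumed "empty cache" marker

// Hypothetical per-thread cache: accumulates the pin/unpin difference
// for one region and writes it back lazily.
class ThreadPinCache {
  RegionPinTable& table_;
  std::uint32_t region_ = kNoRegion;
  std::ptrdiff_t delta_ = 0;

  void adjust(std::uint32_t region, std::ptrdiff_t d) {
    if (region != region_) {  // different region: write back the old one first
      flush();
      region_ = region;
    }
    delta_ += d;              // same region: plain, non-atomic update
  }

public:
  explicit ThreadPinCache(RegionPinTable& table) : table_(table) {}
  ~ThreadPinCache() { flush(); }

  void pin(std::uint32_t region)   { adjust(region, +1); }
  void unpin(std::uint32_t region) { adjust(region, -1); }
  std::ptrdiff_t count() const { return delta_; }

  void flush() {
    if (delta_ != 0) {        // balanced pin/unpin pairs cost nothing
      table_.add(region_, delta_);
    }
    delta_ = 0;
    region_ = kNoRegion;
  }
};

// 1000 balanced pin/unpin pairs in region 1, then one pin in region 2.
// Returns the number of atomic updates actually performed.
std::size_t demo_atomic_updates() {
  RegionPinTable table(4);
  ThreadPinCache cache(table);
  for (int i = 0; i < 1000; i++) {
    cache.pin(1);
    cache.unpin(1);
  }
  cache.pin(2);
  cache.flush();
  return table.atomic_updates.load();
}
```

In this toy setup the 2001 pin/unpin calls result in a single atomic update, because the balanced pairs in region 1 cancel out before anything is written back.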

src/hotspot/share/gc/g1/g1CollectedHeap.inline.hpp line 271:

> 269:   assert(obj->is_typeArray(), "must be typeArray");
> 270:   HeapRegion* r = heap_region_containing(obj);
> 271:   uint obj_region_idx = r->hrm_index();

The two can probably be merged to avoid creating `r`.
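
A minimal sketch of what merging the two statements could look like, using toy stand-ins rather than the real HotSpot classes (`ToyHeap`, its fixed region size and `region_index_of` are assumptions made only for this example):

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in; the real HeapRegion is far richer.
struct HeapRegion {
  std::uint32_t index;
  std::uint32_t hrm_index() const { return index; }
};

// Toy stand-in for the heap, with an assumed fixed region size.
struct ToyHeap {
  static constexpr std::size_t kRegionSize = 1024;
  HeapRegion regions[4] = {{0}, {1}, {2}, {3}};

  HeapRegion* heap_region_containing(std::size_t addr) {
    return &regions[addr / kRegionSize];
  }
};

std::uint32_t region_index_of(ToyHeap& heap, std::size_t addr) {
  // Chained call: no named HeapRegion* temporary is needed.
  return heap.heap_region_containing(addr)->hrm_index();
}
```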

src/hotspot/share/gc/g1/g1RegionPinCache.cpp line 33:

> 31: G1RegionPinCache::~G1RegionPinCache() {
> 32:   flush();
> 33: }

This file is almost empty; I wonder if this method can be moved to hpp or inline.hpp.

src/hotspot/share/gc/g1/g1YoungCollector.cpp line 544:

> 542:     assert(G1ThreadLocalData::pin_count_cache(thread).count() == 0, "must be flushed");
> 543:   }
> 544: #endif

It's surprising to see this assert in this context. I wonder if it would be cleaner (or more natural) to move it to `JavaThreadRetireTLABAndFlushLogs`, where we flush the cache.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466451482
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466448927
PR Review Comment: https://git.openjdk.org/jdk/pull/17552#discussion_r1466450541


More information about the hotspot-gc-dev mailing list