RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GCs and JIT performance [v13]
Aleksey Shipilev
shade at openjdk.org
Wed Dec 3 16:14:05 UTC 2025
On Wed, 3 Dec 2025 15:42:38 GMT, Evgeny Astigeevich <eastigeevich at openjdk.org> wrote:
>> Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
>>
>> Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
>> - Disable coherent icache.
>> - Trap IC IVAU instructions.
>> - Execute:
>>   - `tlbi vae3is, xzr`
>>   - `dsb sy`
>>
>> `tlbi vae3is, xzr` invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.
>>
>> As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:
>>
>> "Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."
>>
>> This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
>>
>> Changes include:
>>
>> * Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
>> * Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
>> * Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
>> * Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
>>
>> Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)
>>
>> - Baseline
>>
>> $ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1...
>
> Evgeny Astigeevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:
>
> - Fix linux-cross-compile build aarch64
> - Merge branch 'master' into JDK-8370947
> - Remove trailing whitespaces
> - Add support of deferred icache invalidation to other GCs and JIT
> - Add UseDeferredICacheInvalidation to defer invalidation on CPU with hardware cache coherence
> - Add jtreg test
> - Fix linux-cross-compile aarch64 build
> - Fix regressions for Java methods without field accesses
> - Fix code style
> - Correct ifdef; Add dsb after ic
> - ... and 9 more: https://git.openjdk.org/jdk/compare/3d54a802...4b04496f
Interesting work! I was able to look through it very briefly:
src/hotspot/cpu/aarch64/globals_aarch64.hpp line 133:
> 131: "Enable workaround for Neoverse N1 erratum 1542419") \
> 132: product(bool, UseDeferredICacheInvalidation, false, DIAGNOSTIC, \
> 133: "Defer multiple ICache invalidation to single invalidation") \
Since the `ICacheInvalidationContext` is in shared code, and I suppose x86_64 would also benefit from this (at least eventually), this sounds like a `globals.hpp` option.
src/hotspot/share/asm/codeBuffer.cpp line 371:
> 369: !((oop_Relocation*)reloc)->oop_is_immediate()) {
> 370: _has_non_immediate_oops = true;
> 371: }
Honestly, this looks fragile? We can go into nmethods patching for some other reason, not for patching oops.
Also, we still might need to go and patch immediate oops? I see this:
// Instruct loadConP of x86_64.ad places oops in code that are not also
// listed in the oop section.
static bool mustIterateImmediateOopsInCode() { return true; }
Is there a substantial loss in doing icache invalidation without checking for the existence of interesting oops? Do you have an idea how many methods this filters out?
src/hotspot/share/asm/codeBuffer.cpp line 939:
> 937: // Move all the code and relocations to the new blob:
> 938: relocate_code_to(&cb);
> 939: }
Here and later, the preferred style is:
Suggestion:
  // Move all the code and relocations to the new blob:
  {
    ICacheInvalidationContext icic(ICacheInvalidation::NOT_NEEDED);
    relocate_code_to(&cb);
  }
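
To illustrate why the braces matter: the context is a scoped (RAII) object, so its lifetime decides when the batched invalidation actually happens. A minimal sketch of that idea follows. It is not the PR's implementation; `DeferredFlushScope`, `flush_icache_stub`, `g_flush_count` and `patch_in_scope` are hypothetical names made up for this example, and the "flush" only increments a counter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a real ICache flush; it only counts calls.
static std::size_t g_flush_count = 0;
static void flush_icache_stub() { ++g_flush_count; }

// Hypothetical scoped context: while one is alive on the current thread,
// patch sites record a pending invalidation instead of flushing right away.
class DeferredFlushScope {
 public:
  DeferredFlushScope() : _prev(_current) { _current = this; }

  ~DeferredFlushScope() {
    _current = _prev;
    if (_pending) {
      if (_prev != nullptr) {
        _prev->_pending = true;  // nested scope: hand off to the enclosing one
      } else {
        flush_icache_stub();     // outermost scope: one flush for the batch
      }
    }
  }

  DeferredFlushScope(const DeferredFlushScope&) = delete;
  DeferredFlushScope& operator=(const DeferredFlushScope&) = delete;

  // Called by patching code after it has modified instructions.
  static void invalidate() {
    if (_current != nullptr) {
      _current->_pending = true;  // deferred until the scope closes
    } else {
      flush_icache_stub();        // no scope active: flush immediately
    }
  }

 private:
  static thread_local DeferredFlushScope* _current;
  DeferredFlushScope* _prev;
  bool _pending = false;
};

thread_local DeferredFlushScope* DeferredFlushScope::_current = nullptr;

// Patches `sites` locations inside one scope and returns the number of
// flushes that resulted.
static std::size_t patch_in_scope(std::size_t sites) {
  const std::size_t before = g_flush_count;
  {
    DeferredFlushScope scope;
    for (std::size_t i = 0; i < sites; ++i) {
      DeferredFlushScope::invalidate();
    }
  }  // scope ends here, so the single batched flush happens here
  return g_flush_count - before;
}
```

With this shape, the explicit block makes it obvious which statements are covered by the deferral and where the one flush lands; without the braces the flush would silently move to the end of the enclosing function.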
src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp line 37:
> 35: #include "memory/universe.hpp"
> 36: #include "runtime/atomicAccess.hpp"
> 37: #include "runtime/icache.hpp"
The include is added, but there is no actual use? Is something missing, or is this a leftover include?
test/hotspot/jtreg/gc/TestDeferredICacheInvalidation.java line 28:
> 26:
> 27: /*
> 28: * @test id=ParallelGC
Usually just:
Suggestion:
* @test id=parallel
test/hotspot/jtreg/gc/TestDeferredICacheInvalidation.java line 34:
> 32: * @requires vm.debug
> 33: * @requires os.family=="linux"
> 34: * @requires os.arch=="aarch64"
I am guessing it is more future-proof to drop the Linux/AArch64 filters, and rely on the test doing the right thing, regardless of the config. I see it already skips when `UseDeferredICacheInvalidation` is off.
test/micro/org/openjdk/bench/vm/gc/GCPatchingNmethodCost.java line 184:
> 182: @Benchmark
> 183: @Warmup(iterations = 0)
> 184: @Measurement(iterations = 1)
Not sure what the intent is here. Maybe you wanted `@BenchmarkMode(Mode.SingleShotTime)` instead?
-------------
PR Review: https://git.openjdk.org/jdk/pull/28328#pullrequestreview-3535752098
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585729392
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585679778
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585704068
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585707389
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585735476
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585734553
PR Review Comment: https://git.openjdk.org/jdk/pull/28328#discussion_r2585743873