RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance

Evgeny Astigeevich eastigeevich at openjdk.org
Thu Nov 20 14:58:07 UTC 2025


Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.
 
Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:
- Disable coherent icache.
- Trap IC IVAU instructions.
- Execute:
   - `tlbi vae3is, xzr`
   - `dsb sy`
 
`tlbi vae3is, xzr` invalidates the translations for the given address in all address spaces (it is global for the address). It is not considered complete until all memory accesses that use the old, in-scope translation information have completed.
 
As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:

"Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."

This PR introduces a mechanism to defer instruction cache (ICache) invalidation on AArch64. The goal is to reduce the cost of the workaround for Arm Neoverse N1 erratum 1542419, which adds significant overhead when ICache invalidation is performed frequently. The implementation detects affected Neoverse N1 CPUs and automatically enables the mitigation for the relevant Neoverse N1 revisions.
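To illustrate what "detection of affected Neoverse N1 CPUs" can look like, here is a minimal sketch that decodes a `MIDR_EL1` value and checks for a Neoverse N1 revision earlier than r4p1. It is not the code from this PR: the names `CpuRevision`, `decode_midr` and `is_affected_by_erratum_1542419` are hypothetical, and how HotSpot actually obtains the CPU identification is not shown. The sketch assumes the architectural `MIDR_EL1` field layout, Arm's implementer code `0x41`, and the Neoverse N1 part number `0xD0C`.

```cpp
#include <cstdint>

// Hypothetical holder for the MIDR_EL1 fields used below.
struct CpuRevision {
  uint32_t implementer;  // bits [31:24]
  uint32_t variant;      // bits [23:20], the "x" in rXpY
  uint32_t part;         // bits [15:4]
  uint32_t revision;     // bits [3:0],   the "y" in rXpY
};

// Split a raw MIDR_EL1 value into its fields.
inline CpuRevision decode_midr(uint64_t midr) {
  CpuRevision r;
  r.implementer = static_cast<uint32_t>((midr >> 24) & 0xFF);
  r.variant     = static_cast<uint32_t>((midr >> 20) & 0xF);
  r.part        = static_cast<uint32_t>((midr >> 4) & 0xFFF);
  r.revision    = static_cast<uint32_t>(midr & 0xF);
  return r;
}

// True for Neoverse N1 revisions prior to r4p1 (assumption: every earlier
// revision is treated as affected, matching "prior to r4p1" in this PR).
inline bool is_affected_by_erratum_1542419(uint64_t midr) {
  const uint32_t kArmImplementer = 0x41;
  const uint32_t kNeoverseN1Part = 0xD0C;
  const CpuRevision r = decode_midr(midr);
  if (r.implementer != kArmImplementer || r.part != kNeoverseN1Part) {
    return false;
  }
  return r.variant < 4 || (r.variant == 4 && r.revision < 1);
}
```

With this sketch, a Neoverse N1 r3p1 value such as `0x413FD0C1` is reported as affected, while an r4p1 value such as `0x414FD0C1` is not.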

Changes include:

* Added a new diagnostic JVM flag `NeoverseN1Errata1542419` to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
* Introduced the `ICacheInvalidationContext` class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing the performance impact. Because the address passed to the ICache invalidation is not relevant, the nmethod's code start address is used.
* Provided a default (no-op) implementation for `ICacheInvalidationContext` on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
* Modified barrier patching and relocation logic (`ZBarrierSetAssembler`, `ZNMethod`, `RelocIterator`, and related code) to accept a `defer_icache_invalidation` parameter, allowing ICache invalidation to be deferred and later performed in bulk.
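
The batching idea behind the changes above can be modelled with a small scope-based helper. This is a simplified, standalone sketch and not the `ICacheInvalidationContext` from this PR: the class `DeferredICacheInvalidation`, the counter `g_invalidations`, and the stand-in `invalidate_icache` are hypothetical, and the real ICache flush is replaced by a counter so the effect is observable.

```cpp
#include <cstddef>

// Stand-in for the platform ICache flush; it only counts how often it runs.
static int g_invalidations = 0;
static void invalidate_icache(const void* addr, size_t len) {
  (void)addr;
  (void)len;
  ++g_invalidations;
}

// Hypothetical model of a deferring context: when defer is true, patch
// notifications are only recorded and a single invalidation is issued
// when the context goes out of scope.
class DeferredICacheInvalidation {
 public:
  DeferredICacheInvalidation(const void* code_start, bool defer)
      : code_start_(code_start), defer_(defer), pending_(false) {}

  ~DeferredICacheInvalidation() {
    if (defer_ && pending_) {
      // The address is not relevant for the deferred invalidation,
      // so the code start address is used.
      invalidate_icache(code_start_, 1);
    }
  }

  // Called after an instruction at addr has been patched.
  void patched(const void* addr, size_t len) {
    if (defer_) {
      pending_ = true;
    } else {
      invalidate_icache(addr, len);
    }
  }

 private:
  const void* code_start_;
  bool defer_;
  bool pending_;
};

// Patches patch_sites locations and returns how many invalidations ran.
int count_invalidations(bool defer, int patch_sites) {
  static unsigned char code[64];
  g_invalidations = 0;
  {
    DeferredICacheInvalidation ctx(code, defer);
    for (int i = 0; i < patch_sites; ++i) {
      ctx.patched(code + (i % 16) * 4, 4);
    }
  }  // the deferred invalidation, if any, happens here
  return g_invalidations;
}
```

In this model, patching eight sites without deferral triggers eight invalidations, while the deferred mode triggers one. On an affected CPU each invalidation is trapped, so fewer invalidations mean fewer trap-handler executions.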

Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)

- Baseline

$ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt     Score     Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3    73.937 ±  17.764  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   648.331 ±  85.773  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  1221.186 ±  72.401  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  2336.644 ± 446.816  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3    77.495 ±  11.963  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   662.447 ± 231.244  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  1217.174 ± 232.325  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  2339.458 ± 271.820  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3     9.955 ±   1.649  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   163.623 ±  42.342  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   318.399 ±  87.674  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   618.169 ± 191.474  ms/op


- Fix

$ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:+NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   88.865 ± 19.299  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3  146.184 ± 11.531  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  186.429 ± 16.257  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  262.933 ± 13.071  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   90.572 ± 14.750  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3  148.335 ± 21.456  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  190.828 ± 12.268  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  265.768 ± 46.669  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   10.219 ±  0.877  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   19.035 ±  2.699  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   26.005 ±  2.179  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   42.322 ± 85.691  ms/op

-------------

Commit messages:
 - Merge branch 'master' into JDK-8370947
 - Add deferred icache invalidation to all places; Add JMH microbench
 - 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance

Changes: https://git.openjdk.org/jdk/pull/28328/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28328&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8370947
  Stats: 380 lines in 17 files changed: 340 ins; 4 del; 36 mod
  Patch: https://git.openjdk.org/jdk/pull/28328.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328

PR: https://git.openjdk.org/jdk/pull/28328

