RFR: 8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance
Evgeny Astigeevich
eastigeevich at openjdk.org
Fri Nov 21 23:02:07 UTC 2025
On Thu, 20 Nov 2025 21:23:16 GMT, Dean Long <dlong at openjdk.org> wrote:
> It seems a little disruptive to have to pass `defer_icache_invalidation` around so much. What about attaching this information to the Thread or using a THREAD_LOCAL?
I switched to a THREAD_LOCAL. Initially it regressed `fullGC` compared to the version with the parameter:
- Parameter:
Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   88.865 ± 19.299  ms/op
GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  146.184 ± 11.531  ms/op
GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  186.429 ± 16.257  ms/op
GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  262.933 ± 13.071  ms/op
- THREAD_LOCAL:
Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   93.899 ± 14.870  ms/op
GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  152.872 ± 13.566  ms/op
GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  194.425 ± 37.851  ms/op
GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  271.826 ± 47.908  ms/op
I found that `ZBarrierSetAssembler::patch_barrier_relocation` is only used when icache invalidation is deferred, so I replaced the check of the thread-local value there with a check of `NeoverseN1Errata1542419`. This restored the performance:
Benchmark                     (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                     0           5000  avgt    3   84.919 ± 31.411  ms/op
GCPatchingNmethodCost.fullGC                     2           5000  avgt    3  141.862 ±  7.026  ms/op
GCPatchingNmethodCost.fullGC                     4           5000  avgt    3  184.921 ± 46.592  ms/op
GCPatchingNmethodCost.fullGC                     8           5000  avgt    3  263.897 ± 48.271  ms/op
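For readers who have not looked at the patch, here is a rough standalone sketch of the two kinds of check being compared. It is not the HotSpot code: `g_erratum_workaround_enabled` only stands in for the `NeoverseN1Errata1542419` flag, and `t_defer_icache_invalidation`, `DeferredICacheInvalidationScope`, both `patch_*` functions and the counters are hypothetical names made up for illustration.

```cpp
// Hypothetical stand-in for the NeoverseN1Errata1542419 VM flag:
// a process-wide value that is set once and then only read.
static bool g_erratum_workaround_enabled = true;

// Hypothetical per-thread state that replaces the
// defer_icache_invalidation parameter.
static thread_local bool t_defer_icache_invalidation = false;

// Counters used only to make the behaviour observable in this sketch.
static int g_immediate_invalidations = 0;
static int g_deferred_patches = 0;

// RAII helper (hypothetical): turns deferral on for the current thread
// and restores the previous value when the scope ends.
class DeferredICacheInvalidationScope {
  bool _saved;
 public:
  DeferredICacheInvalidationScope() : _saved(t_defer_icache_invalidation) {
    t_defer_icache_invalidation = true;
  }
  ~DeferredICacheInvalidationScope() {
    t_defer_icache_invalidation = _saved;
  }
  DeferredICacheInvalidationScope(const DeferredICacheInvalidationScope&) = delete;
  DeferredICacheInvalidationScope& operator=(const DeferredICacheInvalidationScope&) = delete;
};

// Variant A: every patch reads the thread-local value.
void patch_checking_thread_local() {
  if (t_defer_icache_invalidation) {
    ++g_deferred_patches;         // invalidation is batched for later
  } else {
    ++g_immediate_invalidations;  // invalidate right after patching
  }
}

// Variant B: every patch reads the process-wide flag instead. This is
// only equivalent when the function is reached exclusively from code
// that runs with deferral enabled.
void patch_checking_global_flag() {
  if (g_erratum_workaround_enabled) {
    ++g_deferred_patches;
  } else {
    ++g_immediate_invalidations;
  }
}
```

In this sketch a caller would wrap the patching loop in a `DeferredICacheInvalidationScope`. Variant B avoids a thread-local read on every patch, at the cost of relying on the call-site invariant described above.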
It might be that accesses to THREAD_LOCAL on Neoverse N1 are expensive.
Should I try attaching info to Thread?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28328#issuecomment-3564915607