RFR: 8371649: ZGC: AArch64: redundant OrderAccess::fence in ZBarrierSetAssembler::patch_barrier_relocation

Thu Nov 13 11:59:55 UTC 2025

On Wed, 12 Nov 2025 22:22:29 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> Hi Erik (@fisk),
>> Could you also please take a look, just in case the fence was intentionally put there?
>
>> Hi Erik (@fisk),
>> 
>> Could you also please take a look, just in case the fence was intentionally put there?
> 
> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward.
> 
> It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming.
> 
> If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? This is an area of the code where I don't want to take chances and slip unless we feel a high level of confidence.

@fisk:
> The way I look at it, the fence was there for hardware that is unsophisticated enough to require manual cache flushing instead of having cache coherency that understands instruction edits, and at the same time has unsophisticated enough fences that are not speculated across such that the buffered store hits the cache before invalidating the cache, and not after, which would be awkward.

Such hardware would violate Arm ARM:
- B2.3 Ordering requirements defined by the formal concurrency model: Same-Cache-Line-ordered-before.
- B2.7.4.2 Synchronization and coherency issues between data and instruction accesses.

Yes, it might be a bug in hardware. Neoverse-N1 errata 1542419 is a good example.
In such a case, this bug should be handled inside `ICache`, if possible. If not possible, there should be something explaining why it's not in `ICache`.

> It is certainly possible that in practice the cache invalidation facilities also do the right level of fencing. So this is mostly just defensive programming.

The issue is that we don't have this fence at other places, where `ICache::invalidate` are used. A reasonable question would be: why don't we use it there?

> If I flip the question around - how confident do you feel on a scale from 1 to 10 that the cache invalidation mechanism guarantees across all implementations, that the preceding store is flushed out to the caches before the cache is flushed? 

10, if we assume correct AArch64 implementation and correct `ICache::invalidate`.

@theRealAph 
> On the other hand, the cost of OrderAccess::fence is small in comparison with `ICache::invalidate_word`, so there's a question about why we're bothering to remove it.

This change is not about performance. It's about logical inconsistency: not using this everywhere, absence of history and contradiction to Arm ARM.

Also, an assumption of a needed fence complicates a fix of [JDK-8370947](https://bugs.openjdk.org/browse/JDK-8370947). See `ICacheInvalidationContext::fence` in Alex's solution:  https://github.com/openjdk/jdk/compare/master...xmas92:jdk:deferred_icache_invalidation

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28244#issuecomment-3527471862