RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2]
Aleksey Shipilev
shade at openjdk.java.net
Thu Oct 28 08:58:48 UTC 2021
> `Unsafe.storeStoreFence` currently delegates to the stronger `Unsafe.storeFence`. We can teach the compilers to map it directly to the already existing rules that handle `MemBarStoreStore`. As with the explicit `LoadFence`/`StoreFence`, we introduce a special node to differentiate explicit fences from implicit store-store barriers. `storeStoreFence` is usually used to simulate safe `final`-field-like constructions in special JDK classes, such as `ConstantCallSite` and friends.
>
> Motivational performance difference on benchmarks from JDK-8276054 on ARM32:
>
>
> Benchmark                      Mode  Cnt   Score   Error  Units
> Multiple.plain                 avgt    3   2.669 ± 0.004  ns/op
> Multiple.release               avgt    3  16.688 ± 0.057  ns/op
> Multiple.storeStore            avgt    3  14.021 ± 0.144  ns/op // Better
>
> MultipleWithLoads.plain        avgt    3   4.672 ± 0.053  ns/op
> MultipleWithLoads.release      avgt    3  16.689 ± 0.044  ns/op
> MultipleWithLoads.storeStore   avgt    3  14.012 ± 0.010  ns/op // Better
>
> MultipleWithStores.plain       avgt    3  14.687 ± 0.009  ns/op
> MultipleWithStores.release     avgt    3  45.393 ± 0.192  ns/op
> MultipleWithStores.storeStore  avgt    3  38.048 ± 0.033  ns/op // Better
>
> Publishing.plain               avgt    3  27.079 ± 0.201  ns/op
> Publishing.release             avgt    3  27.088 ± 0.241  ns/op
> Publishing.storeStore          avgt    3  27.009 ± 0.259  ns/op // Within error, hidden by allocation
>
> Single.plain                   avgt    3   2.670 ± 0.002  ns/op
> Single.releaseFence            avgt    3   6.675 ± 0.001  ns/op
> Single.storeStoreFence         avgt    3   8.012 ± 0.027  ns/op // Worse, seems to be ARM32 implementation artifact
>
>
> As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations:
>
>
> Benchmark                      Mode  Cnt   Score   Error  Units
>
> Multiple.plain                 avgt    3   0.406 ± 0.002  ns/op
> Multiple.release               avgt    3   0.409 ± 0.018  ns/op
> Multiple.storeStore            avgt    3   0.406 ± 0.001  ns/op
>
> MultipleWithLoads.plain        avgt    3   4.328 ± 0.006  ns/op
> MultipleWithLoads.release      avgt    3   4.600 ± 0.014  ns/op
> MultipleWithLoads.storeStore   avgt    3   4.602 ± 0.006  ns/op
>
> MultipleWithStores.plain       avgt    3   0.812 ± 0.001  ns/op
> MultipleWithStores.release     avgt    3   0.812 ± 0.002  ns/op
> MultipleWithStores.storeStore  avgt    3   0.812 ± 0.002  ns/op
>
> Publishing.plain               avgt    3   6.370 ± 0.059  ns/op
> Publishing.release             avgt    3   6.358 ± 0.436  ns/op
> Publishing.storeStore          avgt    3   6.367 ± 0.054  ns/op
>
> Single.plain                   avgt    3   0.407 ± 0.039  ns/op
> Single.releaseFence            avgt    3   0.406 ± 0.001  ns/op
> Single.storeStoreFence         avgt    3   0.406 ± 0.001  ns/op
>
>
> Additional testing:
> - [x] Linux x86_64 fastdebug `tier1`
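
For readers unfamiliar with the pattern mentioned in the description, here is a minimal sketch of a `final`-field-like construction guarded by a store-store fence. It uses the public `VarHandle.storeStoreFence()` rather than the internal `Unsafe.storeStoreFence`, which application code normally cannot reach. The `Box` class, the `shared` field, and `publishAndRead` are hypothetical names chosen for illustration and are not code from this patch.

```java
import java.lang.invoke.VarHandle;

public class StoreStoreFenceSketch {
    // Hypothetical holder: 'value' is deliberately NOT final, mimicking a class
    // that cannot rely on the language-level final-field guarantees.
    static final class Box {
        private int value;

        Box(int value) {
            this.value = value;
            // Keeps the store above from being reordered with later stores,
            // in particular the store that publishes this object's reference.
            VarHandle.storeStoreFence();
        }

        int value() {
            return value;
        }
    }

    // Plain (non-volatile) publication slot, assumed for illustration only.
    static Box shared;

    static int publishAndRead(int v) {
        shared = new Box(v);   // publishing store, ordered after the field store
        return shared.value();
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead(42));
    }
}
```

The sketch is single-threaded and only shows where the fence sits relative to the field store and the publishing store. It does not demonstrate the cross-thread visibility guarantee itself.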
Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
Fix the comment to match JDK-8276096
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/6136/files
- new: https://git.openjdk.java.net/jdk/pull/6136/files/5277aa2a..1f24d71c
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6136&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6136&range=00-01
Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/6136.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6136/head:pull/6136
PR: https://git.openjdk.java.net/jdk/pull/6136
More information about the core-libs-dev mailing list