[foreign-memaccess+abi] RFR: Split foreign vector load and store by null or not null base
Radoslaw Smogura
duke at openjdk.org
Mon Aug 22 20:16:47 UTC 2022
On Mon, 22 Aug 2022 07:31:57 GMT, Radoslaw Smogura <duke at openjdk.org> wrote:
> Split the store / load operations by checking whether the base is
> null or not null.
>
> When this happens, the VM no longer perceives the base in Unsafe as a
> mixed access, and it does not insert barriers.
>
> The test results give the expected values, with polluted access costing roughly a 2x multiple of normal access.
>
> After
>
> Benchmark                                    (size)  Mode  Cnt    Score     Error  Units
> MemorySegmentVectorAccess.arrayCopy            1024  avgt   10    7.437 ±   0.195  ns/op
> MemorySegmentVectorAccess.directSegments       1024  avgt   10   15.593 ±   0.371  ns/op
> MemorySegmentVectorAccess.heapSegments         1024  avgt   10   16.997 ±   0.118  ns/op
> MemorySegmentVectorAccess.pollutedSegments2    1024  avgt   10   58.673 ± 105.783  ns/op
> MemorySegmentVectorAccess.pollutedSegments3    1024  avgt   10   67.216 ±  16.157  ns/op
> MemorySegmentVectorAccess.pollutedSegments4    1024  avgt   10  122.567 ± 263.950  ns/op
> MemorySegmentVectorAccess.pollutedSegments5    1024  avgt   10  114.725 ± 209.183  ns/op
>
>
> Before
>
> Benchmark                                    (size)  Mode  Cnt    Score     Error  Units
> MemorySegmentVectorAccess.arrayCopy            1024  avgt   10    8.547 ±   0.115  ns/op
> MemorySegmentVectorAccess.directSegments       1024  avgt   10   15.536 ±   0.082  ns/op
> MemorySegmentVectorAccess.heapSegments         1024  avgt   10   15.818 ±   0.101  ns/op
> MemorySegmentVectorAccess.pollutedSegments2    1024  avgt   10  146.380 ±   1.127  ns/op
> MemorySegmentVectorAccess.pollutedSegments3    1024  avgt   10  290.784 ±   7.274  ns/op
> MemorySegmentVectorAccess.pollutedSegments4    1024  avgt   10  297.187 ±   5.096  ns/op
> MemorySegmentVectorAccess.pollutedSegments5    1024  avgt   10  310.166 ±   9.310  ns/op
>
>
> Additionally, with profiling of the `load` and `store` method
> arguments as described in [1]
>
> Benchmark                                    (size)  Mode  Cnt    Score     Error  Units
> MemorySegmentVectorAccess.arrayCopy            1024  avgt   10    7.480 ±   0.169  ns/op
> MemorySegmentVectorAccess.directSegments       1024  avgt   10   15.497 ±   0.062  ns/op
> MemorySegmentVectorAccess.heapSegments         1024  avgt   10   16.829 ±   0.132  ns/op
> MemorySegmentVectorAccess.pollutedSegments2    1024  avgt   10  145.436 ±   1.081  ns/op
> MemorySegmentVectorAccess.pollutedSegments3    1024  avgt   10  291.081 ±   2.297  ns/op
> MemorySegmentVectorAccess.pollutedSegments4    1024  avgt   10  305.388 ±   7.518  ns/op
> MemorySegmentVectorAccess.pollutedSegments5    1024  avgt   10  303.931 ±   3.412  ns/op
>
>
> [1] https://github.com/openjdk/panama-foreign/pull/700
Yes, the third result set compares against https://github.com/openjdk/panama-foreign/pull/700.
I'm not sure how to interpret these results, or how to explain why avoiding barriers gives better results than profiling arguments. Maybe it's because the barrier prevents some other optimizations.
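For readers following along, the shape of the split described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions: the class and method names are hypothetical and not taken from the PR, and a direct ByteBuffer stands in for the null-base (off-heap) Unsafe access so that the example runs without internal APIs. The point is only that each helper is reached with either a null or a non-null base, never both.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch; class and method names are illustrative, not taken from the PR.
final class SplitByBase {
    // Stand-in for off-heap memory. Real code would pass a null base and an
    // absolute address to Unsafe instead of indexing a direct buffer.
    static final ByteBuffer NATIVE =
            ByteBuffer.allocateDirect(64).order(ByteOrder.nativeOrder());

    // Mixed entry point: base is null for off-heap, a byte[] for on-heap.
    static long load(byte[] base, int offset) {
        if (base == null) {
            return loadOffHeap(offset);   // only ever reached with a null base
        }
        return loadOnHeap(base, offset);  // only ever reached with a non-null base
    }

    static void store(byte[] base, int offset, long value) {
        if (base == null) {
            storeOffHeap(offset, value);
        } else {
            storeOnHeap(base, offset, value);
        }
    }

    private static long loadOffHeap(int offset) {
        return NATIVE.getLong(offset);
    }

    private static long loadOnHeap(byte[] base, int offset) {
        return ByteBuffer.wrap(base).order(ByteOrder.nativeOrder()).getLong(offset);
    }

    private static void storeOffHeap(int offset, long value) {
        NATIVE.putLong(offset, value);
    }

    private static void storeOnHeap(byte[] base, int offset, long value) {
        ByteBuffer.wrap(base).order(ByteOrder.nativeOrder()).putLong(offset, value);
    }

    public static void main(String[] args) {
        store(null, 0, 42L);          // off-heap path
        byte[] heap = new byte[16];
        store(heap, 8, 7L);           // on-heap path
        System.out.println(load(null, 0) + " " + load(heap, 8));
    }
}
```

Whether the JIT then treats the two paths differently depends on the actual Unsafe intrinsics; the sketch shows only the dispatch, not the barrier behavior.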
-------------
PR: https://git.openjdk.org/panama-foreign/pull/711