[foreign-memaccess+abi] RFR: Prevent maxAlign virtual calls for polluted accesses [v2]
Radoslaw Smogura
duke at openjdk.org
Tue Aug 9 20:01:12 UTC 2022
On Mon, 8 Aug 2022 02:45:07 GMT, Radoslaw Smogura <duke at openjdk.org> wrote:
>> In the case of polluted accesses (when different kinds of segments are accessed
>> from the same code), `maxAlign()` can become a virtual call, which prevents
>> effective inlining and loop optimizations.
>>
>> This patch moves `maxAlign` to a field of `AbstractMemorySegmentImpl` and makes the
>> method final. The alignment value is passed as a constructor argument.
>>
>> _Note: This patch can cause slightly higher memory usage, as each memory segment will carry a `maxAlign` value. This can be optimized by using a smaller container for the value, i.e. `byte` or `short`._
>>
>> After
>>
>> Benchmark                                  (size)  Mode  Cnt       Score     Error  Units
>> MixedAccessBenchmarks.directCopy          1048576  avgt   10   16410.733 ±  79.901  ns/op
>> MixedAccessBenchmarks.pollutedAccessCopy  1048576  avgt   10  168497.502 ± 632.578  ns/op
>>
>>
>> Before
>>
>> Benchmark                                  (size)  Mode  Cnt        Score        Error  Units
>> MixedAccessBenchmarks.directCopy          1048576  avgt   10    18336.054 ±     63.133  ns/op
>> MixedAccessBenchmarks.pollutedAccessCopy  1048576  avgt   10  2069032.456 ± 167512.633  ns/op
>
> Radoslaw Smogura has updated the pull request incrementally with one additional commit since the last revision:
>
> The previous version caused a performance drop in the `LoopOverNonConstantHeap` benchmark; this commit fixes that and keeps the same results for the tests with vectors.
I ran the tests with the flag `-XX:TypeProfileLevel=2`; my understanding is that this flag should enable all profiling. The results for the polluted case [1] are similar.
Here are the results for the polluted case:
Iteration 5: 212232.300 ns/op
Iteration 6: 11157.340 ns/op
It does not matter whether the flag is turned on or off, or whether the run is with or without the PR changes.
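For anyone following the quoted description, here is a rough sketch of the shape of the change. It is only an illustration under assumptions: the class names and alignment values below are hypothetical placeholders, not the actual `AbstractMemorySegmentImpl` code. The first hierarchy overrides `maxAlign()` per segment kind; the second stores the value in a final field set by the constructor and exposes it through a final method, so the call has a single target regardless of which kinds of segments reach it.

```java
// Illustrative sketch only: class names and alignment values are hypothetical
// placeholders, not the actual panama-foreign implementation.
public class MaxAlignSketch {

    // Before: every segment kind overrides maxAlign(). A call site that sees
    // several kinds (a "polluted" site) may be compiled as a virtual call.
    static abstract class VirtualSegment {
        abstract long maxAlign();
    }
    static final class VirtualHeapSegment extends VirtualSegment {
        @Override long maxAlign() { return 8; }      // placeholder value
    }
    static final class VirtualNativeSegment extends VirtualSegment {
        @Override long maxAlign() { return 16; }     // placeholder value
    }
    static final class VirtualMappedSegment extends VirtualSegment {
        @Override long maxAlign() { return 4096; }   // placeholder value
    }

    // After: the value lives in a final field passed through the constructor,
    // and the accessor is final, so there is exactly one implementation.
    static abstract class FieldSegment {
        private final long maxAlign;
        FieldSegment(long maxAlign) { this.maxAlign = maxAlign; }
        final long maxAlign() { return maxAlign; }
    }
    static final class FieldHeapSegment extends FieldSegment {
        FieldHeapSegment() { super(8); }
    }
    static final class FieldNativeSegment extends FieldSegment {
        FieldNativeSegment() { super(16); }
    }
    static final class FieldMappedSegment extends FieldSegment {
        FieldMappedSegment() { super(4096); }
    }

    // One call site that receives all three kinds (the polluted pattern).
    public static long sumVirtual() {
        VirtualSegment[] segments = {
            new VirtualHeapSegment(), new VirtualNativeSegment(), new VirtualMappedSegment()
        };
        long sum = 0;
        for (VirtualSegment s : segments) {
            sum += s.maxAlign();   // three possible targets at this call site
        }
        return sum;
    }

    // Same loop against the field-backed variant: a single final target.
    public static long sumField() {
        FieldSegment[] segments = {
            new FieldHeapSegment(), new FieldNativeSegment(), new FieldMappedSegment()
        };
        long sum = 0;
        for (FieldSegment s : segments) {
            sum += s.maxAlign();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("virtual: " + sumVirtual());
        System.out.println("field:   " + sumField());
    }
}
```

Both variants compute the same values; the difference is only in what the JIT can assume at the call site. The cost is the extra field per segment mentioned in the note of the PR description.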
[1] https://gist.github.com/rsmogura/42bcb22ca55a730c0c46a4035f179422
-------------
PR: https://git.openjdk.org/panama-foreign/pull/700