[foreign-memaccess+abi] RFR: Performance improvement to unchecked segment ofNativeRestricted [v4]

Radoslaw Smogura github.com+7535718+rsmogura at openjdk.java.net
Fri Jan 22 20:58:55 UTC 2021


On Wed, 20 Jan 2021 14:20:39 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> Radoslaw Smogura has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Replaced the stride access with normal VarHandle.
>>   
>>   Added a no_align benchmark to compare performance with alignment checks turned off.
>>   ```
>>   Benchmark                                         Mode  Cnt  Score   Error  Units
>>   LoopOverNonConstant.BB_get                        avgt   30  3.892 ± 0.012  ns/op
>>   LoopOverNonConstant.BB_loop                       avgt   30  0.230 ± 0.001  ms/op
>>   LoopOverNonConstant.global_segment_get            avgt   30  3.887 ± 0.008  ns/op
>>   LoopOverNonConstant.global_segment_loop           avgt   30  0.396 ± 0.002  ms/op
>>   LoopOverNonConstant.global_segment_loop_no_align  avgt   30  0.247 ± 0.001  ms/op
>>   LoopOverNonConstant.segment_get                   avgt   30  5.489 ± 0.014  ns/op
>>   LoopOverNonConstant.segment_loop                  avgt   30  0.229 ± 0.001  ms/op
>>   LoopOverNonConstant.segment_loop_readonly         avgt   30  0.236 ± 0.001  ms/op
>>   LoopOverNonConstant.segment_loop_slice            avgt   30  0.241 ± 0.001  ms/op
>>   LoopOverNonConstant.segment_loop_static           avgt   30  0.230 ± 0.001  ms/op
>>   LoopOverNonConstant.unsafe_get                    avgt   30  3.425 ± 0.006  ns/op
>>   LoopOverNonConstant.unsafe_loop                   avgt   30  0.230 ± 0.001  ms/op
>>   ```
>>   Without the `ofNativeRestricted` optimization:
>>   ```
>>   LoopOverNonConstant.global_segment_get     avgt   30  4.126 ±  0.006  ns/op
>>   LoopOverNonConstant.global_segment_loop    avgt   30  0.603 ±  0.001  ms/op
>>   ```
>
> Looks good for now - we can reassess after the hotspot improvements for long in loops start to have visible effects. Thanks!

Hi all,

I realized that my benchmarks contain a small issue. I think @mcimadamore mentioned it, but I did not understand it at the time.

The loops use a long counter instead of an int. With an int counter I've seen the loop get unrolled, and the loop aligned to 1 is as fast as the other tests.
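
For illustration, the two loop shapes can be sketched as below. This is only a minimal sketch, not the benchmark code from the PR: it uses a direct ByteBuffer as a stand-in for the segment, and the class name, method names, and constants are all hypothetical. It shows the difference in the induction variable type only; it does not measure anything, and observing the unrolling effect would need a proper JMH run.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: names and sizes are assumptions, not taken from the PR.
final class LoopCounterSketch {
    static final int ELEM_COUNT = 1_000;          // assumed element count
    static final int CARRIER_SIZE = Integer.BYTES;

    // Builds a direct buffer holding the ints 0..ELEM_COUNT-1.
    static ByteBuffer filledBuffer() {
        ByteBuffer bb = ByteBuffer.allocateDirect(ELEM_COUNT * CARRIER_SIZE)
                                  .order(ByteOrder.nativeOrder());
        for (int i = 0; i < ELEM_COUNT; i++) {
            bb.putInt(i * CARRIER_SIZE, i);
        }
        return bb;
    }

    // Loop with a long induction variable, the shape described as the issue.
    static int sumWithLongCounter(ByteBuffer bb) {
        int sum = 0;
        for (long i = 0; i < ELEM_COUNT; i++) {
            sum += bb.getInt((int) (i * CARRIER_SIZE));
        }
        return sum;
    }

    // Same loop with an int induction variable.
    static int sumWithIntCounter(ByteBuffer bb) {
        int sum = 0;
        for (int i = 0; i < ELEM_COUNT; i++) {
            sum += bb.getInt(i * CARRIER_SIZE);
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer bb = filledBuffer();
        // Both loops read the same data, so the sums must match.
        System.out.println(sumWithLongCounter(bb));
        System.out.println(sumWithIntCounter(bb));
    }
}
```

Both methods compute the same sum; the only difference is the type of the loop counter, which is the detail that changes how the JIT treats the loop.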

I'm not sure whether at this stage I can add the modified benchmarks, or whether I should open a new PR.

-------------

PR: https://git.openjdk.java.net/panama-foreign/pull/437

