[foreign-memaccess+abi] RFR: 8274648: Improve logic for acquiring by reference parameters in downcall handles [v2]
Jorn Vernee
jvernee at openjdk.java.net
Tue Oct 5 11:32:21 UTC 2021
On Fri, 1 Oct 2021 15:26:34 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> The current logic for acquiring/releasing scopes associated with by-reference parameters in a downcall handle call is a bit naive, in the sense that it acquires/releases each parameter in isolation (possibly redundantly).
>>
>> I've been working on a way to ameliorate the situation by adapting the downcall method handle so that we can pass all addressable parameters (in bulk) to a single acquire/release function. Since this function sees all arguments at once, it can decide to only acquire the unique scopes.
>>
>> How to do that proved to be challenging - as simply using a for loop didn't give the performance I expected. Turns out that as soon as we take a backward branch, there is a regression compared with the base case, even when all scopes are the same. The only case where we gain is when all scopes are shared.
>>
>> But I (re)discovered a trick: we can special-case the acquire/release functions so that they only use a loop for a high number of addressable parameters, and use a switch (with fallthrough) instead for low counts. This makes the logic more scrutable and the performance numbers are strictly better than what we had:
>>
>>
>> BEFORE
>>
>> Benchmark                                                   Mode  Cnt   Score   Error  Units
>> CallOverheadConstant.panama_identity_struct_ref_confined    avgt   30  11.544 ± 0.170  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_confined_3  avgt   30  12.213 ± 0.228  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared      avgt   30  17.214 ± 0.560  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared_3    avgt   30  33.132 ± 0.934  ns/op
>>
>> AFTER
>>
>> Benchmark                                                   Mode  Cnt   Score   Error  Units
>> CallOverheadConstant.panama_identity_struct_ref_confined    avgt   30  11.613 ± 0.326  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_confined_3  avgt   30  11.646 ± 0.333  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared      avgt   30  17.763 ± 0.639  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared_3    avgt   30  17.087 ± 0.475  ns/op
>>
>>
>> As you can see, in both cases there is no cost for passing multiple addressable arguments backed by the same scope.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove unused imports
src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/SharedUtils.java line 447:
> 445: for (int i = 5 ; i < args.length ; i++) {
> 446: acquire(args[i].scope());
> 447: }
I think this will skip acquiring the first `5` scopes, right? E.g. if `length` is 6, the switch jumps immediately into the `default` case. In other words, I think the for loop should start at `0`.
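
To make the concern concrete, here is a minimal sketch, not the actual `SharedUtils` code. The quoted excerpt does not show the surrounding switch, so both layouts below are assumptions, and `AcquireSketch` and its `acquire(log, i)` helper are hypothetical stand-ins that just record which argument index would be acquired. If the `default` branch does not fall through into the numbered cases, a loop starting at `5` skips indices 0..4. If `default` is placed first and falls through into `case 5`, starting at `5` covers everything.

```java
import java.util.ArrayList;
import java.util.List;

public class AcquireSketch {
    // Hypothetical stand-in for acquire(args[i].scope()): records the index visited.
    static void acquire(List<Integer> log, int i) {
        log.add(i);
    }

    // Assumed layout A: default is the last branch, so nothing follows the loop.
    static List<Integer> loopFromFiveNoFallthrough(int length) {
        List<Integer> log = new ArrayList<>();
        switch (length) {
            case 5: acquire(log, 4); // fall through
            case 4: acquire(log, 3); // fall through
            case 3: acquire(log, 2); // fall through
            case 2: acquire(log, 1); // fall through
            case 1: acquire(log, 0); // fall through
            case 0: break;
            default:
                // Only indices 5..length-1 are visited here.
                for (int i = 5; i < length; i++) {
                    acquire(log, i);
                }
        }
        return log;
    }

    // Assumed layout B: default comes first and falls through into case 5.
    static List<Integer> loopFromFiveWithFallthrough(int length) {
        List<Integer> log = new ArrayList<>();
        switch (length) {
            default:
                for (int i = 5; i < length; i++) {
                    acquire(log, i);
                }
                // fall through
            case 5: acquire(log, 4); // fall through
            case 4: acquire(log, 3); // fall through
            case 3: acquire(log, 2); // fall through
            case 2: acquire(log, 1); // fall through
            case 1: acquire(log, 0); // fall through
            case 0: break;
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(loopFromFiveNoFallthrough(6));   // prints [5]
        System.out.println(loopFromFiveWithFallthrough(6)); // prints [5, 4, 3, 2, 1, 0]
    }
}
```

So whether the loop needs to start at `0` depends on which of these two shapes the real switch has.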
src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/SharedUtils.java line 485:
> 483: for (int i = 5 ; i < args.length ; i++) {
> 484: release(args[i].scope());
> 485: }
Same here.
-------------
PR: https://git.openjdk.java.net/panama-foreign/pull/590