[foreign-memaccess+abi] RFR: 8274648: Improve logic for acquiring by reference parameters in downcall handles [v3]

Jorn Vernee jvernee at openjdk.java.net
Tue Oct 5 13:13:18 UTC 2021


On Tue, 5 Oct 2021 12:02:32 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> The current logic for acquiring/releasing scopes associated with by-reference parameters in a downcall handle call is a bit naive, in the sense that it acquires/releases each parameter in isolation (possibly redundantly).
>> 
>> I've been working on a way to ameliorate the situation by adapting the downcall method handle so that we can pass all addressable parameters (in bulk) to a single acquire/release function. Since this function sees all arguments at once, it can decide to only acquire the unique scopes.
>> 
>> How to do that proved to be challenging - as simply using a for loop didn't give the performance I expected. Turns out that as soon as we take a backward branch, there is a regression compared with the base case, even when all scopes are the same. The only case where we gain is when all scopes are shared.
>> 
>> But I (re)discovered a trick: we can special case the acquire/release functions so that they only use a loop for a high number of addressable parameters, and use a switch (with fallthrough) instead for low counts. This makes the logic more scrutable and the performance numbers are strictly better than what we had:
>> 
>> 
>> BEFORE
>> 
>> Benchmark                                                   Mode  Cnt   Score   Error  Units
>> CallOverheadConstant.panama_identity_struct_ref_confined    avgt   30  11.544 ± 0.170  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_confined_3  avgt   30  12.213 ± 0.228  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared      avgt   30  17.214 ± 0.560  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared_3    avgt   30  33.132 ± 0.934  ns/op
>> 
>> AFTER
>> 
>> Benchmark                                                   Mode  Cnt   Score   Error  Units
>> CallOverheadConstant.panama_identity_struct_ref_confined    avgt   30  11.613 ± 0.326  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_confined_3  avgt   30  11.646 ± 0.333  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared      avgt   30  17.763 ± 0.639  ns/op
>> CallOverheadConstant.panama_identity_struct_ref_shared_3    avgt   30  17.087 ± 0.475  ns/op
>> 
>> 
>> As you can see, in both cases there is no cost for passing multiple addressable arguments backed by the same scope.
>
> Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
> 
>  - Merge branch 'foreign-memaccess+abi' into acquire_linker_optimized
>  - Put default label first, and use fallthrough
>  - Remove unused imports
>  - cleanup code and sprinkle some comments
>  - Bump up the number of addressable supported in the fast path
>  - Special case calls with up to 4 by-ref parameters.
>    Use a switch up to 4 args, then revert to a loop for further args.
>  - Merge branch 'foreign-memaccess+abi' into acquire_linker_optimized
>  - Simplify code.
>    For some reason doesn't perform great unless we skip the very first argument...
>  - Reduce number of acquires if possible

Marked as reviewed by jvernee (Committer).
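
For readers following along, the switch-with-fallthrough idea from the quoted description can be sketched roughly as below. This is only an illustration under assumptions, not the code from the PR: the Scope class, its acquire/release methods, and the acquireUnique/releaseUnique helpers are hypothetical names, and the fast-path limit of 4 is taken from the commit message (a later commit bumps the limit, so the real value may differ).

```java
class AcquireSketch {
    /** Hypothetical stand-in for a resource scope; it only counts acquires. */
    static final class Scope {
        int held = 0;          // acquires not yet released
        int totalAcquires = 0; // acquires ever performed
        void acquire() { held++; totalAcquires++; }
        void release() { held--; }
    }

    /** Acquires each distinct scope among the first n entries exactly once. */
    static void acquireUnique(Scope[] s, int n) { update(s, n, true); }

    /** Releases each distinct scope among the first n entries exactly once. */
    static void releaseUnique(Scope[] s, int n) { update(s, n, false); }

    private static void update(Scope[] s, int n, boolean acquire) {
        if (n < 0 || n > s.length) {
            throw new IllegalArgumentException("bad count: " + n);
        }
        switch (n) {
            default:
                // Slow path: only arguments beyond the fourth are handled in a loop.
                for (int i = 4; i < n; i++) {
                    if (!seenBefore(s, i)) apply(s[i], acquire);
                }
                // fall through
            case 4:
                if (s[3] != s[2] && s[3] != s[1] && s[3] != s[0]) apply(s[3], acquire);
                // fall through
            case 3:
                if (s[2] != s[1] && s[2] != s[0]) apply(s[2], acquire);
                // fall through
            case 2:
                if (s[1] != s[0]) apply(s[1], acquire);
                // fall through
            case 1:
                apply(s[0], acquire);
                // fall through
            case 0:
                break;
        }
    }

    /** Returns true if s[i] is identical to some earlier entry. */
    private static boolean seenBefore(Scope[] s, int i) {
        for (int j = 0; j < i; j++) {
            if (s[j] == s[i]) return true;
        }
        return false;
    }

    private static void apply(Scope scope, boolean acquire) {
        if (acquire) scope.acquire(); else scope.release();
    }
}
```

With up to four arguments the sketch never takes a backward branch, and several arguments backed by the same scope cost a single acquire/release pair.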

-------------

PR: https://git.openjdk.java.net/panama-foreign/pull/590

