RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles
Remi Forax
forax at univ-mlv.fr
Tue Apr 13 21:58:59 UTC 2021
----- Original Message -----
> From: "Jorn Vernee" <jvernee at openjdk.java.net>
> To: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Sent: Tuesday, April 13, 2021 16:59:58
> Subject: Re: RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles
> On Thu, 8 Apr 2021 18:51:21 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:
>
>> This patch adds a `tableSwitch` combinator that can be used to switch over a set
>> of method handles given an index, with a fallback in case the index is out of
>> bounds, much like the `tableswitch` bytecode. Here is a description of how it
>> works (copied from the javadoc):
>>
>> Creates a table switch method handle, which can be used to switch over a set of
>> target method handles, based on a given target index, called selector.
>>
>> For a selector value of {@code n}, where {@code n} falls in the range
>> {@code [0, N)}, and where {@code N} is the number of target method handles,
>> the table switch method handle will invoke the n-th target method handle from
>> the list of target method handles.
>>
>> For a selector value that does not fall in the range {@code [0, N)}, the table
>> switch method handle will invoke the given fallback method handle.
>>
>> All method handles passed to this method must have the same type, with the
>> additional requirement that the leading parameter be of type {@code int}. The
>> leading parameter represents the selector.
>>
>> Any trailing parameters present in the type will appear on the returned table
>> switch method handle as well. Any arguments assigned to these parameters will
>> be forwarded, together with the selector value, to the selected method handle
>> when invoking it.
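
A minimal sketch of how the combinator described in the javadoc above might be used. It assumes the signature `MethodHandles.tableSwitch(MethodHandle fallback, MethodHandle... targets)` as it appears in JDK 17; the patch under review at the time of this mail may differ in details. The class name and the case methods are hypothetical, chosen only for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class TableSwitchSketch {
    // Hypothetical case bodies. All share the type (int, String)String:
    // the leading int is the selector, the String is a trailing argument.
    static String caseZero(int selector, String arg) { return "zero:" + arg; }
    static String caseOne(int selector, String arg) { return "one:" + arg; }
    static String fallback(int selector, String arg) { return "default(" + selector + "):" + arg; }

    private static final MethodHandle SWITCH = build();

    private static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, int.class, String.class);
            MethodHandle zero = lookup.findStatic(TableSwitchSketch.class, "caseZero", type);
            MethodHandle one = lookup.findStatic(TableSwitchSketch.class, "caseOne", type);
            MethodHandle dflt = lookup.findStatic(TableSwitchSketch.class, "fallback", type);
            // Selector 0 -> caseZero, selector 1 -> caseOne, anything else -> fallback.
            return MethodHandles.tableSwitch(dflt, zero, one);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Wraps the checked Throwable of invokeExact for convenience.
    static String select(int selector, String arg) {
        try {
            return (String) SWITCH.invokeExact(selector, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(0, "a"));
        System.out.println(select(1, "a"));
        System.out.println(select(7, "a"));
    }
}
```

Note that the selected handle receives the selector along with the trailing argument, which is why the fallback in this sketch can report the out-of-range value.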
>>
>> The combinator does not support specifying the starting index, so the switch
>> cases always run from 0 to however many target handles are specified. A
>> starting index can be added manually with another combination step that filters
>> the input index by adding or subtracting a constant from it, which does not
>> affect performance. One of the reasons for not supporting a starting index is
>> that it allows for more lambda form sharing, but also simplifies the
>> implementation somewhat. I guess an open question is if a convenience overload
>> should be added for that case?
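
The "another combination step" mentioned above could look roughly like the following sketch, which rebases the selector with `MethodHandles.filterArguments` before it reaches the table switch. The starting index of 150, the `rebase` helper and the case methods are assumptions made for illustration, and `tableSwitch` is assumed to have its JDK 17 signature.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class OffsetSwitchSketch {
    // Hypothetical starting index: the cases cover selectors 150 and 151.
    static final int START = 150;

    static int caseA(int index) { return 100; }
    static int caseB(int index) { return 200; }
    static int fallback(int index) { return -1; }

    // Rebases the caller's selector so that START maps to case 0.
    static int rebase(int start, int selector) { return selector - start; }

    private static final MethodHandle SWITCH = build();

    private static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType caseType = MethodType.methodType(int.class, int.class);
            MethodHandle table = MethodHandles.tableSwitch(
                    lookup.findStatic(OffsetSwitchSketch.class, "fallback", caseType),
                    lookup.findStatic(OffsetSwitchSketch.class, "caseA", caseType),
                    lookup.findStatic(OffsetSwitchSketch.class, "caseB", caseType));
            MethodHandle rebase = lookup.findStatic(OffsetSwitchSketch.class, "rebase",
                    MethodType.methodType(int.class, int.class, int.class));
            // Bind the constant, then filter argument 0 of the switch through it.
            MethodHandle filter = MethodHandles.insertArguments(rebase, 0, START);
            return MethodHandles.filterArguments(table, 0, filter);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int select(int selector) {
        try {
            return (int) SWITCH.invokeExact(selector);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

With this arrangement the case handles see the rebased, zero-based index rather than the caller's original selector.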
>>
>> Lookup switch can also be simulated by filtering the input through an injection
>> function that translates it into a case index, which has proven to perform
>> comparably to, or even better than, a bytecode-native `lookupswitch`
>> instruction. I plan to add such an injection function to the runtime libraries
>> in the future as well. Maybe at that point it could be evaluated whether it's
>> worth adding a lookup switch combinator as well, but I don't see an immediate
>> need to include it in this patch.
>>
>> The current bytecode intrinsification generates a call for each switch case,
>> which guarantees full inlining of the target method handles. Alternatively we
>> could only have 1 callsite at the end of the switch, where each case just loads
>> the target method handle, but currently this does not allow for inlining of the
>> handles, since they are not constant.
>>
>> Maybe a future C2 optimization could look at the receiver input for invokeBasic
>> call sites, and if the input is a phi node, clone the call for each constant
>> input of the phi. I believe that would allow simplifying the bytecode without
>> giving up on inlining.
>>
>> Some numbers from the added benchmarks:
>>
>> Benchmark                                        (numCases)  (offset)  (sorted)  Mode  Cnt   Score   Error  Units
>> MethodHandlesTableSwitchConstant.testSwitch               5         0       N/A  avgt   30   4.186 ± 0.054  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch               5       150       N/A  avgt   30   4.164 ± 0.057  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10         0       N/A  avgt   30   4.124 ± 0.023  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10       150       N/A  avgt   30   4.126 ± 0.025  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25         0       N/A  avgt   30   4.137 ± 0.042  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25       150       N/A  avgt   30   4.113 ± 0.016  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50         0       N/A  avgt   30   4.118 ± 0.028  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50       150       N/A  avgt   30   4.127 ± 0.019  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100         0       N/A  avgt   30   4.116 ± 0.013  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100       150       N/A  avgt   30   4.121 ± 0.020  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5         0       N/A  avgt   30   4.113 ± 0.009  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5       150       N/A  avgt   30   4.149 ± 0.041  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10         0       N/A  avgt   30   4.121 ± 0.026  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10       150       N/A  avgt   30   4.113 ± 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25         0       N/A  avgt   30   4.129 ± 0.028  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25       150       N/A  avgt   30   4.105 ± 0.019  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50         0       N/A  avgt   30   4.097 ± 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50       150       N/A  avgt   30   4.131 ± 0.037  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100         0       N/A  avgt   30   4.135 ± 0.025  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100       150       N/A  avgt   30   4.139 ± 0.145  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0      true  avgt   30   4.894 ± 0.028  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0     false  avgt   30  11.526 ± 0.194  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150      true  avgt   30   4.882 ± 0.025  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150     false  avgt   30  11.532 ± 0.034  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0      true  avgt   30   5.065 ± 0.076  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0     false  avgt   30  13.016 ± 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150      true  avgt   30   5.103 ± 0.051  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150     false  avgt   30  12.984 ± 0.102  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0      true  avgt   30   8.441 ± 0.165  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0     false  avgt   30  13.371 ± 0.060  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150      true  avgt   30   8.628 ± 0.032  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150     false  avgt   30  13.542 ± 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0      true  avgt   30   4.701 ± 0.015  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0     false  avgt   30  13.562 ± 0.063  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150      true  avgt   30   7.991 ± 3.111  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150     false  avgt   30  13.543 ± 0.088  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0      true  avgt   30   4.712 ± 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0     false  avgt   30  13.600 ± 0.085  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150      true  avgt   30   4.676 ± 0.011  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150     false  avgt   30  13.476 ± 0.043  ms/op
>>
>>
>> Testing:
>> - [x] Running of included benchmarks
>> - [x] Inspecting inlining trace and verifying method handle targets are inlined
>> - [x] Running TestTableSwitch test (currently the only user of the new code)
>> - [x] Running java/lang/invoke tests (just in case)
>> - [x] Some manual testing
>>
>> Thanks,
>> Jorn
>
>> there are two strategies to desugar a switch:
>>
>> * if what you do after the case does not mutate any variables, you can desugar
>> each case to a method, more or less like a lambda (it's not exactly like a
>> lambda because there is no capture), and have an indy in front that calls
>> the right method handle
>>
>> * you have a front end, with an indy that associates the object with an int,
>> and a back end which is a tableswitch in the bytecode
>>
>> ...
>>
>> The tests above are using the first strategy
>
> No, they are using the second strategy. The SwitchBootstraps patch I linked to
> replaces the front end `lookupswitch` of a String switch with an
> `invokedynamic` that computes an index for the back end jump table, which is
> still a `tableswitch` in the bytecode.
>
> As John also described, a hypothetical lookupSwitch combinator can be emulated
> by using a `k -> [0, N)` projection that feeds into the tableSwitch combinator
> that is proposed by this PR. The point of the examples I linked to was to show
> several flavors of projection functions as an example of how this could be
> implemented, and to show that they have competitive performance with a native
> `lookupswitch` instruction (the 'legacy' case). i.e. the benchmarks show the
> difference between `lookupswitch` implemented in bytecode, and a `k -> [0, N)`
> projection function built by an `invokedynamic`. (sorry, I should have offered
> more explanation in the first place)
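
As a rough illustration of the `k -> [0, N)` idea described above, the sketch below feeds a projection into a table switch with `MethodHandles.filterArguments`. The `Map`-based projection is an assumption picked for brevity; it is not one of the projection flavors from the linked SwitchBootstraps patch, and it says nothing about their performance. All names are hypothetical, and `tableSwitch` is assumed to have its JDK 17 signature.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;

class LookupSwitchSketch {
    // Hypothetical key set; each key is assigned a case index in [0, N).
    private static final Map<String, Integer> INDEX = Map.of("foo", 0, "bar", 1, "baz", 2);

    // Projection k -> [0, N); unknown keys map to -1 so the fallback is selected.
    static int project(String key) {
        return INDEX.getOrDefault(key, -1);
    }

    static String caseFoo(int index) { return "handled foo"; }
    static String caseBar(int index) { return "handled bar"; }
    static String caseBaz(int index) { return "handled baz"; }
    static String fallback(int index) { return "default"; }

    private static final MethodHandle SWITCH = build();

    private static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType caseType = MethodType.methodType(String.class, int.class);
            MethodHandle table = MethodHandles.tableSwitch(
                    lookup.findStatic(LookupSwitchSketch.class, "fallback", caseType),
                    lookup.findStatic(LookupSwitchSketch.class, "caseFoo", caseType),
                    lookup.findStatic(LookupSwitchSketch.class, "caseBar", caseType),
                    lookup.findStatic(LookupSwitchSketch.class, "caseBaz", caseType));
            MethodHandle projection = lookup.findStatic(LookupSwitchSketch.class, "project",
                    MethodType.methodType(int.class, String.class));
            // The resulting handle has type (String)String: key in, case result out.
            return MethodHandles.filterArguments(table, 0, projection);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String dispatch(String key) {
        try {
            return (String) SWITCH.invokeExact(key);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```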
>
> The combinator added by _this_ PR is not meant to replace any part of the String
> switch translation. For pattern switch the `tableSwitch` combinator _could_ be
> used to implement the front end `k -> [0, N)` projection, but it is not
> strictly required. Either way, that seems orthogonal to this PR.
I agree this is orthogonal and we can continue that discussion without blocking this PR.

About your benchmark, did you test with some strings that go into "default"? That is usually the case where you need a proper lookup switch. Put another way, your results look too good for an implementation that uses a cascade of guardWithTest.
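
To make the "default" concern concrete, here is a minimal sketch of a guardWithTest cascade that maps a String to a case index. It is not the code from the benchmark under discussion; the class, the key set and the comparison counter are assumptions added only to show that an input falling through to the default has to fail every test in the chain first.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class GuardCascadeSketch {
    // Counts how many key comparisons a single lookup performs.
    static int comparisons = 0;

    static boolean matches(String key, String input) {
        comparisons++;
        return key.equals(input);
    }

    // Builds: if (k0.equals(s)) 0 else if (k1.equals(s)) 1 ... else -1
    static MethodHandle buildCascade(String... keys) {
        try {
            MethodHandle matches = MethodHandles.lookup().findStatic(
                    GuardCascadeSketch.class, "matches",
                    MethodType.methodType(boolean.class, String.class, String.class));
            // Default branch: ignore the input and return -1.
            MethodHandle chain = MethodHandles.dropArguments(
                    MethodHandles.constant(int.class, -1), 0, String.class);
            for (int i = keys.length - 1; i >= 0; i--) {
                MethodHandle test = MethodHandles.insertArguments(matches, 0, keys[i]);
                MethodHandle hit = MethodHandles.dropArguments(
                        MethodHandles.constant(int.class, i), 0, String.class);
                chain = MethodHandles.guardWithTest(test, hit, chain);
            }
            return chain;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static final MethodHandle CASCADE = buildCascade("a", "b", "c");

    static int lookup(String input) {
        try {
            return (int) CASCADE.invokeExact(input);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Returns the number of comparisons needed to resolve the given input.
    static int comparisonsFor(String input) {
        comparisons = 0;
        lookup(input);
        return comparisons;
    }
}
```

In this sketch a hit on the first key costs one comparison, while a miss costs one comparison per key, so a benchmark whose inputs never reach the default would not exercise the worst case of such a cascade.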
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3401
Rémi