RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles

Remi Forax forax at univ-mlv.fr
Tue Apr 13 21:58:59 UTC 2021


----- Mail original -----
> De: "Jorn Vernee" <jvernee at openjdk.java.net>
> À: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Envoyé: Mardi 13 Avril 2021 16:59:58
> Objet: Re: RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles

> On Thu, 8 Apr 2021 18:51:21 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:
> 
>> This patch adds a `tableSwitch` combinator that can be used to switch over a set
>> of method handles given an index, with a fallback in case the index is out of
>> bounds, much like the `tableswitch` bytecode. Here is a description of how it
>> works (copied from the javadoc):
>> 
>>      Creates a table switch method handle, which can be used to switch over a set of
>>      target
>>      method handles, based on a given target index, called selector.
>> 
>>      For a selector value of {@code n}, where {@code n} falls in the range {@code [0,
>>      N)},
>>      and where {@code N} is the number of target method handles, the table switch
>>      method
>>      handle will invoke the n-th target method handle from the list of target method
>>      handles.
>> 
>>      For a selector value that does not fall in the range {@code [0, N)}, the table
>>      switch
>>      method handle will invoke the given fallback method handle.
>> 
>>      All method handles passed to this method must have the same type, with the
>>      additional
>>      requirement that the leading parameter be of type {@code int}. The leading
>>      parameter
>>      represents the selector.
>> 
>>      Any trailing parameters present in the type will appear on the returned table
>>      switch
>>      method handle as well. Any arguments assigned to these parameters will be
>>      forwarded,
>>      together with the selector value, to the selected method handle when invoking
>>      it.
>> 
>> The combinator does not support specifying the starting index, so the switch
>> cases always run from 0 to however many target handles are specified. A
>> starting index can be added manually with another combination step that filters
>> the input index by adding or subtracting a constant from it, which does not
>> affect performance. One of the reasons for not supporting a starting index is
>> that it allows for more lambda form sharing, but also simplifies the
>> implementation somewhat. I guess an open question is if a convenience overload
>> should be added for that case?
>> 
>> Lookup switch can also be simulated by filtering the input through an injection
>> function that translates it into a case index, which has also proven to have
>> the ability to have comparable performance to, or even better performance than,
>> a bytecode-native `lookupswitch` instruction. I plan to add such an injection
>> function to the runtime libraries in the future as well. Maybe at that point it
>> could be evaluated if it's worth it to add a lookup switch combinator as well,
>> but I don't see an immediate need to include it in this patch.
>> 
>> The current bytecode intrinsification generates a call for each switch case,
>> which guarantees full inlining of the target method handles. Alternatively we
>> could only have 1 callsite at the end of the switch, where each case just loads
>> the target method handle, but currently this does not allow for inlining of the
>> handles, since they are not constant.
>> 
>> Maybe a future C2 optimization could look at the receiver input for invokeBasic
>> call sites, and if the input is a phi node, clone the call for each constant
>> input of the phi. I believe that would allow simplifying the bytecode without
>> giving up on inlining.
>> 
>> Some numbers from the added benchmarks:
>> 
>> Benchmark                                        (numCases)  (offset)  (sorted)
>> Mode  Cnt   Score   Error  Units
>> MethodHandlesTableSwitchConstant.testSwitch               5         0       N/A
>> avgt   30   4.186 � 0.054  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch               5       150       N/A
>> avgt   30   4.164 � 0.057  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10         0       N/A
>> avgt   30   4.124 � 0.023  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              10       150       N/A
>> avgt   30   4.126 � 0.025  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25         0       N/A
>> avgt   30   4.137 � 0.042  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              25       150       N/A
>> avgt   30   4.113 � 0.016  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50         0       N/A
>> avgt   30   4.118 � 0.028  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch              50       150       N/A
>> avgt   30   4.127 � 0.019  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100         0       N/A
>> avgt   30   4.116 � 0.013  ms/op
>> MethodHandlesTableSwitchConstant.testSwitch             100       150       N/A
>> avgt   30   4.121 � 0.020  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5         0       N/A
>> avgt   30   4.113 � 0.009  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch           5       150       N/A
>> avgt   30   4.149 � 0.041  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10         0       N/A
>> avgt   30   4.121 � 0.026  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          10       150       N/A
>> avgt   30   4.113 � 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25         0       N/A
>> avgt   30   4.129 � 0.028  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          25       150       N/A
>> avgt   30   4.105 � 0.019  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50         0       N/A
>> avgt   30   4.097 � 0.021  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch          50       150       N/A
>> avgt   30   4.131 � 0.037  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100         0       N/A
>> avgt   30   4.135 � 0.025  ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch         100       150       N/A
>> avgt   30   4.139 � 0.145  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0      true
>> avgt   30   4.894 � 0.028  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5         0     false
>> avgt   30  11.526 � 0.194  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150      true
>> avgt   30   4.882 � 0.025  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                 5       150     false
>> avgt   30  11.532 � 0.034  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0      true
>> avgt   30   5.065 � 0.076  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10         0     false
>> avgt   30  13.016 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150      true
>> avgt   30   5.103 � 0.051  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                10       150     false
>> avgt   30  12.984 � 0.102  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0      true
>> avgt   30   8.441 � 0.165  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25         0     false
>> avgt   30  13.371 � 0.060  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150      true
>> avgt   30   8.628 � 0.032  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                25       150     false
>> avgt   30  13.542 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0      true
>> avgt   30   4.701 � 0.015  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50         0     false
>> avgt   30  13.562 � 0.063  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150      true
>> avgt   30   7.991 � 3.111  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch                50       150     false
>> avgt   30  13.543 � 0.088  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0      true
>> avgt   30   4.712 � 0.020  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100         0     false
>> avgt   30  13.600 � 0.085  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150      true
>> avgt   30   4.676 � 0.011  ms/op
>> MethodHandlesTableSwitchRandom.testSwitch               100       150     false
>> avgt   30  13.476 � 0.043  ms/op
>> 
>> 
>> Testing:
>> - [x] Running of included benchmarks
>> - [x] Inspecting inlining trace and verifying method handle targets are inlined
>> - [x] Running TestTableSwitch test (currently the only user of the new code)
>> - [x] Running java/lang/invoke tests (just in case)
>> - [x] Some manual testing
>> 
>> Thanks,
>> Jorn
> 
>> you have two strategy to de-sugar a switch,
>> 
>> * if what you do after the case do not mutate any variables, you can desugar
>> each case to a method more or less like a lambda (it's not exactly like a
>> lambda because there is no capture) and you have an indy in front that will
>> call the right method handles
>> 
>> * you have a front end, with an indy that associate the object to an int and a
>> backend which is tableswitch in the bytecode
>> 
>> ...
>>
>> The tests above are using the first strategy
> 
> No, they are using the second strategy. The SwitchBootstraps patch I linked to
> replaces the front end `lookupswitch` of a String switch with an
> `invokedynamic` that computes an index for the back end jump table, which is
> still a `tableswitch` in the bytecode.
> 
> As John also described, a hypothetical lookupSwitch combinator can be emulated
> by using a `k -> [0, N)` projection that feeds into the tableSwitch combinator
> that is proposed by this PR. The point of the examples I linked to was to show
> several flavors of projection functions as an example of how this could be
> implemented, and to show that they have competitive performance with a native
> `lookupswitch` instruction (the 'legacy' case). i.e. the benchmarks show the
> difference between `lookupswitch` implemented in bytecode, and a `k -> [0, N)`
> projection function built by an `invokedynamic`. (sorry, I should have offered
> more explanation in the first place)
> 
> The combinator added by _this_ PR is not meant to replace any part of the String
> switch translation. For pattern switch the `tableSwitch` combinator _could_ be
> used to implement the front end `k -> [0, N)` projection, but it is not
> strictly required. Either way, that seems orthogonal to this PR.

I agree this is orthogonal and we can continue that discussion without blocking this PR.

About your benchmark, did you test with some strings going into "default", because it is usually in that case that you need a proper lookup switch,
another way to say it is that, your results are too good when you use a cascade of guardWithTest.


> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/3401

Rémi


More information about the core-libs-dev mailing list