loop customization: a key challenge
Remi Forax
forax at univ-mlv.fr
Fri Sep 14 00:05:24 PDT 2012
On 09/14/2012 06:22 AM, John Rose wrote:
> On Sep 11, 2012, at 2:09 AM, Aleksey Shipilev wrote:
>
>> On 09/10/2012 11:13 PM, John Rose wrote:
>>> The methods strongly hint to implementors and users that bind and
>>> findVirtual + bindTo perform the obvious devirtualization.
>> I haven't been following jsr292 development recently. Is that kind of
>> hint already favored by Hotspot? I would like to try to do this
>> conversion by hand and see if it helps some of our benchmarks here.
> Yes, findVirtual followed by bindTo routinely devirtualizes the handle.
> The JDK 8 version of this logic is DirectMethodHandle.maybeRebind,
> a private method that replaces a "virtual" or "interface" reference by a "special" one.
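(For concreteness, here is a minimal, self-contained sketch of the findVirtual + bindTo pattern described above; the class and variable names are illustrative only:)

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;
    import java.util.function.UnaryOperator;

    public class BindToSketch {
        public static void main(String[] args) throws Throwable {
            // A "virtual" handle on UnaryOperator::apply (erased signature).
            MethodHandle apply = MethodHandles.lookup().findVirtual(
                    UnaryOperator.class, "apply",
                    MethodType.methodType(Object.class, Object.class));
            // Binding a concrete receiver lets the JVM drop the virtual
            // dispatch and call that receiver's implementation directly.
            UnaryOperator<String> upper = s -> s.toUpperCase();
            MethodHandle bound = apply.bindTo(upper);
            System.out.println(bound.invoke("hello"));   // prints HELLO
        }
    }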
>
>>> Of course, loop customization does not really appear to be a case of
>>> devirtualization… unless perhaps you treat each loop superstructure as a
>>> defender method on the loop kernel interface. Then L1_X is really X.L1,
>>> and the JVM optimization framework can swing into action, cloning L1
>>> into each receiver type X. So that's a MH-free way to think about it.
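(A rough sketch of that defender-method framing, with made-up names, where the superstructure L1 becomes a default method on the kernel interface:)

    import java.util.function.UnaryOperator;

    interface Kernel<T> extends UnaryOperator<T> {
        // L1, the loop superstructure, written as a defender (default) method.
        // Each concrete kernel type X effectively gets its own X.L1 that the
        // JIT can clone and inline against that receiver type.
        default void applyAll(T[] array) {
            for (int i = 0; i < array.length; i++) {
                array[i] = apply(array[i]);   // the kernel call
            }
        }
    }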
>> Yes, it appears that your suggestion applies to whatever
>> "superstructure" there is in the code. I would like to highlight that
>> the obvious demarcation point for such a superstructure is the method
>> call, and we can probably spare library developers some of the
>> syntactic pain and hard-core conspiracy. I.e. if there is a way
>> to say
>>
>> class Arrays {
>>     <T> void apply(T[] array, @PleaseSpecialize UnaryOperator<T> op) {
>>         // blah-blah, apply
>>     }
>>     ...
>> }
>>
>> ...albeit more limiting in the "superstructure" sense, it is clearer
>> than explicitly writing out jsr292 magic. Of course, in this sense, we
>> can even try to desugar this to jsr292 with the conversion outlined by
>> John, e.g. into:
>>
>> class Arrays {
>>     <T> void apply(T[] array, @PleaseSpecialize UnaryOperator<T> op) {
>>         MethodHandle superOp = #apply$$Internal.bindTo(op);
>>         superOp.invoke(array);
>>     }
>>     <T> void apply$$Internal(UnaryOperator<T> op, T[] array) {
>>         // blah-blah, apply
>>     }
>> }
> Method calling is *not* the natural place to do this specialization,
> and this is true even though (in a different sense) almost everything
> is done with method calls. In the example, putting the bindTo
> operation next to the invoke operation forces it to happen
> just before each bulk request. This is OK for a one-shot API,
> but is probably not general enough to build up big frameworks
> like fork-join.
>
> The reason for this is that combining a superstructure with a kernel
> is logically distinct from executing the combined result, and in general
> needs to be specified separately from the execution.
>
> So we need a notation for a "combination request" which is distinct
> from and prior to the "invocation request".
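(One way to picture that separation with the existing API, all names below being made up for illustration: the combination request builds the handle once and stores it, and only the invocation request sits on the hot path:)

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;
    import java.util.function.UnaryOperator;

    class CombinedLoop {
        private final MethodHandle combined;   // superstructure + kernel, bound once

        CombinedLoop(UnaryOperator<Object> kernel) throws ReflectiveOperationException {
            MethodHandle superstructure = MethodHandles.lookup().findStatic(
                    CombinedLoop.class, "applyLoop",
                    MethodType.methodType(void.class, UnaryOperator.class, Object[].class));
            this.combined = superstructure.bindTo(kernel);   // combination request
        }

        void run(Object[] array) throws Throwable {
            combined.invokeExact(array);                     // invocation request
        }

        // The loop superstructure: walk the array, applying the kernel.
        private static void applyLoop(UnaryOperator<Object> op, Object[] array) {
            for (int i = 0; i < array.length; i++) {
                array[i] = op.apply(array[i]);
            }
        }
    }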
>
> Representing this combination request via an annotation on a named
> method obliges designers to give a name to every point where
> the combination is requested. This is unnatural in about the same
> way as requiring every "break" or "continue" to have a label.
> (Or, every closure.)
>
> The *action* of combining a superstructure (I want a better name here!)
> with a kernel (this is a standard name) should of course be expressed as
> an operator or method. This operator or method must be kept distinct
> from the action of executing the combination. It could be MH::bindTo,
> or (probably better) it should be something more explicit.
>
> Alternatively, instead of an operator, it could be some kind of closure
> expression, one which makes it clear what are the roles of superstructure
> and kernel. The compiler and runtime would manage the caching of
> combined forms, at the point where the closure expression was implemented.
>
> (As an alternative, the combination can be made automagic, as an implicit
> preparation to invocation. That's basically what an inlining JIT compiler does.
> But my big point in all of this is that the user and/or library writer probably needs
> to help the system with hints about the various aspects of combination.)
>
> I hope this helps. I realize it is fairly vague. The problem is completely real, though.
>
> — John
Here is a modest benchmark of different ways to implement the combination
of map/reduce/forEach, etc. for lambdas, using iterators, Brian's pipeline,
Rich Hickey's combiners and method handles:
https://github.com/forax/lambda-perf
I'm not a pro at writing benchmarks, so this one may be flawed. Also, there
is no point in running it with jdk7, or with jdk8 before b56, because the
method handle tests will fall into the well-known performance hole that was
fixed recently.
As John said, the combination is fully explicit and there is no caching at
all: a call to reduce or forEach in a loop will create the method handle
blob again and again.
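(For comparison, the usual way to avoid rebuilding the blob, which the benchmark does not do, is to build the combined handle once and keep it in a static final field so the JIT treats it as a constant; the names here are only illustrative:)

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    class CachedBlob {
        // Built once at class initialization; as a static final the handle is
        // a constant for the JIT, so it is not re-created on every call.
        static final MethodHandle CONCAT;
        static {
            try {
                CONCAT = MethodHandles.lookup().findVirtual(String.class, "concat",
                        MethodType.methodType(String.class, String.class));
            } catch (ReflectiveOperationException e) {
                throw new AssertionError(e);
            }
        }

        static String reduce(String[] values) throws Throwable {
            String acc = "";
            for (String value : values) {
                acc = (String) CONCAT.invokeExact(acc, value);
            }
            return acc;
        }
    }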
John, Christian: when looking at the assembly code, I was not able to find
a version that combines the whole method handle blob into a single method,
as is done when invokedynamic is used. Is that the way it's supposed to
work, or did I miss something?
cheers,
Rémi