[foreign] RFR 8224481: Optimize struct getter and field getter paths.

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed May 22 10:37:19 UTC 2019


Looks good - yesterday I was looking at this discussion:

http://mail.openjdk.java.net/pipermail/mlvm-dev/2016-January/006563.html

I hope we don't run in the condition described there - e.g. that there's 
no strong reachability from the MH we're caching back to the static 
ClassValue instance - because, if that would be the case I think that 
would prevent class unloading.

The problem is that the MethodHandle we cache refers to the stuct impl 
class, and I believe that class refers to some LayoutTypes on its own, 
which have a Reference inside, so it would be:

ClassValue<MH> -> MH -> StructImpl -> LayoutType -> Reference -> 
ClassValue<MH>

Sundar can you double check?

Maurizio

On 22/05/2019 10:56, Jorn Vernee wrote:
> Good suggestion! This solves the problem, is nice and simple, and 
> keeps the same times in the benchmark.
>
> Updated webrev: 
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.02/
>
> (only changes to References.java)
>
> I've added a test for the failure. I think that can be included as 
> well? I re-ran the samples I have as well, and this time it's all green.
>
> Thanks,
> Jorn
>
> Maurizio Cimadamore schreef op 2019-05-22 01:15:
>> On 21/05/2019 20:16, Jorn Vernee wrote:
>>> Although, now that you bring it up, I tried re-running some of the 
>>> samples (hadn't done that yet), and I'm seeing some infinite 
>>> recursion. This is seemingly caused by a circular type reference 
>>> (e.g. linked list). i.e. to spin the impl of an accessor we need the 
>>> LayoutType of the struct itself, which then tries to spin the impl 
>>> again, and so on. I guess this isn't a test case in our suite yet...
>>>
>>> I'll look into this.
>>
>> Good detective work! I guess it would make sense to try and reduce it
>> down to a simpler test, and push the test first.
>>
>> Where I was going with this is - your patch effectively made the lazy
>> resolution inside StructImplGenerator useless. If we really want to
>> explore that option, then we should, I think, remove all lazy
>> resolution sites and see what happens. It is possible that we don't
>> rely so much on laziness as we did in the past (we did some fixes few
>> months ago which stabilized resolution quite a bit) - in which case we
>> can remove the resolution requests, although - I have to admit - I'm a
>> bit skeptical. After all all you need it something like this (as you
>> say):
>>
>> struct foo {
>>     struct foo *next;
>> }
>>
>> Which is kind of the killer app for unresolved layouts in the first 
>> place.
>>
>> This is translated into a struct interface which has a getter of
>> Pointer<foo>. To generate the getter you need to compute its
>> LayoutType which is a pointer LayoutType, so you have to compute the
>> pointee LayoutType which brings you back where you started (the whole
>> 'foo' LayoutType). In other words, since now the creation of
>> LayoutType<foo> requires the generation of the struct impl for 'foo'
>> and since that depends (indirectly, through the pointer getter) on
>> being able to produce a LayoutType<foo>, you get a circularity.
>>
>> One thing we could try is - instead of eagerly creating the struct
>> impl, why don't we let the Reference.OfStruct having some mutable
>> state in it? That is, we could start off with Reference getter which
>> does the expensive refelective lookup - but then, once it has
>> discovered the constructor MH, it can stash it in some field (which is
>> private to that reference object) and use it later if the getter is
>> used again. Then, you probably still need a ClassValue to stash a
>> mapping between a Class and its Reference.OfStruct; but it seems like
>> this could fit in more naturally?
>>
>> Maurizio
>>
>>>
>>> Thanks,
>>> Jorn
>>>
>>> Jorn Vernee schreef op 2019-05-21 21:06:
>>>> Since we have the resolution context for NativeHeader, AFAIK there is
>>>> no more difference between the resolution call done by
>>>> StructImpleGenerator, and the one done by LayoutTypeImpl.ofStruct. So
>>>> I don't think there are any more cases where we would have succeeded
>>>> to resolve the Struct layout be delaying spinning the impl. At least
>>>> the tests haven't caught such a case.
>>>>
>>>> The other thing is that the partial layout for the getter is caught in
>>>> StructImplGenerator, but for the setter it's caught when calling
>>>> bitSize on Unresolved. Saying layouts should be able to be resolved
>>>> when calling LayoutType.ofStruct means we can use References.OfGrumpy,
>>>> which makes the two more uniform.
>>>>
>>>> I have some ideas for keeping the lazy init semantics, but it's a bit
>>>> more complex (using a MutableCallSite to mimic indy), and I'm not sure
>>>> it will work as well.
>>>>
>>>> And, well, there was some talk about eagerly spinning the
>>>> implementations any ways :)
>>>>
>>>> Jorn
>>>>
>>>> Maurizio Cimadamore schreef op 2019-05-21 20:09:
>>>>> Looks good, although I'm a bit worried about the change in semantics
>>>>> w.r.t. eager instantiation. The binder will create a lot of
>>>>> LayoutTypes when generating the implementation - I wonder there were
>>>>> cases before where we created a partial layout type, which then got
>>>>> resolved correctly by the time it was dereferenced (since we do
>>>>> another resolve lazily in StructImplGenerator [1]).
>>>>>
>>>>> [1] -
>>>>> http://hg.openjdk.java.net/panama/dev/file/5ea3089be5ac/src/java.base/share/classes/jdk/internal/foreign/StructImplGenerator.java#l52 
>>>>> On 21/05/2019 14:41, Jorn Vernee wrote:
>>>>>> Hi,
>>>>>>
>>>>>> After the recent string of benchmarking [1], I've arrived at 2 
>>>>>> optimizations to improve the speed of the measured code path.
>>>>>>
>>>>>> 1.) Specialization of Struct getter MethodHandles per struct class.
>>>>>> 2.) Implementation of RuntimeSupport::casterImpl that does a 
>>>>>> fused cast and offset operation, to avoid creating multiple 
>>>>>> Pointer objects.
>>>>>>
>>>>>> The benchmark: 
>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
>>>>>> The optimizations: 
>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
>>>>>>
>>>>>> I've split these into 2 so that it's easier to run the benchmarks 
>>>>>> with and without the optimizations. (benchmark uses the OpenJDK's 
>>>>>> builtin framework [2]).
>>>>>>
>>>>>> Since we're now more eagerly instantiating the struct impl class 
>>>>>> I had to work around partial struct types, since spinning the 
>>>>>> impl requires a non-partial type and now we're spinning the impl 
>>>>>> when creating the LayouType for the struct, as opposed to on the 
>>>>>> first dereference. To do this I'm detecting whether the struct is 
>>>>>> partial in LayoutType.ofStruct, and using a Reference.OfGrumpy in 
>>>>>> the case where it can not be resolved. Tbh, I think this makes 
>>>>>> things a little more clear as well as far as where/how the 
>>>>>> exception for deref of a partial type is thrown.
>>>>>>
>>>>>> Results on my machine before the optimization are:
>>>>>>
>>>>>> Benchmark                        Mode  Cnt    Score Error Units
>>>>>> GetStruct.jni_baseline           avgt   50   14.204 ▒ 0.566 ns/op
>>>>>> GetStruct.panama_get_both        avgt   50  507.638 ▒ 19.462 ns/op
>>>>>> GetStruct.panama_get_fieldonly   avgt   50   90.236 ▒ 11.027 ns/op
>>>>>> GetStruct.panama_get_structonly  avgt   50  370.783 ▒ 13.744 ns/op
>>>>>>
>>>>>> And after:
>>>>>>
>>>>>> Benchmark                        Mode  Cnt   Score Error Units
>>>>>> GetStruct.jni_baseline           avgt   50  13.941 ▒ 0.485 ns/op
>>>>>> GetStruct.panama_get_both        avgt   50  41.199 ▒ 1.632 ns/op
>>>>>> GetStruct.panama_get_fieldonly   avgt   50  33.432 ▒ 1.889 ns/op
>>>>>> GetStruct.panama_get_structonly  avgt   50  13.469 ▒ 0.781 ns/op
>>>>>>
>>>>>> Where panama_get_structonly corresponds to 1., and 
>>>>>> panama_get_fieldonly corresponds to 2. For a total of about 12x 
>>>>>> speedup.
>>>>>>
>>>>>> Thanks,
>>>>>> Jorn
>>>>>>
>>>>>> [1] : 
>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html 
>>>>>> [2] : https://openjdk.java.net/jeps/230


More information about the panama-dev mailing list