[foreign] RFR 8224481: Optimize struct getter and field getter paths.

Jorn Vernee jbvernee at xs4all.nl
Tue May 21 13:41:05 UTC 2019


Hi,

After the recent string of benchmarking [1], I've arrived at 2 
optimizations to improve the speed of the measured code path.

1.) Specialization of Struct getter MethodHandles per struct class.
2.) Implementation of RuntimeSupport::casterImpl that does a fused cast 
and offset operation, to avoid creating multiple Pointer objects.
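
To illustrate the idea behind 2., here is a minimal sketch. It is not the
actual RuntimeSupport::casterImpl code: Ptr, castThenOffset and
fusedCastAndOffset are hypothetical names, and the allocation counter exists
only to make the difference visible.

```java
// Hypothetical stand-in for a typed native pointer; NOT the Panama Pointer API.
final class FusedCastSketch {
    // Counts Ptr objects created, for illustration only.
    static int allocations = 0;

    static final class Ptr {
        final long addr;
        final String type;

        Ptr(long addr, String type) {
            this.addr = addr;
            this.type = type;
            allocations++;
        }

        // Each step returns a fresh immutable object.
        Ptr cast(String newType) { return new Ptr(addr, newType); }
        Ptr offset(long bytes)   { return new Ptr(addr + bytes, type); }
    }

    // Two-step path: allocates an intermediate Ptr plus the result.
    static Ptr castThenOffset(Ptr base, String fieldType, long fieldOffset) {
        return base.cast(fieldType).offset(fieldOffset);
    }

    // Fused path: computes the final address and type with one allocation.
    static Ptr fusedCastAndOffset(Ptr base, String fieldType, long fieldOffset) {
        return new Ptr(base.addr + fieldOffset, fieldType);
    }
}
```

Both paths yield a pointer with the same address and type; the fused one
just skips the intermediate object on the field getter path.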

The benchmark: 
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
The optimizations: 
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/

I've split these into two webrevs so that it's easier to run the
benchmarks with and without the optimizations. (The benchmark uses
OpenJDK's built-in microbenchmark framework [2].)

Since we now instantiate the struct impl class more eagerly, I had to
work around partial struct types. Spinning the impl requires a
non-partial type, and we now spin the impl when creating the LayoutType
for the struct rather than on the first dereference. To handle this, I
detect whether the struct is partial in LayoutType.ofStruct and use a
Reference.OfGrumpy when it cannot be resolved. Tbh, I think this also
makes it a little clearer where and how the exception for dereferencing
a partial type is thrown.
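
A rough sketch of that shape, under loud assumptions: the real patch spins
an impl class and uses Reference.OfGrumpy, whereas this sketch only looks up
a constructor handle once per struct class. EagerStructTypeSketch, Point,
Opaque and getterFor are all hypothetical names, not Panama API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.LongFunction;

final class EagerStructTypeSketch {
    // Hypothetical complete struct: constructible from a native address.
    public static final class Point {
        final long addr;
        public Point(long addr) { this.addr = addr; }
    }

    // Hypothetical partial struct: no (long) constructor can be resolved.
    public static final class Opaque {
        private Opaque() {}
    }

    // Resolve the getter once, when the type descriptor is created,
    // instead of on the first dereference.
    static LongFunction<Object> getterFor(Class<?> structClass) {
        try {
            MethodHandle ctor = MethodHandles.lookup().findConstructor(
                structClass, MethodType.methodType(void.class, long.class));
            // Specialized per struct class: the handle is captured here.
            return addr -> {
                try {
                    return ctor.invoke(addr);
                } catch (Throwable t) {
                    throw new IllegalStateException(t);
                }
            };
        } catch (ReflectiveOperationException e) {
            // "Grumpy" fallback: creation succeeds, every dereference throws.
            return addr -> {
                throw new UnsupportedOperationException(
                    "Cannot dereference partial struct "
                        + structClass.getSimpleName());
            };
        }
    }
}
```

Creating the getter for Opaque succeeds, and the exception only shows up
when something actually tries to dereference it.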

Results on my machine before the optimization are:

Benchmark                        Mode  Cnt    Score    Error  Units
GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op

And after:

Benchmark                        Mode  Cnt   Score   Error  Units
GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op

Here panama_get_structonly corresponds to 1. (about 27x faster) and
panama_get_fieldonly corresponds to 2. (about 2.7x faster). Combined, as
measured by panama_get_both, that is a total speedup of about 12x.

Thanks,
Jorn

[1] : https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
[2] : https://openjdk.java.net/jeps/230

