[foreign] RFR 8224481: Optimize struct getter and field getter paths.
Jorn Vernee
jbvernee at xs4all.nl
Tue May 21 13:41:05 UTC 2019
Hi,
After the recent string of benchmarking [1], I've arrived at 2
optimizations to improve the speed of the measured code path.
1.) Specialization of Struct getter MethodHandles per struct class.
2.) Implementation of RuntimeSupport::casterImpl that does a fused cast
and offset operation, to avoid creating multiple Pointer objects.
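As a rough illustration of the idea behind 2.), here is a minimal sketch of a fused cast and offset. The `Ptr` class, its `cast`/`offset` methods, and the allocation counter are hypothetical stand-ins made up for this example; they are not the Panama `Pointer` API or the actual `RuntimeSupport::casterImpl` code.

```java
// Sketch only: Ptr is a hypothetical stand-in, not the Panama Pointer API.
final class FusedCastSketch {
    // Counts how many Ptr objects get created, purely for illustration.
    static int allocations = 0;

    static final class Ptr {
        final long address;
        final String type; // illustrative type tag (assumption)

        Ptr(long address, String type) {
            this.address = address;
            this.type = type;
            allocations++;
        }

        // Each of these steps creates a new Ptr object.
        Ptr cast(String newType) { return new Ptr(address, newType); }
        Ptr offset(long bytes)   { return new Ptr(address + bytes, type); }
    }

    // Unfused path: cast, then offset. Two intermediate objects.
    static Ptr castThenOffset(Ptr base, String fieldType, long fieldOffset) {
        return base.cast(fieldType).offset(fieldOffset);
    }

    // Fused path: compute the final address and type in one step. One object.
    static Ptr castAndOffset(Ptr base, String fieldType, long fieldOffset) {
        return new Ptr(base.address + fieldOffset, fieldType);
    }
}
```

Both paths yield a pointer with the same address and type; the fused one just skips the intermediate object. Whether the JIT would already remove that intermediate object depends on inlining and escape analysis, so this only shows the shape of the change.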
The benchmark:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/bench/webrev.00/
The optimizations:
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8224481/opto/webrev.00/
I've split these into 2 so that it's easier to run the benchmarks with
and without the optimizations. (The benchmark uses OpenJDK's built-in
microbenchmark framework [2].)
Since we're now instantiating the struct impl class more eagerly, I had
to work around partial struct types. Spinning the impl requires a
non-partial type, and we now spin the impl when creating the
LayoutType for the struct, as opposed to on the first dereference. To do
this I'm detecting whether the struct is partial in LayoutType.ofStruct,
and using a Reference.OfGrumpy in the case where it cannot be resolved.
Tbh, I think this also makes it a little clearer where and how the
exception for dereferencing a partial type is thrown.
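The partial-type handling can be sketched roughly as follows. Everything here is a simplified assumption for illustration: `Ref`, `ThrowingRef`, the `isPartial` flag, and the placeholder getter are hypothetical and do not reflect the real LayoutType.ofStruct or Reference.OfGrumpy code.

```java
// Sketch only: all names below are hypothetical, not the Panama internals.
final class PartialStructSketch {
    // Minimal stand-in for a reference that can read a value at an address.
    interface Ref {
        Object get(long address);
    }

    // Stand-in for a reference that refuses every dereference.
    static final class ThrowingRef implements Ref {
        private final String structName;

        ThrowingRef(String structName) { this.structName = structName; }

        @Override
        public Object get(long address) {
            throw new UnsupportedOperationException(
                "Cannot dereference partial struct type: " + structName);
        }
    }

    // Decide up front, at type-creation time, which reference to hand out.
    static Ref ofStruct(String structName, boolean isPartial) {
        if (isPartial) {
            // The impl class cannot be spun for a partial type,
            // so the failure is deferred to the point of dereference.
            return new ThrowingRef(structName);
        }
        // Placeholder for the eagerly spun struct impl (assumption).
        return address -> structName + "@" + address;
    }
}
```

With this shape, creating the type for a partial struct still succeeds, and the exception comes from one well-defined place when the value is dereferenced.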
Results on my machine before the optimization are:
Benchmark                        Mode  Cnt    Score    Error  Units
GetStruct.jni_baseline           avgt   50   14.204 ±  0.566  ns/op
GetStruct.panama_get_both        avgt   50  507.638 ± 19.462  ns/op
GetStruct.panama_get_fieldonly   avgt   50   90.236 ± 11.027  ns/op
GetStruct.panama_get_structonly  avgt   50  370.783 ± 13.744  ns/op
And after:
Benchmark                        Mode  Cnt   Score   Error  Units
GetStruct.jni_baseline           avgt   50  13.941 ± 0.485  ns/op
GetStruct.panama_get_both        avgt   50  41.199 ± 1.632  ns/op
GetStruct.panama_get_fieldonly   avgt   50  33.432 ± 1.889  ns/op
GetStruct.panama_get_structonly  avgt   50  13.469 ± 0.781  ns/op
Where panama_get_structonly corresponds to 1., and panama_get_fieldonly
corresponds to 2. The combined path (panama_get_both) ends up about 12x
faster in total.
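For reference, the per-benchmark ratios can be recomputed from the scores in the two tables. This is just before/after division on the reported averages; the class and method names are made up for the example, and the error margins are ignored.

```java
import java.util.Locale;

// Sketch only: recomputes speedups from the reported average times (ns/op).
final class SpeedupSketch {
    static double speedup(double beforeNs, double afterNs) {
        return beforeNs / afterNs;
    }

    public static void main(String[] args) {
        // Scores copied from the before/after tables above.
        System.out.printf(Locale.ROOT, "panama_get_both:       %.1fx%n",
                speedup(507.638, 41.199));  // about 12.3x
        System.out.printf(Locale.ROOT, "panama_get_fieldonly:  %.1fx%n",
                speedup(90.236, 33.432));   // about 2.7x
        System.out.printf(Locale.ROOT, "panama_get_structonly: %.1fx%n",
                speedup(370.783, 13.469));  // about 27.5x
    }
}
```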
Thanks,
Jorn
[1] :
https://mail.openjdk.java.net/pipermail/panama-dev/2019-May/005469.html
[2] : https://openjdk.java.net/jeps/230