RFR: 8264288: Performance issue with MethodHandle.asCollector
Jorn Vernee
jvernee at openjdk.java.net
Thu Apr 1 17:01:51 UTC 2021
This patch speeds up MethodHandle.asCollector handles where the array type is not Object[], as well as speeding up all collectors where the arity is greater than 10.
The old code is creating a collector handle by combining a set of hard coded methods for collecting arguments into an Object[], up to a maximum of ten elements, and then copying this intermediate array into a final array.
In principle, it shouldn't matter how slow (or fast) this handle is, because it should be replaced by existing bytecode intrinsification which does the right thing. But, through investigation it turns out that the intrinsification is only being applied in a very limited amount of cases: Object[] with max 10 elements only, only for the intermediate collector handles. Every other collector shape incurs overhead because it essentially ends up calling the ineffecient fallback handle.
Rather than sticking on a band aid, the new code replaces the existing implementation with a collector handle implemented using a LambdaForm, which removes the need for intrinsification, and also greatly reduces code-complexity of the implementation. (I plan to expose this collector using a public API in the future as well, so users don't have to go through MHs::identity to make a collector).
The old code was also using a special lambda form transform for collecting arguments into an array. I believe this was done to take advantage of the existing-but-limited bytecode intrinsification, at least for Object[] with less than 10 elements.
The new code just uses the existing collect arguments transform with the newly added collector handle as filter, and this works just as well for the existing case, but as a bonus is also much simpler, since no separate transform is needed. Using the collect arguments transform should also improve sharing.
As a result of these changes a lot of code was unused and has been removed in this patch.
Testing: tier 1-3, benchmarking using TypedAsCollector (part of the patch), as well as another variant of the benchmark that used a declared static identity method instead of MHs::identity (not included). Before/after comparison of MethodHandleAs* benchmarks (no differences there).
Here are some numbers from the added benchmark:
Before:
Benchmark Mode Cnt Score Error Units
TypedAsCollector.testIntCollect avgt 30 189.156 � 1.796 ns/op
TypedAsCollector.testIntCollectHighArity avgt 30 660.549 � 10.159 ns/op
TypedAsCollector.testObjectCollect avgt 30 7.092 � 0.042 ns/op
TypedAsCollector.testObjectCollectHighArity avgt 30 65.225 � 0.546 ns/op
TypedAsCollector.testStringCollect avgt 30 28.511 � 0.243 ns/op
TypedAsCollector.testStringCollectHighArity avgt 30 57.054 � 0.635 ns/op
(as you can see, just the Object[] with arity less than 10 case is fast here)
After:
Benchmark Mode Cnt Score Error Units
TypedAsCollector.testIntCollect avgt 30 6.569 � 0.131 ns/op
TypedAsCollector.testIntCollectHighArity avgt 30 8.923 � 0.066 ns/op
TypedAsCollector.testObjectCollect avgt 30 6.813 � 0.035 ns/op
TypedAsCollector.testObjectCollectHighArity avgt 30 9.718 � 0.066 ns/op
TypedAsCollector.testStringCollect avgt 30 6.737 � 0.016 ns/op
TypedAsCollector.testStringCollectHighArity avgt 30 9.618 � 0.052 ns/op
Thanks,
Jorn
-------------
Commit messages:
- WIP - clean up
- Completely implement collect in lambdaform
- WIP - fix MH collector perf, with jumbo arity bandaid
Changes: https://git.openjdk.java.net/jdk/pull/3306/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3306&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8264288
Stats: 513 lines in 8 files changed: 154 ins; 337 del; 22 mod
Patch: https://git.openjdk.java.net/jdk/pull/3306.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/3306/head:pull/3306
PR: https://git.openjdk.java.net/jdk/pull/3306
More information about the core-libs-dev
mailing list