RFR: 8213478: Reduce rebinds when applying repeated filters and conversions
Claes Redestad
claes.redestad at oracle.com
Wed Nov 7 14:57:33 UTC 2018
Hi,
by providing a LambdaFormEditor transform which applies the same filter
repeatedly on some selected arguments, we can optimize for cases where
there are repeated conversions. This allows the runtime to optimize the
internal setup of public API methods such as
MethodHandles.filterArguments(MethodHandle, int, MethodHandle...) and
MethodHandle.asType(MethodType)
Webrev: http://cr.openjdk.java.net/~redestad/8213478/jdk.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8213478
Example: mh.asType(mt1) where the mh.type is (JJJJ)I and mt1 is (IIII)I
would need to add I->J widening conversions to each argument. Each
application of such a conversion C creates a new BMH with a new species,
first (JIII)I, with C bound in as a 5th (or 6th, counting the MH)
argument, then from that we create (JJII)I with C bound in again as the
6th argument.. and so on.
With the optimization in place we can go from (IIII)I to (JJJJ)I in one
step: bind in C as the 5th argument, apply expressions to the LF to call
it on each parameter in order. In this example we then end up with only
one additional BMH species rather than four.
In synthetic examples invoking methods with maximal number of arguments
where all arguments need the same filter to be applied, the number of
intermediary species and LF classes generated can be reduced from up to
~290 to 6 (98% reduction)[1], while in less contrived examples the
reduction is more modest, down to no difference in number of shapes
generated in completely randomized tests. Repeated filters and
conversions are common in practice, for example in the ISC
implementation, where this patch modestly reduce the number of shapes
and LFs generated in a range of tests - from substantial reductions down
to around 2% for mostly random argument sequences[2]. Without a formal
proof I posit that the upper bound of generated species and LFs is the
same with both implementations, just that this reduces the need to
generate certain intermediaries.
The proposed patch could surely be micro-optimized, e.g., there are
cases where I'm copying arrays and creating TreeMaps around. I do this
to simplify insertion of arguments into the right place and ensure
execution happens in the expected order etc. Some of that could perhaps
be specialized, but these overheads are tiny compared to the class
generation overheads we're avoiding.
Thanks!
/Claes
[1] http://cr.openjdk.java.net/~redestad/8213478/BadInvoke.java
Before:
time java -Xlog:class+load BadInvoke | wc -l
819
real 0m0.753s
user 0m4.120s
sys 0m0.164s
After:
time java -Xlog:class+load BadInvoke | wc -l
540
real 0m0.267s
user 0m0.580s
sys 0m0.044s
[2] Generated ISC stress tests such as:
http://cr.openjdk.java.net/~redestad/8213478/MixedStringCombinations.java
- which loads and creates 37936 classes before, 37114 after, or -2.2%
More information about the core-libs-dev
mailing list