[intrinsics]: performance before after (String::format)
Alex Buckley
alex.buckley at oracle.com
Fri Feb 22 21:46:20 UTC 2019
Hi Vicente,
Thanks for this nice testing. I am interested in the following results:
1. Variable strings
Benchmark                        Intrinsified  Vanilla  Speedup
testStringFormat1VariableStr            10443     2394      44x
testStringFormat2VariableStrs            9497       88     109x
testStringFormat100VariableStrs           148        3      46x
With no type conversions in the mix, any speedup comes from evaluating
the format string at compile time and emitting a series of string
concatenations for run time. Each and every concat emitted under this
intrinsification scheme is a win relative to vanilla invocation.
1VariableStr does 1 concat (arg + space) while 2VariableStrs does 3
concats (arg1 + text + arg2 + space), so 2VariableStrs should win 3x as
much due to intrinsification; its measured speedup is about 2.5x that
of 1VariableStr (109x/44x), which is in the same ballpark. Is that the
right way to think about what's going on?
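
To make the concat counting concrete, here is a minimal sketch of what I
have in mind. The format strings "%s " and "%s: %s ", and all class and
method names, are assumptions for illustration only; they are not taken
from the attached benchmark, and the concat methods only approximate what
the intrinsified code might look like:

```java
public class FormatVsConcat {
    // Vanilla path: the format string is parsed at run time on every call.
    static String vanilla1(String a) {
        return String.format("%s ", a);
    }

    // Assumed shape after intrinsification: 1 concat (arg + space).
    static String concat1(String a) {
        return a + " ";
    }

    // Vanilla path with two string arguments.
    static String vanilla2(String a, String b) {
        return String.format("%s: %s ", a, b);
    }

    // Assumed shape after intrinsification: 3 concats (arg1 + text + arg2 + space).
    static String concat2(String a, String b) {
        return a + ": " + b + " ";
    }

    public static void main(String[] args) {
        // Both paths should produce identical strings.
        System.out.println(vanilla1("Bob").equals(concat1("Bob")));
        System.out.println(vanilla2("Bob", "Alice").equals(concat2("Bob", "Alice")));
    }
}
```

The point is only that the run-time work of parsing the format string
disappears; how the concatenations are actually emitted is up to the
implementation.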
2. Constant ints
Benchmark                        Intrinsified  Vanilla  Speedup
testStringFormat1ConstantInt            10023      225      45x
testStringFormat2ConstantInt             9636      126      77x
testStringFormat100ConstantInts            10        3       4x
You're no longer taking advantage of all-constant arguments to perform
the invocation wholly at compile time, so the low speedup for 100
arguments is to be expected. It's nice to see the speedup column
following the same pattern for constant versus non-constant arguments:
Benchmark                        Intrinsified  Vanilla  Speedup
testStringFormat1VariableInt            10037      194      52x
testStringFormat2VariableInts            9665      114      85x
testStringFormat100VariableInts            10        3       3x
Presumably, type conversions at run time are making the 2*Int cases give
less of a speedup over 1*Int (85x/52x above) than 2*Str gives over 1*Str
(109x/44x).
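
For reference, the two ratios quoted above can be worked out from the
Speedup columns alone. This is just arithmetic on the rounded figures in
the tables; the class and method names are illustrative:

```java
import java.util.Locale;

public class SpeedupRatios {
    // How much larger the 2-argument speedup is than the 1-argument speedup.
    static double ratio(double speedupTwoArgs, double speedupOneArg) {
        return speedupTwoArgs / speedupOneArg;
    }

    public static void main(String[] args) {
        // Speedup figures copied from the tables above.
        System.out.println(String.format(Locale.ROOT, "Str: %.2f", ratio(109, 44)));
        System.out.println(String.format(Locale.ROOT, "Int: %.2f", ratio(85, 52)));
    }
}
```

So the Str ratio comes out near 2.5 and the Int ratio near 1.6.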
Alex
On 2/22/2019 9:33 AM, Vicente Romero wrote:
> Hi,
>
> I have run some performance tests on the intrinsics code to compare
> before and after. Please find the benchmark results and the JMH-based
> benchmark attached. This benchmark is based on a previous one
> written by Hannes. It compares execution on the JDK
> built from [1], referred to here as JDK13, with [2], the amber repo,
> branch `intrinsics-project`.
>
> Some conclusions from the benchmark results:
>
> * in all cases for which intrinsified code is produced, it is faster
> than the vanilla JDK13 code
> * there are wide variations though
>
> For example, the test `testStringFormatBoxedArray`, which
> benchmarks the performance of `String.format("%s: %d ",
> args);` where args is `static final Object[] args = { "Bob", i23 };`,
> shows basically no visible gain, because in this case the intrinsification
> bails out and produces the same code as vanilla JDK13. This result is
> expected. The test with the next smallest gain is
> `testStringFormat1ConstantFloat`, which tests:
>
> `String.format("%g", 1.0)`
>
> The execution is ~2.5 times faster in the intrinsified version, but that
> is nothing compared to `testStringFormat1ConstantStr`, which is ~40 times
> faster. Another interesting conclusion is that the improvement fades
> with the number of parameters in some cases but stays constant in
> others. For example, the speedup is about the same for concatenating 1
> or 100 strings, but formatting one primitive int is ~45 times faster
> versus a ~3.5x improvement when formatting a hundred.
>
> I have also attached the table I used to play with the numbers.
>
> Thanks,
>
> Vicente
>
> [1] http://hg.openjdk.java.net/jdk/jdk
>
> [2] http://hg.openjdk.java.net/amber/amber
>