RFR: JDK-8224963: Char-Byte Performance Enhancement
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Jun 5 18:15:27 UTC 2019
Thanks, Adam.
I ported ASCIIEncodingBenchmark to JMH [1] and then tried your patch [2]
on x86 (Skylake). Unfortunately, I couldn't reproduce any improvement.
Moreover, the patched version is slower at every non-empty buffer size:
ASCIIEncodingBenchmark.charToByte

BEFORE

(numberOfChars)   Mode  Cnt          Score        Error  Units
              0  thrpt    5  111413694.893 ± 763410.998  ops/s
              1  thrpt    5   87992741.333 ± 148702.045  ops/s
             16  thrpt    5   53456010.326 ± 100007.754  ops/s
             64  thrpt    5   27901519.758 ±  94808.481  ops/s
            512  thrpt    5    4958471.149 ±  14774.009  ops/s
           4096  thrpt    5     600449.051 ±  37885.719  ops/s
           8192  thrpt    5     269098.868 ±   1634.533  ops/s
          65536  thrpt    5      38855.167 ±     92.170  ops/s
        1048576  thrpt    5       2630.800 ±      5.076  ops/s

AFTER (w/ [2] applied)

(numberOfChars)   Mode  Cnt          Score        Error  Units
              0  thrpt    5  119674686.849 ± 510819.775  ops/s
              1  thrpt    5   80067544.958 ± 176550.132  ops/s
             16  thrpt    5   47836555.989 ± 137233.882  ops/s
             64  thrpt    5   22814214.962 ±  80747.066  ops/s
            512  thrpt    5    3686203.220 ±  38087.643  ops/s
           4096  thrpt    5     489024.453 ±  55092.933  ops/s
           8192  thrpt    5     243057.291 ±   1483.443  ops/s
          65536  thrpt    5      30503.779 ±     43.282  ops/s
        1048576  thrpt    5       1879.556 ±     10.168  ops/s
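
To make the comparison easier to read, the relative change per buffer size can be computed directly from the scores above. The following is only a small sketch: the class and method names (RelativeChange, changePercent) are made up for illustration, and the numbers are copied verbatim from the two tables. It ignores the reported error margins.

```java
public class RelativeChange {
    // numberOfChars values and throughput scores (ops/s), copied from the tables above.
    static final int[] SIZES = {0, 1, 16, 64, 512, 4096, 8192, 65536, 1048576};
    static final double[] BEFORE = {
        111413694.893, 87992741.333, 53456010.326, 27901519.758, 4958471.149,
        600449.051, 269098.868, 38855.167, 2630.800
    };
    static final double[] AFTER = {
        119674686.849, 80067544.958, 47836555.989, 22814214.962, 3686203.220,
        489024.453, 243057.291, 30503.779, 1879.556
    };

    // Relative throughput change in percent; a negative value means the
    // patched build completed fewer operations per second.
    static double changePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%8d chars: %+6.1f%%%n",
                    SIZES[i], changePercent(BEFORE[i], AFTER[i]));
        }
    }
}
```

With these inputs the empty-buffer case comes out about 7% faster with the patch, while the largest buffer (1048576 chars) is about 29% slower.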
Best regards,
Vladimir Ivanov
[1] http://cr.openjdk.java.net/~vlivanov/afarley/8224963/benchmarks/src/main/java/org/benchmark/ASCIIEncodingBenchmark.java
[2] http://cr.openjdk.java.net/~afarley/8224963/webrev/
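
For context, the kind of operation being measured (encoding a buffer of chars to bytes through java.nio with the US_ASCII charset) can be sketched roughly as below. This is not the code from [1]: the class name AsciiEncodeSketch, the fill character and the buffer setup are assumptions made for illustration, and there is no timing harness.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiEncodeSketch {
    // Encodes numberOfChars ASCII characters into a heap byte buffer and
    // returns the number of bytes written.
    static int encode(int numberOfChars) {
        char[] chars = new char[numberOfChars];
        Arrays.fill(chars, 'a'); // assumption: plain ASCII input only
        CharBuffer in = CharBuffer.wrap(chars);
        ByteBuffer out = ByteBuffer.allocate(numberOfChars);

        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        CoderResult result = encoder.encode(in, out, true);
        if (result.isError()) {
            throw new IllegalStateException("encoding failed: " + result);
        }
        return out.position();
    }

    public static void main(String[] args) {
        // Buffer size in chars; 4096 is the size quoted for the 6x difference.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 4096;
        System.out.println(encode(n) + " bytes encoded");
    }
}
```

A real measurement would call this in a loop under a harness such as JMH, with warm-up, so that C2 has compiled the encoding loop before any timing starts.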
On 31/05/2019 19:07, Adam Farley8 wrote:
> Hi Vladimir,
>
> Here's a minimised version of the benchmark, which converts chars to
> bytes using nio.
>
> I found that the conversion rates are similar between Hotspot and OpenJ9
> for encoding single-character buffers, and that the difference becomes
> palpable as you increase the size of the buffer. 4096-char buffers, for
> example, show the 6x difference I mentioned earlier.
>
> This makes sense to me, as we're spending less time messing around with
> objects at the test level, and more time actually utilising the encoding
> code.
>
> You should just be able to run the benchmark on the command line.
>
> "java ASCIIEncodingBenchmark <num of chars in buffer, per encoding>"
>
> Benchmark code:
> http://cr.openjdk.java.net/~afarley/8224963/ASCIIEncodingBenchmark.java
>
> If you need a microbenchmark for a specific framework, name it and I'll
> get it done.
>
> Or one of my team will get it done. Off for a week. :)
>
> Best Regards
>
> Adam Farley
> IBM Runtimes
>
>
> Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote on 29/05/2019 17:19:36:
>
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> To: Adam Farley8 <adam.farley at uk.ibm.com>
>> Cc: hotspot-compiler-dev at openjdk.java.net
>> Date: 29/05/2019 17:23
>> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
>>
>> Adam,
>>
>> Among all options, I'm in favor of enhancing C2 to produce better code.
>> Next on my preference list is rewriting JDK code to work around the
>> missing optimizations (the patch you propose). And, as a last resort,
>> I'd consider introducing new intrinsics.
>>
>> The microbenchmarks would help understand what pieces are missing in C2
>> and decide how to proceed.
>>
>> I didn't have a HotSpot vs J9 comparison in mind, but in the absence of
>> available benchmarks, comparing the code generated by C2 for the original
>> and updated JDK versions would help understand what goes wrong.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 29/05/2019 17:53, Adam Farley8 wrote:
>> > Hi Vladimir,
>> >
>> > I have a locally-written performance test I used to get the "6x".
>> > Will chase up with the guy who wrote it to see if I can share it.
>> > If not, I'll write a new one.
>> >
>> > As for the enhancements, two options are:
>> >
>> > - matching on the new method names, and replacing the inner logic
>> > with some souped-up version of said logic.
>> >
>> > - alter the code to match on one of the C2 idioms, though I imagine
>> > if it were that simple, OpenJDK would come with a list of said
>> > idioms so everything people write can be easily accelerated by the
>> > JIT.
>> >
>> > As for how OpenJ9 does it specifically, I don't know, and I suspect
>> > it's safer if I don't find out, contamination-wise.
>> >
>> > Does any of that help?
>> >
>> > Best Regards
>> >
>> > Adam Farley
>> > IBM Runtimes
>> >
>> >
>> > Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote on 29/05/2019 13:22:27:
>> >
>> >> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> >> To: Adam Farley8 <adam.farley at uk.ibm.com>, hotspot-compiler-dev at openjdk.java.net
>> >> Date: 29/05/2019 13:22
>> >> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
>> >>
>> >> Hi Adam,
>> >>
>> >> The bug mentions ~6x improvement in throughput. Are there any
>> >> microbenchmarks you can share which demonstrate that? That would greatly
>> >> simplify the analysis of the changes you propose.
>> >>
>> >> Also, if you can elaborate on what optimization opportunities C2 misses
>> >> in original code, please, do.
>> >>
>> >> Best regards,
>> >> Vladimir Ivanov
>> >>
>> >> On 29/05/2019 12:45, Adam Farley8 wrote:
>> >> > Hi All,
>> >> >
>> >> > Could someone familiar with the Hotspot JIT please review and opine on
>> >> > the below?
>> >> >
>> >> > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs
>> >> > classes (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9
>> >> > VM seems to do this a lot faster.
>> >> >
>> >> > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by
>> >> > tweaking the CL code to match Hotspot JIT compiler idioms, or by
>> >> > introducing a method name for the HS JIT to match on?
>> >> >
>> >> > An example of these changes to US_ASCII.java is linked below. No OpenJ9
>> >> > code is included in the work item or the webrev, to avoid contamination.
>> >> >
>> >> > Work item: https://bugs.openjdk.java.net/browse/JDK-8224963
>> >> >
>> >> > Example Webrev: http://cr.openjdk.java.net/~afarley/8224963/webrev/
>> >> >
>> >> > Best Regards
>> >> >
>> >> > Adam Farley
>> >> > IBM Runtimes
>> >> >
>> >> > Unless stated otherwise above:
>> >> > IBM United Kingdom Limited - Registered in England and Wales with number
>> >> > 741598.
>> >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>> >>
>> >
>>
>