RFR: JDK-8224963: Char-Byte Performance Enhancement
Adam Farley8
adam.farley at uk.ibm.com
Tue Jun 11 09:04:04 UTC 2019
Hi Vladimir,
Yes, that is what I see locally as well.
Merely patching the class library code will not effect any performance
enhancement.
Quite the opposite. :)
The changes in the class library were designed to work in tandem with some
changes to the OpenJ9 JIT compiler, which is what accelerates performance.
Naturally, investigating and sharing OpenJ9 logic is a bad idea due to
contamination, so we're stuck with investigating the CL changes to
identify any potential value to OpenJDK with Hotspot.
You mentioned:
> >> Among all options, I'm in favor of enhancing C2 to produce better code.
> >> Then on my preference list goes rewriting JDK code to make it amenable
> >> to missing optimizations (the patch you propose). And, as a last resort,
> >> I'd consider introducing new intrinsics.
I wouldn't know where to start with enhancing C2, and since intrinsics are
dead last, that leaves "missing optimizations".
Is there a guide to code patterns the Hotspot JIT matches on for
optimizations?
Seems like that'd be a useful thing to have in many circumstances.
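For anyone following along, here is a minimal sketch of the kind of measurement under discussion: repeated char-to-byte conversion through the nio US_ASCII encoder for a given buffer size. It is not the linked ASCIIEncodingBenchmark.java; the class name AsciiEncodeSketch, the helper names, and the iteration count are assumptions made for illustration, and the naive timing loop is no substitute for the JMH port.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the benchmark linked in this thread.
public class AsciiEncodeSketch {

    // Fill an array with lowercase ASCII letters so every char is encodable.
    public static char[] asciiChars(int numberOfChars) {
        char[] chars = new char[numberOfChars];
        for (int i = 0; i < numberOfChars; i++) {
            chars[i] = (char) ('a' + (i % 26));
        }
        return chars;
    }

    // One char-to-byte conversion through the nio US_ASCII encoder.
    public static byte[] encodeOnce(CharsetEncoder encoder, char[] chars) {
        ByteBuffer out = ByteBuffer.allocate(chars.length);
        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(chars), out, true);
        if (result.isError() || result.isOverflow()) {
            throw new IllegalStateException("unexpected coder result: " + result);
        }
        encoder.flush(out);
        return out.array();
    }

    public static void main(String[] args) {
        // Buffer size per encoding; 4096 is an assumed default.
        int numberOfChars = args.length > 0 ? Integer.parseInt(args[0]) : 4096;
        int iterations = 200_000; // assumed value, chosen arbitrarily
        char[] chars = asciiChars(numberOfChars);
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();

        long sink = 0; // consume results so the work is not trivially dead
        for (int i = 0; i < iterations; i++) { // warm-up pass
            sink += encodeOnce(encoder, chars).length;
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) { // timed pass
            sink += encodeOnce(encoder, chars).length;
        }
        long elapsedNanos = System.nanoTime() - start;
        System.out.printf("%d chars: %.1f ops/s (sink=%d)%n",
                numberOfChars, iterations * 1e9 / elapsedNanos, sink);
    }
}
```

Run it as "java AsciiEncodeSketch <num of chars in buffer>" on the original and patched JDK builds to get a rough comparison; the JMH numbers quoted below remain the more trustworthy measurement.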
Best Regards
Adam Farley
IBM Runtimes
Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote on 05/06/2019 19:15:27:
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> To: Adam Farley8 <adam.farley at uk.ibm.com>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Date: 05/06/2019 19:15
> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
>
> Thanks, Adam.
>
> I ported ASCIIEncodingBenchmark to JMH [1] and then tried your patch [2]
> on x86 (Skylake). Unfortunately, I couldn't reproduce any improvements.
> Moreover, the patched version is slower:
>
> ASCIIEncodingBenchmark.charToByte
>
> BEFORE
>
> (numberOfChars)   Mode  Cnt          Score        Error  Units
>               0  thrpt    5  111413694.893 ± 763410.998  ops/s
>               1  thrpt    5   87992741.333 ± 148702.045  ops/s
>              16  thrpt    5   53456010.326 ± 100007.754  ops/s
>              64  thrpt    5   27901519.758 ±  94808.481  ops/s
>             512  thrpt    5    4958471.149 ±  14774.009  ops/s
>            4096  thrpt    5     600449.051 ±  37885.719  ops/s
>            8192  thrpt    5     269098.868 ±   1634.533  ops/s
>           65536  thrpt    5      38855.167 ±     92.170  ops/s
>         1048576  thrpt    5       2630.800 ±      5.076  ops/s
>
>
> AFTER (w/ [2] applied)
>
> (numberOfChars)   Mode  Cnt          Score        Error  Units
>               0  thrpt    5  119674686.849 ± 510819.775  ops/s
>               1  thrpt    5   80067544.958 ± 176550.132  ops/s
>              16  thrpt    5   47836555.989 ± 137233.882  ops/s
>              64  thrpt    5   22814214.962 ±  80747.066  ops/s
>             512  thrpt    5    3686203.220 ±  38087.643  ops/s
>            4096  thrpt    5     489024.453 ±  55092.933  ops/s
>            8192  thrpt    5     243057.291 ±   1483.443  ops/s
>           65536  thrpt    5      30503.779 ±     43.282  ops/s
>         1048576  thrpt    5       1879.556 ±     10.168  ops/s
>
> Best regards,
> Vladimir Ivanov
>
> [1] https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Evlivanov_afarley_8224963_benchmarks_src_main_java_org_benchmark_ASCIIEncodingBenchmark.java&d=DwID-g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf-CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=DhftUaO7rUKBkPBr7fj_QKxuhGMMaGVLpOOaH3S32mQ&s=bJmKa3fafyoSYAtBdtVGwupNh-2GBKXqmq1N3Xlk7dM&e=
>
> [2] https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Eafarley_8224963_webrev_&d=DwID-g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf-CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=DhftUaO7rUKBkPBr7fj_QKxuhGMMaGVLpOOaH3S32mQ&s=FjbcOPZ3tu8D4vAty24tvlqXLT8-1urgxzI3PNHnwr4&e=
>
> On 31/05/2019 19:07, Adam Farley8 wrote:
> > Hi Vladimir,
> >
> > Here's a minimised version of the benchmark, which converts chars to
> > bytes using nio.
> >
> > I found that the conversion rates are similar between Hotspot and OpenJ9
> > for encoding single-character buffers, and that the difference becomes
> > palpable as you increase the size of the buffer. 4096-char buffers, for
> > example, show the 6x difference I mentioned earlier.
> >
> > This makes sense to me, as we're spending less time messing around with
> > objects at the test level, and more time actually utilising the encoding
> > code.
> >
> > You should just be able to run the benchmark on the command line.
> >
> > "java ASCIIEncodingBenchmark <num of chars in buffer, per encoding>"
> >
> > Benchmark code:
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Eafarley_8224963_ASCIIEncodingBenchmark.java&d=DwID-g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf-CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=DhftUaO7rUKBkPBr7fj_QKxuhGMMaGVLpOOaH3S32mQ&s=NZb-FdMcPTYj5JaL96_u4pjbApNczpZcIf28okScqqE&e=
> >
> > If you need a microbenchmark for a specific framework, name it and I'll
> > get it done.
> >
> > Or one of my team will get it done. Off for a week. :)
> >
> > Best Regards
> >
> > Adam Farley
> > IBM Runtimes
> >
> >
> > Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote on 29/05/2019 17:19:36:
> >
> >> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> >> To: Adam Farley8 <adam.farley at uk.ibm.com>
> >> Cc: hotspot-compiler-dev at openjdk.java.net
> >> Date: 29/05/2019 17:23
> >> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
> >>
> >> Adam,
> >>
> >> Among all options, I'm in favor of enhancing C2 to produce better code.
> >> Then on my preference list goes rewriting JDK code to make it amenable
> >> to missing optimizations (the patch you propose). And, as a last resort,
> >> I'd consider introducing new intrinsics.
> >>
> >> The microbenchmarks would help understand what pieces are missing in C2
> >> and decide how to proceed.
> >>
> >> I haven't had HotSpot vs J9 comparison in mind, but in the absence of
> >> available benchmarks, comparing generated code (by C2) between the original
> >> and updated JDK versions would help understand what goes wrong.
> >>
> >> Best regards,
> >> Vladimir Ivanov
> >>
> >> On 29/05/2019 17:53, Adam Farley8 wrote:
> >> > Hi Vladimir,
> >> >
> >> > I have a locally-written performance test I used to get the "6x".
> >> > Will chase up with the guy who wrote it to see if I can share it.
> >> > If not, I'll write a new one.
> >> >
> >> > As for the enhancements, two options are:
> >> >
> >> > - matching on the new method names, and replacing the inner logic
> >> > with some souped-up version of said logic.
> >> >
> >> > - alter the code to match on one of the C2 idioms, though I imagine
> >> > if it were that simple, OpenJDK would come with a list of said
> >> > idioms so everything people write can be easily accelerated by the
> >> > JIT.
> >> >
> >> > As for how OpenJ9 does it specifically, I don't know, and I suspect
> >> > it's safer if I don't find out, contamination-wise.
> >> >
> >> > Does any of that help?
> >> >
> >> > Best Regards
> >> >
> >> > Adam Farley
> >> > IBM Runtimes
> >> >
> >> >
> >> > Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote on 29/05/2019 13:22:27:
> >> >
> >> >> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> >> >> To: Adam Farley8 <adam.farley at uk.ibm.com>, hotspot-compiler-dev at openjdk.java.net
> >> >> Date: 29/05/2019 13:22
> >> >> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
> >> >>
> >> >> Hi Adam,
> >> >>
> >> >> The bug mentions ~6x improvement in throughput. Are there any
> >> >> microbenchmarks you can share which demonstrate that? That would greatly
> >> >> simplify the analysis of changes you propose.
> >> >>
> >> >> Also, if you can elaborate on what optimization opportunities C2 misses
> >> >> in original code, please, do.
> >> >>
> >> >> Best regards,
> >> >> Vladimir Ivanov
> >> >>
> >> >> On 29/05/2019 12:45, Adam Farley8 wrote:
> >> >> > Hi All,
> >> >> >
> >> >> > Could someone familiar with the Hotspot JIT please review and opine on
> >> >> > the below?
> >> >> >
> >> >> > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs
> >> >> > classes (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9
> >> >> > VM seems to do this a lot faster.
> >> >> >
> >> >> > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by
> >> >> > tweaking the CL code to match Hotspot JIT compiler idioms, or by
> >> >> > introducing a method name for the HS JIT to match on?
> >> >> >
> >> >> > An example of these changes to US_ASCII.java is linked below. No OpenJ9
> >> >> > code is included in the work item or the webrev, to avoid contamination.
> >> >> >
> >> >> > Work item: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8224963&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf-CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf-PzjFCXb4&s=2ChxP3IE0tkvevxSXfil3PGlpEHkUPxgwMxHH5J-A34&e=
> >> >> >
> >> >> > Example Webrev: _https://urldefense.proofpoint.com/v2/url?u=http-3A__cr.openjdk.java.net_-7Eafarley_8224963_webrev_-5F&d=DwIC-g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf-CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf-PzjFCXb4&s=fCeNvvk3Fehc6ssZfoNkJao_NJyoxeov7cxiyMSvuwQ&e=
> >> >> >
> >> >> > Best Regards
> >> >> >
> >> >> > Adam Farley
> >> >> > IBM Runtimes
> >> >> >
> >> >> > Unless stated otherwise above:
> >> >> > IBM United Kingdom Limited - Registered in England and Wales with number
> >> >> > 741598.
> >> >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> >> >>
> >> >
> >>
> >
>