[vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes
Kharbas, Kishor
kishor.kharbas at intel.com
Thu May 30 20:34:33 UTC 2019
Hi Joshua,
Thanks for the patch. Here is the performance of the jmh benchmark on Intel CLX.
Results with slow path are as expected - having an uncommon trap over a regular check does give performance benefit.
I am not sure why the performance gain in fast path (for 1000 case). The same is observed in ARM's case. Do you know why?
Results on Intel CLX:
Base without expectTrue (patch [1]) UncommonTrap (patch [2])
1000 fastPath 416.251 ± 3.086 538.024 ± 3.415 538.871 ± 2.779
10000 fastPath 29.426 ± 3.625 30.389 ± 1.389 30.673 ± 0.354
100000 fastPath 2.018 ± 0.060 2.013 ± 0.005 2.045 ± 0.063
1000 fastPath + 1 slowPath N/A 70.835 ± 10.902 72.236 ± 0.797
10000 fastPath + 1 slowPath N/A 6.512 ± 0.466 19.276 ± 1.769
100000 fastPath + 1 slowPath N/A 0.756 ± 0.016 1.955 ± 0.017
Thanks,
Kishor
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Thursday, May 30, 2019 8:43 AM
> To: Joshua Zhu (Arm Technology China) <Joshua.Zhu at arm.com>; panama-
> dev at openjdk.java.net
> Cc: nd <nd at arm.com>; Kharbas, Kishor <kishor.kharbas at intel.com>
> Subject: Re: [vector] RFR 8221816: IndexOutOfBoundsException for
> fromArray/intoArray with unset mask lanes - was: RE:
> IndexOutOfBoundsException with unset mask lanes
>
> Nice work, Joshua! I like how it shapes out.
>
> Good to see some performance numbers as well.
>
> Kishor did some experiments with masked memory operations, so I'd like to
> hear his thoughts on the patch.
>
> Overall, I'm happy with the change as it is now. It solves the correctness issue
> without sacrificing performance in most of the cases.
>
> The only important case which suffers is "masked loops" - when masked
> kernel is used to avoid explicit loop tail processing. Such loops will end with a
> deopt most of the time (unless loop tail is empty). So, as a next step, I'd like
> to see that case to be addressed, so such loops don't trigger uncommon trap.
>
> In a longer term, it's worth looking into specific optimizations range checks
> for masked accesses.
>
> Some comments about the code changes:
>
> src/hotspot/share/opto/opaquenode.hpp:
>
> You have some changes in ProfileBooleanNode. Why providing 0 as a
> false_count is not enough to get uncommon trap on corresponding branch?
>
>
> src/hotspot/share/opto/doCall.cpp:
>
> bool Compile::should_delay_vector_inlining(ciMethod* call_method,
> JVMState* jvms) {
> - return UseVectorApiIntrinsics && call_method->is_vector_method();
> + return UseVectorApiIntrinsics && call_method->is_vector_method() &&
> + call_method->intrinsic_id() != vmIntrinsics::_ExpectTrue;
>
> Why do you need to delay inlining for it?
>
>
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-
> VectorBits.java.template:
>
> - if (ax >= 0 && ax + LENGTH <= a.length) {
> + if (expectTrue(ax >= 0 && ax + LENGTH <= a.length)) {
>
> Why don't you use VectorIntrinsics.checkIndex() here?
>
> Best regards,
> Vladimir Ivanov
>
> On 10/05/2019 09:41, Joshua Zhu (Arm Technology China) wrote:
> > Hi Vladimir,
> >
> >> It looks promising to introduce a variant of
> >> VectorIntrinsics.checkIndex() which is used to guard a fast path and is
> >> annotated with a JIT-compiler hint (akin to
> >> java.lang.invoke.MethodHandleImpl.profileBoolean() [1], but without
> >> profiling logic) to override bytecode profiling info, so JIT always puts an
> >> uncommon trap on the false branch.
> >
> > Thanks for your comments.
> > As you suggested, I introduced VectorIntrinsics.expectTrue() in change [2].
> > It's used as below:
> > if (expectTrue(bool condition)) {
> > // fast path
> > } else {
> > // slow path: uncommon trap
> > }
> >
> > I also wrote a jmh case [3] to check the performance.
> > See below table for jmh test results. (In Throughput Mode, Unit: ops/ms)
> > Base without expectTrue (patch [1]) UncommonTrap
> (patch [2])
> > 1000 fastPath 318.228 ± 22.588 457.967 ± 12.622 457.328 ±
> 11.932
> > 10000 fastPath 21.991 ± 2.496 23.360 ± 0.070 24.744 ±
> 0.213
> > 100000 fastPath 1.613 ± 0.007 1.581 ± 0.031 1.631 ± 0.003
> > 1000 fastPath + 1 slowPath N/A 57.298 ± 11.033 55.845 ±
> 0.716
> > 10000 fastPath + 1 slowPath N/A 4.537 ± 0.536 15.164 ±
> 0.098
> > 100000 fastPath + 1 slowPath N/A 0.577 ± 0.048 1.564 ±
> 0.005
> >
> > [1] http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/webrev.01/
> > [2]
> http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/uncommontrap.w
> ebrev.00/
> > [3]
> http://cr.openjdk.java.net/~jzhu/vectorapi/8221816.OOB/IntVectorJmhTest
> .java
> >
> > Please help review and feel free to share your comments.
> > Thanks.
> >
> > Best Regards,
> > Joshua
> >
More information about the panama-dev
mailing list