[vector] RFR 8221816: IndexOutOfBoundsException for fromArray/intoArray with unset mask lanes - was: RE: IndexOutOfBoundsException with unset mask lanes
Joshua Zhu (Arm Technology China)
Joshua.Zhu at arm.com
Fri May 31 12:32:54 UTC 2019
Hi Vladimir,
Thanks a lot for your review and comments.
> The only important case which suffers is "masked loops" - when masked
> kernel is used to avoid explicit loop tail processing. Such loops will end with a
> deopt most of the time (unless loop tail is empty). So, as a next step, I'd like
> to see that case to be addressed, so such loops don't trigger uncommon trap.
I agree. If masked kernel is used to avoid explicit loop tail processing, uncommon trap will not be suitable.
In my initial thought, in long term masked access should behave in the way like:
Main loop (fast path in my patch which succeed in range check) will work same with current implementation.
For slow path, masked operation will be intrinsified into corresponding instructions on mask-supported platform.
Otherwise it will go scalar.
But from review comments on my first patch [1], if mask operation is not directly supported by hardware,
scalar on slow path will cause problems if it can't prove access is always in-bounds and prune the slow path.
I think the problem is, without uncommon trap, VectorBox will be generated in fast path, for example, after blend for fromArray.
This results in Vector will be stored into memory instead of simd register. Is my understanding correct?
Anyway let's firstly solve the API correctness issue.
> src/hotspot/share/opto/opaquenode.hpp:
>
> You have some changes in ProfileBooleanNode. Why providing 0 as a
> false_count is not enough to get uncommon trap on corresponding branch?
When determine whether uncommon trap is suitable to generate, besides branch prediction probabilities,
Hotspot will also check if there are too many traps at the current method and bci by seems_stable_comparison(). [2]
> src/hotspot/share/opto/doCall.cpp:
>
> bool Compile::should_delay_vector_inlining(ciMethod* call_method,
> JVMState* jvms) {
> - return UseVectorApiIntrinsics && call_method->is_vector_method();
> + return UseVectorApiIntrinsics && call_method->is_vector_method() &&
> + call_method->intrinsic_id() != vmIntrinsics::_ExpectTrue;
>
> Why do you need to delay inlining for it?
For ProfileBooleanNode, profile has to be consumed before GVN elimination. [3]
Therefore expectTrue inlining should NOT be delayed.
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/X-
> VectorBits.java.template:
>
> - if (ax >= 0 && ax + LENGTH <= a.length) {
> + if (expectTrue(ax >= 0 && ax + LENGTH <= a.length)) {
>
> Why don't you use VectorIntrinsics.checkIndex() here?
Uncommon trap will not always be put on the false branch unless loop tail is empty. [4]
[1] http://mail.openjdk.java.net/pipermail/panama-dev/2019-April/005105.html
[2] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/parse2.cpp#l1644
[3] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/opaquenode.cpp#l87
[4] https://hg.openjdk.java.net/panama/dev/file/0bea74e4f0eb/src/hotspot/share/opto/library_call.cpp#l1287
Best Regards,
Joshua
More information about the panama-dev
mailing list