VectorBox enabling
Lupusoru, Razvan A
razvan.a.lupusoru at intel.com
Fri Nov 17 22:52:58 UTC 2017
Hey Vladimir,
I have noticed that inlining while the graph is being created is not equivalent to late inlining. In the following example, I see different behavior:
static <S extends Vector.Shape<Vector<?, ?>>> void VecSaxpy(FloatVector.FloatSpecies<S> fspec, float[] a, int a_offset,
                                                            float[] b, int b_offset, float alpha) {
    FloatVector<S> alphaVec = fspec.broadcast(alpha);
    for (int i = 0; (i + a_offset + fspec.length()) < a.length && (i + b_offset + fspec.length()) < b.length; i += fspec.length()) {
        FloatVector<S> bv = fspec.fromArray(b, i + b_offset);
        FloatVector<S> av = fspec.fromArray(a, i + a_offset);
        bv.add(av.mul(alphaVec)).intoArray(b, i + b_offset);
    }
}
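For comparing results between the two inlining modes, a plain scalar loop with the same bounds can serve as a reference. The following is only a sketch: the class name ScalarSaxpy and the `lanes` parameter (a stand-in for fspec.length(), e.g. 8 floats for a 256-bit vector) are assumptions for illustration and are not part of the patch.

```java
public class ScalarSaxpy {
    // Scalar reference for VecSaxpy: b[k + bOffset] += alpha * a[k + aOffset].
    // `lanes` is a hypothetical stand-in for fspec.length().
    public static void saxpy(int lanes, float[] a, int aOffset,
                             float[] b, int bOffset, float alpha) {
        // Same loop condition as the vector version, so the same elements are touched.
        for (int i = 0;
             (i + aOffset + lanes) < a.length && (i + bOffset + lanes) < b.length;
             i += lanes) {
            // One "vector" worth of elements, processed lane by lane.
            for (int j = 0; j < lanes; j++) {
                b[i + bOffset + j] += a[i + aOffset + j] * alpha;
            }
        }
    }

    public static void main(String[] args) {
        float[] a = new float[20];
        float[] b = new float[20];
        java.util.Arrays.fill(a, 1.0f);
        java.util.Arrays.fill(b, 2.0f);
        saxpy(8, a, 0, b, 0, 3.0f);
        // Elements 0..15 are updated; the tail 16..19 is left untouched.
        System.out.println(b[0] + " " + b[15] + " " + b[16]);
    }
}
```

The sketch deliberately keeps the strict `<` comparison from the example, which means a final full chunk is skipped when the remaining length is exactly equal to the lane count.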
With late inlining disabled for the Vector API and the -XX:+DebugVectorApi flag set, I see the following message:
=== NOT eliminating VectorBox due to uses ===
196 VectorBox === 188 _ 193 _ _ ( 49 27 91 95 195 ) [[ 197 199 200 ]] jdk/incubator/vector/Float256Vector:NotNull:exact * ( java/lang/Object:NotNull *, vectory[8]:{float}, java/lang/Object:NotNull *, memory, memory ) !jvms: BLAS::VecSaxpy @ bci:3
231 CallStaticJava === 1029 85 197 8 9 ( 230 217 11 12 13 14 1 200 29 1 1 12 217 ) [[ 232 ]] # Static uncommon_trap(reason='speculate_class_check' action='maybe_recompile' debug_id='0') void ( int ) C=0.000100 entry=0x00007fc30f97de20 BLAS::VecSaxpy @ bci:16 !jvms: BLAS::VecSaxpy @ bci:16
-------------------------------
Basically, it fails to eliminate the VectorBox (and the corresponding allocations) because it is used by an uncommon_trap.
Interestingly enough, when late inlining is enabled, the VectorBox is eliminated as desired because no equivalent uncommon_trap is inserted. I am unsure of the reason for this mismatch in behavior. The performance difference is about 40x on the test case above when run on AVX2.
Let me know if you have any recommendations or understanding of the issue noted above.
Thanks,
Razvan
-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Friday, November 17, 2017 8:20 AM
To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; 'panama-dev at openjdk.java.net' <panama-dev at openjdk.java.net>
Subject: Re: VectorBox enabling
Another thing I was curious about is delayed inlining of vector ops:
+  } else if (should_delay_vector_inlining(callee, jvms)) {
+    assert(!delayed_forbidden, "delay should be allowed");
+    return CallGenerator::for_late_inline(callee, cg);

+bool Compile::should_delay_vector_inlining(ciMethod* call_method, JVMState* jvms) {
+  return call_method->is_vector_method();
+}
Can you elaborate, please, when does it help?
Best regards,
Vladimir Ivanov
On 11/15/17 4:37 AM, Lupusoru, Razvan A wrote:
> Hi everyone,
>
> VectorBox enabling is now mostly complete and appears to be functional. VectorBox can generate objects for all supported Vector types that have some intrinsic method. This includes GenericMask (subject to a limitation noted below). Additionally, VectorBox nodes can be removed along with their allocations in cases where the objects do not need to be created. I have tested BLAS (saxpy, sdot) and the Sepia demo used at JavaOne, and performance has not regressed.
>
> Please see attached patch and if there are no concerns, I will merge tomorrow.
> http://cr.openjdk.java.net/~rlupusoru/panama/webrev_vectorbox_04/
>
> Note that in the patch you will find some "FIXME" comments related to masks (namely, mask shape and type recovery is not always possible during intrinsification). After this patch, I will look into solving this problem, potentially by having specialized masks for each type and shape combination (as is done for species).
>
> The main limitations remaining with VectorBox are as follows:
>
> - If a VectorBox is used by any non-intrinsified calls, stores to the heap, or runtime calls via deopt, it will generate an object at the original call site. The plan is to move this to the slow path when Vector API object identities can be ignored.
>
> - VectorBox for GenericMask does not set the Species field. This will either be fixed in a follow-up patch or the approach for specialized masks will be employed instead.
>
> Thanks so much!
>
> --Razvan
>