MVT-based vectors: first stab

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Wed Jun 28 16:17:50 UTC 2017


> Nice work.
> 
> I notice for the factory you are chaining a sequence of findWithers (vwithfield), one for each field. I am guessing if you construct via the VCC and then unbox that might cause issues with temporary boxing?

Yes, I deliberately tried to avoid any unnecessary boxing to make the
JIT compiler's life easier. The main question I had when starting the
experiment was whether it is possible to completely eliminate
boxing/buffering in generated code, because the absence of
boxing/buffering operations there is crucial for vectors.
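
To make the goal concrete, here is a hand-scalarized plain-Java analogue
of the quoted sum loop. It is only a sketch of the code shape one would
hope the JIT produces, not actual MVT code: the class name
ScalarizedSumSketch and the interleaved long[] layout (standing in for a
flattened QLong2[]) are assumptions made for illustration.

```java
final class ScalarizedSumSketch {
    // Assumed layout: lo/hi pairs stored back to back, [lo0, hi0, lo1, hi1, ...],
    // standing in for a flattened array of two-lane vectors.
    static long sum(long[] va) {
        // The accumulator lives in two locals; nothing is allocated in the loop.
        long lo = 0L;
        long hi = 0L;
        for (int i = 0; i + 1 < va.length; i += 2) {
            lo += va[i];     // lane 0
            hi += va[i + 1]; // lane 1
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        System.out.println(sum(new long[] {1, 2, 3, 4}));
    }
}
```

Any boxing or buffering of the accumulator inside the loop would replace
the two locals above with a heap object per iteration.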

> I think it would be useful to add that functionality to ValueType (with the caveat as with findWither that the VCC constructor is bypassed, which works fine for “vanilla” VCCs that just put fields and have no constraints). If necessary we could spin specific byte code using the builder to reduce the LF and invocation ceremony.

Yes, it looks like a good candidate for inclusion. The method handle
chain is quite complex, so it'll definitely benefit from specialized
bytecode:
(vdefault; vwithfield DVT.f1; ...; vwithfield DVT.fn; vreturn;)
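
For readers without an MVT build, the shape of such a wither-chain
factory can be approximated with ordinary method handles. This is a
sketch under stated assumptions, not the ValueType API: Long2 here is a
regular final class, and DEFAULT, withLo, withHi, and make are
hypothetical names standing in for vdefault and the per-field withers.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class WitherChainSketch {
    // Hypothetical stand-in for the Long2 value-capable class.
    static final class Long2 {
        static final Long2 DEFAULT = new Long2(0L, 0L); // plays the role of vdefault
        final long lo, hi;
        private Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
        Long2 withLo(long v) { return new Long2(v, hi); } // analogue of vwithfield lo
        Long2 withHi(long v) { return new Long2(lo, v); } // analogue of vwithfield hi
    }

    // (long lo, long hi) -> Long2, built as withHi(withLo(DEFAULT, lo), hi).
    static final MethodHandle FACTORY;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType witherType = MethodType.methodType(Long2.class, long.class);
            MethodHandle withLo = lookup.findVirtual(Long2.class, "withLo", witherType);
            MethodHandle withHi = lookup.findVirtual(Long2.class, "withHi", witherType);
            // Bind the default instance as the receiver: (long) -> Long2.
            MethodHandle fromDefault = MethodHandles.insertArguments(withLo, 0, Long2.DEFAULT);
            // Feed that result into withHi's receiver: (long, long) -> Long2.
            FACTORY = MethodHandles.collectArguments(withHi, 0, fromDefault);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static Long2 make(long lo, long hi) {
        try {
            return (Long2) FACTORY.invokeExact(lo, hi);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Long2 v = make(1L, 2L);
        System.out.println(v.lo + " " + v.hi);
    }
}
```

As with findWither, the chain never goes through a public constructor,
so it only suits classes whose fields carry no extra invariants.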

Best regards,
Vladimir Ivanov

> 
> Paul.
> 
> 
> 
>> On 28 Jun 2017, at 06:17, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>>
>> Hi,
>>
>> I started playing with MVT-based vectors recently and want to share my experience so far.
>>
>> For the first step, I decided to avoid specific treatment of super-longs (Long2/4/8) in the JVM we have in Panama and just mark them as VCCs for now:
>>   @DeriveValueType
>>   final class Long2 { public final long lo, hi; }
>>
>> As a test case I chose sum of array elements:
>>
>>   long sum(QLong2[] va) {
>>     QLong2 v = QLong2.default; // (0,0)
>>     for (int i = 0; i < va.length; i++) {
>>       QLong2 vi = va[i];
>>       v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>     }
>>     return v.lo + v.hi;
>>   }
>>
>> Here are the changes:
>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>
>> SUMMARY & RESULTS
>>
>> I hit some bugs along the way and had to extend the code for Q-types in some places to get decent code shape generated. (I'll shepherd the fixes in the repo separately.) I ended up with the following generated code:
>>
>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>
>> Unfortunately, there are still 2 allocations left in the generated code (out of 5 initially). C2 can't scalarize the accumulator (v) inside the loop and has to store the updated components (lo & hi) into a newly allocated heap buffer on every iteration over the array (generated code [1], C2 EA output [2]).
>>
>>
>> GORY DETAILS
>>
>>   hotspot/src/share/vm/opto/type.cpp
>>
>>     * Type::get_typeflow_type() doesn't distinguish between TypeValueTypePtr & TypeValueType
>>
>>
>>   jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>
>>     * introduced some LF intrinsics, e.g. LOOP and SELECT_ALTERNATIVE;
>>
>>     * had to work around control flow-related bugs in the bytecode library (jdk.experimental.bytecode): broken JVM state after goto_() (see FIXME marks in the code);
>>
>>     * fixed 2-slot types support: wrong local indexes were used
>>
>>
>>   jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>
>>     * MHI.unboxResultHandle misses the unboxing step for Q-types
>>
>>
>>   jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>   jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>
>>     * had to add a specialization for arraylength on a Q-typed array, because C2 can't handle arraylength Object[] when a Q[] instance is on the stack;
>>
>>   jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java
>>
>>     * changes required to work around the goto_() bug
>>
>>     * fixed CP patching support
>>
>>
>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>
>>     * fixed CP patching support
>>
>>
>>   hotspot/src/share/vm/memory/metachunk.hpp
>>   hotspot/src/share/vm/oops/method.hpp
>>     Fix optimized JVM build.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>
>> ;; B7: # B8 <- B9  top-of-loop
>>
>> mov    %r9,0x10(%rsp)
>> mov    0x8(%rsp),%r11d
>>
>> ;; B8: # B16 B9 <- B6 B7   Loop: B8-B7 inner
>>
>> mov    %r11d,0x8(%rsp)
>> movabs $0x7c0060420,%rsi  ;   {metadata('valhalla/vector/Long2$Value')}
>> ...
>> callq  0x0000000117579820 ;   {runtime_call _new_instance_Java}
>>
>> ;; B9: # B7 B10 <- B8
>>
>> mov    %rax,%r9
>> mov    0x10(%rsp),%r11
>> mov    0x10(%r11),%r10
>> mov    0x18(%r11),%r11
>> movslq %ebp,%r8
>> shl    $0x4,%r8
>> mov    (%rsp),%rcx
>> mov    0x18(%rcx,%r8,1),%rax
>> mov    0x10(%rcx,%r8,1),%r8  ;*vaload
>> add    %r11,%rax          ;*ladd
>> mov    %rax,0x18(%r9)
>> add    %r10,%r8           ;*ladd
>> mov    %r8,0x10(%r9)
>> inc    %ebp               ;*iadd
>> cmp    0x8(%rsp),%ebp
>> jl     <B7>               ;*if_icmpge
>>
>> [2] Pruned EA output (full output in the log)
>> ======== Connection graph for  valhalla.vector.VectorTest::sumArrayL2
>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F 1579F [ 1259 1264 532 1525 ]]   1247	Allocate
>> LocalVar [ 1247P [ 1264 ]]   1259	Proj ... #5
>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]]   1264	CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]]   532	Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]]   1525	Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>
>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]]   1136	Allocate
>> LocalVar [ 1136P [ 1153 ]]   1148	Proj ... #5
>> LocalVar [ 1148 1136P [ 1154b ]]   1153	CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>
>> JavaObject NoEscape(NoEscape) [ [ 1053 ]]   1041	Allocate
>> LocalVar [ 1041P [ ]]   1053	Proj ... #5
>>
>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]]   633	Allocate
>> LocalVar [ 633P [ 650 ]]   645	Proj ... #5
>> LocalVar [ 645 633P [ 653b 651b ]]   650	CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>
>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324 532 1525 ]]   307	Allocate
>> LocalVar [ 307P [ 324 ]]   319	Proj ... #5
>> LocalVar [ 319 307P [ 532 1525 ]]   324	CheckCastPP  ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]]   532	Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]]   1525	Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>
>> Scalar  1153	CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> ++++ Eliminated: 1136 Allocate
>> Scalar  1041	Allocate ...
>> ++++ Eliminated: 1041 Allocate
>> Scalar  650	CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>> ++++ Eliminated: 633 Allocate
> 

