MVT-based vectors: first stab
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Jun 28 14:08:13 UTC 2017
Great work Vlad; I have two questions before I dive into more details:
* why did you need to handle arrayLength() specially? Since an array is
a reference, shouldn't it work already?
* Can you explain to me exactly what you were trying to achieve with
state.push/state.pop in the intrinsic generation? Typically the state is
updated through opcodes - so, assuming you are generating bytecodes that
make sense, you should not need this - unless there are bugs in the code
builder (which is possible!)
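
To make the second question concrete, here is a minimal sketch of what "state updated through opcodes" means, and of why the position right after a goto is delicate. Everything in it (ToyStateTracker, lconst_0, ladd, goto_, label) is a hypothetical toy model written for illustration; it is not the jdk.experimental.bytecode API and does not reproduce its behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; this is NOT the jdk.experimental.bytecode API.
final class ToyStateTracker {
    // Modeled operand stack of type descriptors; the top is the last element.
    private List<String> stack = new ArrayList<>();
    // False while the current position is unreachable by fall-through.
    private boolean alive = true;
    // Stack state recorded for each label the first time it is seen.
    private final Map<String, List<String>> stateAtLabel = new HashMap<>();

    ToyStateTracker lconst_0() { requireAlive(); stack.add("J"); return this; }

    ToyStateTracker ladd() {
        requireAlive();
        pop("J");
        pop("J");
        stack.add("J");
        return this;
    }

    // An unconditional jump: record the state for the target, then mark
    // the following position as unreachable.
    ToyStateTracker goto_(String label) {
        requireAlive();
        merge(label);
        alive = false;
        stack = new ArrayList<>();
        return this;
    }

    // Binding a label: merge on fall-through, or restore the recorded
    // state if the position was unreachable.
    ToyStateTracker label(String label) {
        if (alive) {
            merge(label);
        } else {
            List<String> saved = stateAtLabel.get(label);
            if (saved == null) {
                throw new IllegalStateException("no recorded state for label " + label);
            }
            stack = new ArrayList<>(saved);
            alive = true;
        }
        return this;
    }

    boolean isAlive() { return alive; }

    List<String> stack() { return new ArrayList<>(stack); }

    private void merge(String label) {
        List<String> saved = stateAtLabel.putIfAbsent(label, new ArrayList<>(stack));
        if (saved != null && !saved.equals(stack)) {
            throw new IllegalStateException("inconsistent stack at label " + label);
        }
    }

    private void pop(String expected) {
        if (stack.isEmpty() || !stack.remove(stack.size() - 1).equals(expected)) {
            throw new IllegalStateException("expected " + expected + " on top of the stack");
        }
    }

    private void requireAlive() {
        if (!alive) throw new IllegalStateException("emitting into unreachable code");
    }

    public static void main(String[] args) {
        ToyStateTracker t = new ToyStateTracker();
        t.lconst_0().lconst_0().ladd().goto_("exit");
        System.out.println("alive after goto_: " + t.isAlive());
        t.label("exit");
        System.out.println("stack at exit: " + t.stack());
    }
}
```

In a model like this the caller never pushes or pops by hand: the state after goto_ is restored when the target label is bound. If the real builder does not do the equivalent, manual state.push/state.pop would be one way to paper over it.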
Maurizio
On 28/06/17 14:17, Vladimir Ivanov wrote:
> Hi,
>
> I started playing with MVT-based vectors recently and want to share my
> experience so far.
>
> For the first step, I decided to avoid specific treatment of
> super-longs (Long2/4/8) in the JVM we have in Panama and just mark
> them as VCCs for now:
> @DeriveValueType
> final class Long2 { public final long lo, hi; }
>
> As a test case I chose sum of array elements:
>
> long sum(QLong2[] va) {
>     QLong2 v = QLong2.default; // (0,0)
>     for (int i = 0; i < va.length; i++) {
>         QLong2 vi = va[i];
>         v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>     }
>     return v.lo + v.hi;
> }
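
For readers without an MVT build, the kernel above can be approximated in plain Java. This is only a sketch: it assumes an ordinary final class standing in for the Q-type, so QLong2.default becomes new Long2(0, 0) and each iteration explicitly allocates a new accumulator. The class name BoxedSum is made up for the example.

```java
// Plain-Java stand-in for the MVT kernel; Long2 is an ordinary class here,
// not a value type, so every "new" is a real heap allocation unless the JIT
// removes it.
final class BoxedSum {
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    static long sum(Long2[] va) {
        Long2 v = new Long2(0, 0);          // stands in for QLong2.default
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4) };
        System.out.println(sum(va));
    }
}
```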
>
> Here are the changes:
> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>
> SUMMARY & RESULTS
>
> I hit some bugs along the way and had to extend the code for Q-types
> in some places to get decent code shape generated. (I'll shepherd the
> fixes in the repo separately.) I ended up with the following generated
> code:
>
> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>
> Unfortunately, there are still 2 allocations left in the generated
> code (out of 5 initially). C2 can't scalarize the accumulator (v)
> inside the loop and has to store the updated components (lo & hi) in
> a freshly allocated heap buffer on every iteration over the array
> (generated code [1], C2 EA output [2]).
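
As a point of comparison, here is a hand-written sketch of the loop shape that full scalarization of the accumulator would correspond to: the two components live in long locals and nothing is allocated per iteration. It uses a plain final class as a stand-in for the Q-type, and the names ScalarizedSum and Long2 are assumptions made for the example, not code from the webrev.

```java
// Manually scalarized version of the reduction: the accumulator is split
// into two long locals instead of being held in a Long2 instance.
final class ScalarizedSum {
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    static long sum(Long2[] va) {
        long lo = 0, hi = 0;                // scalar replacement of v
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            lo += vi.lo;
            hi += vi.hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4) };
        System.out.println(sum(va));
    }
}
```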
>
>
> GORY DETAILS
>
> hotspot/src/share/vm/opto/type.cpp
>
> * Type::get_typeflow_type() doesn't distinguish between
> TypeValueTypePtr & TypeValueType
>
>
> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>
> * introduced some LF intrinsics, e.g. LOOP and SELECT_ALTERNATIVE;
>
> * had to work around control-flow-related bugs in the bytecode
> library (jdk.experimental.bytecode): broken JVM state after goto_()
> (see FIXME marks in the code);
>
> * fixed 2-slot types support: wrong local indexes were used
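
On the 2-slot point: long and double each occupy two consecutive local variable slots, so a parameter's local index is not simply its position in the signature. The sketch below shows that arithmetic using java.lang.invoke.MethodType; the class LocalSlots and its methods are hypothetical helpers for illustration, they are not taken from LambdaFormBuilder, and the sketch assumes a static method (no receiver in slot 0).

```java
import java.lang.invoke.MethodType;

// Hypothetical helper: computes the first local slot of each parameter,
// assuming a static method so that slot 0 is the first parameter.
final class LocalSlots {
    // long and double take two slots; every other type takes one.
    static int slotSize(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    static int[] parameterSlots(MethodType mt) {
        int[] slots = new int[mt.parameterCount()];
        int next = 0;
        for (int i = 0; i < slots.length; i++) {
            slots[i] = next;
            next += slotSize(mt.parameterType(i));
        }
        return slots;
    }

    public static void main(String[] args) {
        MethodType mt = MethodType.methodType(
                long.class, int.class, long.class, double.class, Object.class);
        // Parameter positions are 0,1,2,3 but the slots are 0,1,3,5.
        System.out.println(java.util.Arrays.toString(parameterSlots(mt)));
    }
}
```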
>
>
>
> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>
> * MHI.unboxResultHandle misses the unboxing step for Q-types
>
>
> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>
> * had to add a specialization for arraylength on a Q-typed array,
> because C2 can't handle arraylength Object[] when a Q[] instance is
> on the stack;
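
For context, the standard java.lang.invoke API (Java 9 and later) already produces array-length handles typed per array class, which is the kind of specialization being described. The sketch below only shows that standard API for a reference array and a primitive array; the class name ArrayLengthDemo is made up, and nothing here reproduces the Q-typed specialization from the webrev.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Sketch using the standard MethodHandles.arrayLength (Java 9+): each handle
// is typed for one array class, e.g. (Object[])int or (long[])int.
final class ArrayLengthDemo {
    static final MethodHandle OBJ_LEN = MethodHandles.arrayLength(Object[].class);
    static final MethodHandle LONG_LEN = MethodHandles.arrayLength(long[].class);

    static int length(Object[] array) {
        try {
            return (int) OBJ_LEN.invokeExact(array);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    static int length(long[] array) {
        try {
            return (int) LONG_LEN.invokeExact(array);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(OBJ_LEN.type());   // handle type for Object[]
        System.out.println(LONG_LEN.type());  // handle type for long[]
        System.out.println(length(new String[3]));
        System.out.println(length(new long[5]));
    }
}
```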
>
>
> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>
>
> * changes required to work around the goto_() bug
>
> * fixed CP patching support
>
>
> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>
>
> * fixed CP patching support
>
>
> hotspot/src/share/vm/memory/metachunk.hpp
> hotspot/src/share/vm/oops/method.hpp
>
> * fixed the optimized JVM build
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>
> ;; B7: # B8 <- B9 top-of-loop
>
> mov %r9,0x10(%rsp)
> mov 0x8(%rsp),%r11d
>
> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>
> mov %r11d,0x8(%rsp)
> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
> ...
> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>
> ;; B9: # B7 B10 <- B8
>
> mov %rax,%r9
> mov 0x10(%rsp),%r11
> mov 0x10(%r11),%r10
> mov 0x18(%r11),%r11
> movslq %ebp,%r8
> shl $0x4,%r8
> mov (%rsp),%rcx
> mov 0x18(%rcx,%r8,1),%rax
> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
> add %r11,%rax ;*ladd
> mov %rax,0x18(%r9)
> add %r10,%r8 ;*ladd
> mov %r8,0x10(%r9)
> inc %ebp ;*iadd
> cmp 0x8(%rsp),%ebp
> jl <B7> ;*if_icmpge
>
> [2] Pruned EA output (full output in the log)
> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F 1579F
> [ 1259 1264 532 1525 ]] 1247 Allocate
> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264 CheckCastPP ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
>
> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ... #valuetype*
> valhalla/vector/Long2$Value:NotNull
>
> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>
> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
>
> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324
> 532 1525 ]] 307 Allocate
> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
> #valuetype* valhalla/vector/Long2$Value:NotNull
>
> Scalar 1153 CheckCastPP ... #valuetype*
> valhalla/vector/Long2$Value:NotNull
> ++++ Eliminated: 1136 Allocate
> Scalar 1041 Allocate ...
> ++++ Eliminated: 1041 Allocate
> Scalar 650 CheckCastPP ... #valuetype*
> valhalla/vector/Long2$Value:NotNull
> ++++ Eliminated: 633 Allocate