MVT-based vectors: first stab

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Jun 28 15:00:31 UTC 2017



On 28/06/17 15:36, Vladimir Ivanov wrote:
>
>
> On 6/28/17 5:08 PM, Maurizio Cimadamore wrote:
>> Great work Vlad; I have two questions before I dive into more detail
>>
>> * why did you need to handle arrayLength() specially? Since an array 
>> is a reference, shouldn't it work already?
>
> It's a workaround for C2 crashes I observed when working with the 
> arraylength method handle on Q-typed arrays. The problem for the 
> compiler is that the implementation uses 
> ArrayAccessor.lengthL(Object[]), so at the bytecode level there's a 
> conflict between declared & actual type (DVT[] doesn't extend Object[]).
>
> IMO the proper fix would be to have "erased" version for such cases 
> (or replace all primitive array specializations with a single "erased" 
> version):
>
>   static int lengthL(java.lang.Object);
>          0: aload_0
>          1: arraylength
>          2: ireturn
>
> But it can't be expressed in Java language and requires bytecode 
> generation.
OK - in other words, we need to specialize array length, because of the 
missing subtyping edge. Ugh.
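
As a rough analogy in plain Java (not MVT code): primitive arrays have the same missing edge, since int[] is not a subtype of Object[]. The sketch below assumes nothing beyond java.lang.reflect.Array; the class and method names are made up for illustration and only mirror the lengthL naming used above. An Object-typed ("erased") entry point accepts both kinds of arrays, while the Object[]-typed one accepts only reference arrays.

```java
import java.lang.reflect.Array;

// Illustration only: hypothetical names, not the ArrayAccessor implementation.
class ArrayLengthErasure {
    // Typed like ArrayAccessor.lengthL(Object[]): only reference arrays fit.
    static int lengthL(Object[] a) {
        return a.length;
    }

    // "Erased" variant: takes any array as Object and asks reflection for the length.
    static int lengthErased(Object a) {
        return Array.getLength(a);
    }

    public static void main(String[] args) {
        String[] refs = {"a", "b", "c"};
        int[] prims = {1, 2, 3, 4};

        System.out.println(lengthL(refs));       // String[] is a subtype of Object[]
        System.out.println(lengthErased(refs));
        System.out.println(lengthErased(prims)); // works although int[] is not an Object[]

        // The missing subtyping edge, checked at run time:
        Object o = prims;
        System.out.println(o instanceof Object[]);
    }
}
```

Calling lengthL(prims) does not compile, which is the source-level counterpart of the declared vs. actual type conflict described above.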
>
>> * Can you explain exactly what you were trying to achieve with 
>> state.push/state.pop in the intrinsic generation? Typically the state 
>> is updated through opcodes - so, assuming you are generating 
>> bytecodes that make sense, you should not need this - unless there 
>> are bugs in the code builder (which is possible!)
>
> The following sequence usually hits the bug in the code builder:
>   builder.goto_("l1");
>   builder.label("l2");
>
> goto_() doesn't wipe the state, but the state at l2 doesn't relate to 
> the state when jumping to l1. It usually manifests as stack depth or 
> type mismatches when trying to merge the states in builder.label().
I see - that's a well-known issue - the builder currently doesn't handle 
reachability. In your case, the builder should realize that code after 
'goto' is dead, so it should just try to merge an empty state with the 
state of the code that is jumping to "l2" (of course this assumes that 
there's no other subsequent backward jump to "l2" otherwise the simple 
stackmap analysis implemented here would fail and you'd need a fixpoint 
computation à la ASM).
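
To make the idea concrete, here is a minimal sketch of reachability-aware state tracking. It is not the jdk.experimental.bytecode API: StackStateTracker and all of its methods are hypothetical, types are plain strings, and ifJump does not pop its operands, to keep things short. After an unconditional jump the current state becomes "unreachable", and a following label adopts the state recorded by an earlier forward jump instead of merging with stale state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the real code builder.
class StackStateTracker {
    // Operand stack types at the current position; null means "dead code".
    private List<String> current = new ArrayList<>();
    // Stack state known for each label, from jumps or from fall-through.
    private final Map<String, List<String>> states = new HashMap<>();

    boolean reachable() { return current != null; }
    int depth() { requireReachable(); return current.size(); }

    void push(String type) { requireReachable(); current.add(type); }
    void pop() { requireReachable(); current.remove(current.size() - 1); }

    // Conditional jump: record the state for the target, fall-through stays live.
    void ifJump(String label) { requireReachable(); recordJump(label); }

    // Unconditional jump: record the state, then mark what follows as dead.
    void goto_(String label) { requireReachable(); recordJump(label); current = null; }

    void label(String name) {
        List<String> seen = states.get(name);
        if (current == null) {
            // Dead code before the label: take the state of the jump(s) to it.
            if (seen == null) {
                throw new IllegalStateException("no known state at " + name
                        + " (only backward jumps? needs a fixpoint analysis)");
            }
            current = new ArrayList<>(seen);
        } else if (seen == null) {
            states.put(name, new ArrayList<>(current)); // for later backward jumps
        } else if (!seen.equals(current)) {
            throw new IllegalStateException("stack mismatch at " + name);
        }
    }

    private void recordJump(String label) {
        List<String> seen = states.get(label);
        if (seen == null) {
            states.put(label, new ArrayList<>(current));
        } else if (!seen.equals(current)) {
            throw new IllegalStateException("stack mismatch jumping to " + label);
        }
    }

    private void requireReachable() {
        if (current == null) throw new IllegalStateException("unreachable code");
    }

    public static void main(String[] args) {
        StackStateTracker t = new StackStateTracker();
        t.push("int");
        t.ifJump("l2");   // l2 expects [int]
        t.push("long");
        t.goto_("l1");    // l1 expects [int, long]; code below is dead
        t.label("l2");    // adopts [int] instead of merging with [int, long]
        t.push("long");
        t.label("l1");    // fall-through state matches the recorded one
        System.out.println(t.depth());
    }
}
```

A label that follows dead code and is reached only by a later backward jump is rejected here, which is the case that would need the fixpoint computation.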

Maurizio
>
> Best regards,
> Vladimir Ivanov
>
>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>> Hi,
>>>
>>> I started playing with MVT-based vectors recently and want to share 
>>> my experience so far.
>>>
>>> For the first step, I decided to avoid specific treatment of 
>>> super-longs (Long2/4/8) in the JVM we have in Panama and just mark 
>>> them as VCCs for now:
>>>   @DeriveValueType
>>>   final class Long2 { public final long lo, hi; }
>>>
>>> As a test case I chose sum of array elements:
>>>
>>>   long sum(QLong2[] va) {
>>>     QLong2 v = QLong2.default; // (0,0)
>>>     for (int i = 0; i < va.length; i++) {
>>>       QLong2 vi = va[i];
>>>       v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>     }
>>>     return v.lo + v.hi;
>>>   }
>>>
>>> Here are the changes:
>>>  http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>
>>> SUMMARY & RESULTS
>>>
>>> I hit some bugs along the way and had to extend the code for Q-types 
>>> in some places to get decent code shape generated. (I'll shepherd 
>>> the fixes in the repo separately.) I ended up with the following 
>>> generated code:
>>>
>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log 
>>>
>>>
>>> Unfortunately, there are still 2 allocations left in the generated 
>>> code (out of 5 initially). C2 can't scalarize the accumulator (v) 
>>> inside the loop and has to put updated components (lo & hi) in 
>>> allocated heap buffer on every iteration over the array (generated 
>>> code [1], C2 EA output [2]).
>>>
>>>
>>> GORY DETAILS
>>>
>>>   hotspot/src/share/vm/opto/type.cpp
>>>
>>>     * Type::get_typeflow_type() doesn't distinguish between 
>>> TypeValueTypePtr & TypeValueType
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>
>>>     * introduced some LF intrinsics, e.g. LOOP and SELECT_ALTERNATIVE;
>>>
>>>     * had to work around control flow-related bugs in the bytecode 
>>> library (jdk.experimental.bytecode): broken JVM state after goto_() 
>>> (see FIXME marks in the code);
>>>
>>>     * fixed 2-slot types support: wrong local indexes were used
>>>
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>
>>>     * MHI.unboxResultHandle misses unboxing step for Q-types
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>
>>>     * had to add a specialization for arraylength on Q-typed array, 
>>> because C2 can't handle arraylength Object[] when Q[] instance is on 
>>> stack;
>>>
>>>
>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java: 
>>>
>>>
>>>     * changes required to work around the goto_() bug
>>>
>>>     * fixed CP patching support
>>>
>>>
>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java 
>>>
>>>
>>>     * fixed CP patching support
>>>
>>>
>>>   hotspot/src/share/vm/memory/metachunk.hpp
>>>   hotspot/src/share/vm/oops/method.hpp
>>>     Fix optimized JVM build.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] 
>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log 
>>>
>>>
>>> ;; B7: # B8 <- B9  top-of-loop
>>>
>>> mov    %r9,0x10(%rsp)
>>> mov    0x8(%rsp),%r11d
>>>
>>> ;; B8: # B16 B9 <- B6 B7   Loop: B8-B7 inner
>>>
>>> mov    %r11d,0x8(%rsp)
>>> movabs $0x7c0060420,%rsi  ; {metadata('valhalla/vector/Long2$Value')}
>>> ...
>>> callq  0x0000000117579820 ;   {runtime_call _new_instance_Java}
>>>
>>> ;; B9: # B7 B10 <- B8
>>>
>>> mov    %rax,%r9
>>> mov    0x10(%rsp),%r11
>>> mov    0x10(%r11),%r10
>>> mov    0x18(%r11),%r11
>>> movslq %ebp,%r8
>>> shl    $0x4,%r8
>>> mov    (%rsp),%rcx
>>> mov    0x18(%rcx,%r8,1),%rax
>>> mov    0x10(%rcx,%r8,1),%r8  ;*vaload
>>> add    %r11,%rax          ;*ladd
>>> mov    %rax,0x18(%r9)
>>> add    %r10,%r8           ;*ladd
>>> mov    %r8,0x10(%r9)
>>> inc    %ebp               ;*iadd
>>> cmp    0x8(%rsp),%ebp
>>> jl     <B7>               ;*if_icmpge
>>>
>>> [2] Pruned EA output (full output in the log)
>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F 
>>> 1579F [ 1259 1264 532 1525 ]]   1247    Allocate
>>> LocalVar [ 1247P [ 1264 ]]   1259    Proj ... #5
>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]]   1264 CheckCastPP 
>>> ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]]   532    Phi ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]]   1525    Phi ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]]   1136 Allocate
>>> LocalVar [ 1136P [ 1153 ]]   1148    Proj ... #5
>>> LocalVar [ 1148 1136P [ 1154b ]]   1153    CheckCastPP ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]]   1041    Allocate
>>> LocalVar [ 1041P [ ]]   1053    Proj ... #5
>>>
>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]]   633 Allocate
>>> LocalVar [ 633P [ 650 ]]   645    Proj ... #5
>>> LocalVar [ 645 633P [ 653b 651b ]]   650    CheckCastPP ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324 
>>> 532 1525 ]]   307    Allocate
>>> LocalVar [ 307P [ 324 ]]   319    Proj ... #5
>>> LocalVar [ 319 307P [ 532 1525 ]]   324    CheckCastPP  ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]]   532    Phi ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]]   1525    Phi ... 
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> Scalar  1153    CheckCastPP ... #valuetype* 
>>> valhalla/vector/Long2$Value:NotNull
>>> ++++ Eliminated: 1136 Allocate
>>> Scalar  1041    Allocate ...
>>> ++++ Eliminated: 1041 Allocate
>>> Scalar  650    CheckCastPP ... #valuetype* 
>>> valhalla/vector/Long2$Value:NotNull
>>> ++++ Eliminated: 633 Allocate
>>
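
For readers without an MVT build, the shape of the sum loop quoted above can be approximated in plain Java. This is only a sketch under stated assumptions: Long2 here is an ordinary final class standing in for the derived value type, and SumSketch is a made-up name. sumBoxed mirrors the quoted loop, allocating a fresh accumulator per iteration; sumScalarized shows by hand the form one would hope C2 reaches when the accumulator is scalarized into its lo and hi components.

```java
// Stand-in for the value-capable class; a plain reference type in this sketch.
final class Long2 {
    final long lo, hi;
    Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
}

class SumSketch {
    // Mirrors the quoted loop: a new accumulator object on every iteration.
    static long sumBoxed(Long2[] va) {
        Long2 v = new Long2(0, 0);
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    // Hand-scalarized accumulator: components live in locals, no allocation in the loop.
    static long sumScalarized(Long2[] va) {
        long lo = 0, hi = 0;
        for (int i = 0; i < va.length; i++) {
            lo += va[i].lo;
            hi += va[i].hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4) };
        System.out.println(sumBoxed(va));
        System.out.println(sumScalarized(va));
    }
}
```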




More information about the valhalla-dev mailing list