MVT-based vectors: first stab
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Jun 28 17:10:25 UTC 2017
Cool! Feel free to include it in your MVT patch, so that we can avoid
granting access to TypedCodeBuilder.state.
Maurizio
On 28/06/17 17:44, Vladimir Ivanov wrote:
> Yes, the patch fixes the bug I hit in GWT & loop intrinsics. Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> On 6/28/17 7:07 PM, Maurizio Cimadamore wrote:
>> Could you please try this patch and tell me if that helps for the
>> liveness trouble?
>>
>> Cheers
>> Maurizio
>>
>>
>> On 28/06/17 16:20, Vladimir Ivanov wrote:
>>>
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 6/28/17 6:00 PM, Maurizio Cimadamore wrote:
>>>>
>>>>
>>>> On 28/06/17 15:36, Vladimir Ivanov wrote:
>>>>>
>>>>>
>>>>> On 6/28/17 5:08 PM, Maurizio Cimadamore wrote:
>>>>>> Great work Vlad; I have two questions before I dive into the details:
>>>>>>
>>>>>> * why did you need to handle arrayLength() specially? Since an
>>>>>> array is a reference, shouldn't it work already?
>>>>>
>>>>> It's a workaround for C2 crashes I observed when working with
>>>>> arraylength method handle on Q-typed arrays. The problem for the
>>>>> compiler is that the implementation uses
>>>>> ArrayAccessor.lengthL(Object[]), so on bytecode level there's a
>>>>> conflict between declared & actual type (DVT[] doesn't extend
>>>>> Object[]).
>>>>>
>>>>> IMO the proper fix would be to have an "erased" version for such
>>>>> cases (or to replace all primitive array specializations with a
>>>>> single "erased" version):
>>>>>
>>>>>   static int lengthL(java.lang.Object);
>>>>>     0: aload_0
>>>>>     1: arraylength
>>>>>     2: ireturn
>>>>>
>>>>> But it can't be expressed in the Java language and requires bytecode
>>>>> generation.
>>>> OK - in other words, we need to specialize array length, because of
>>>> the missing subtyping edge. Ugh.
>>>
>>> Yes. Moreover, I've just checked, and the verifier complains about
>>> non-array types on the stack for the arraylength instruction.
>>>
>>> Caused by: java.lang.VerifyError: Bad type on operand stack in arraylength
>>> Exception Details:
>>>   Location:
>>>     Test.length(Ljava/lang/Object;)I @1: arraylength
>>>   Reason:
>>>     Invalid type: 'java/lang/Object' (current frame, stack[0])
>>>   Current Frame:
>>>     bci: @1
>>>     flags: { }
>>>     locals: { 'java/lang/Object' }
>>>     stack: { 'java/lang/Object' }
>>>
>>> So, it seems longer term the only alternative to specialization is
>>> Unsafe.
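
For ordinary (non-Q-typed) arrays, the two shapes discussed above can be sketched with standard APIs. This is only an illustration under assumptions: the class and method names are hypothetical, it says nothing about how Q-typed arrays behave in MVT, and it is not the code from the webrev. MethodHandles.arrayLength (Java 9+) gives one handle per array class, while java.lang.reflect.Array.getLength accepts any array as Object.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Array;

// Hypothetical sketch class; not part of the MVT patch.
class ArrayLengthSketch {
    // Specialized shape: a handle typed (long[])int, one per array class.
    static int specializedLength(long[] a) {
        try {
            MethodHandle mh = MethodHandles.arrayLength(long[].class);
            return (int) mh.invokeExact(a);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // "Erased" shape: a single entry point that takes Object.
    // Throws IllegalArgumentException if the argument is not an array.
    static int erasedLength(Object array) {
        return Array.getLength(array);
    }

    public static void main(String[] args) {
        System.out.println(specializedLength(new long[3]));
        System.out.println(erasedLength(new String[5]));
    }
}
```

The specialized handle has to be created once per array type, which is the cost of the missing subtyping edge mentioned above; the Object-taking entry point avoids that but is implemented natively rather than as plain bytecode.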
>>>
>>>>>> * Can you explain exactly what you were trying to achieve with
>>>>>> state.push/state.pop in the intrinsic generation? Typically the
>>>>>> state is updated through opcodes - so, assuming you are
>>>>>> generating bytecodes that make sense, you should not need this -
>>>>>> unless there are bugs in the code builder (which is possible!)
>>>>>
>>>>> The following sequence usually hits the bug in the code builder:
>>>>>   builder.goto_("l1");
>>>>>   builder.label("l2");
>>>>>
>>>>> goto_() doesn't wipe the state, but the state at l2 doesn't relate
>>>>> to the state when jumping to l1. It usually manifests as stack
>>>>> depth or type mismatches when trying to merge the states in
>>>>> builder.label().
>>>> I see - that's a well-known issue: the builder currently doesn't
>>>> handle reachability. In your case, the builder should realize that
>>>> the code after 'goto' is dead, so it should just merge an empty
>>>> state with the state of the code that jumps to "l2". Of course,
>>>> this assumes that there's no subsequent backward jump to "l2";
>>>> otherwise the simple stackmap analysis implemented here would fail
>>>> and you'd need some fixed-point calculation à la ASM.
>>>
>>> Yeah, backward jumps cause issues for one-pass solutions.
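
The reachability handling described above can be sketched as a tiny one-pass stack tracker. This is a minimal sketch under assumptions, not the jdk.experimental.bytecode implementation: the class, its methods, and the string-based type names are all hypothetical. After an unconditional jump the current state becomes "unreachable" (null), and a following label adopts the state recorded by an earlier forward jump instead of trying to merge with stale state. Backward jumps to a label that was placed in dead code are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names do not match the real TypedCodeBuilder.
class ReachabilitySketch {
    // Current operand stack; null means the current position is unreachable.
    private List<String> stack = new ArrayList<>();
    // States recorded by forward jumps, keyed by target label.
    private final Map<String, List<String>> pending = new HashMap<>();

    void push(String type) { if (stack != null) stack.add(type); }
    void pop() { if (stack != null) stack.remove(stack.size() - 1); }

    // Unconditional jump: record the state at the target, then mark code dead.
    void gotoLabel(String label) {
        recordJump(label);
        stack = null;
    }

    // Conditional jump (operand popping omitted): fall-through stays alive.
    void ifJump(String label) { recordJump(label); }

    // Place a label: in dead code adopt the incoming state, otherwise check it.
    void label(String label) {
        List<String> incoming = pending.remove(label);
        if (stack == null) {
            stack = (incoming == null) ? null : new ArrayList<>(incoming);
        } else if (incoming != null && !incoming.equals(stack)) {
            throw new IllegalStateException("state mismatch at " + label);
        }
    }

    private void recordJump(String label) {
        if (stack == null) return; // a jump inside dead code contributes nothing
        List<String> prev = pending.putIfAbsent(label, new ArrayList<>(stack));
        if (prev != null && !prev.equals(stack)) {
            throw new IllegalStateException("state mismatch at " + label);
        }
    }

    List<String> state() { return stack; }

    public static void main(String[] args) {
        ReachabilitySketch b = new ReachabilitySketch();
        b.push("I");
        b.ifJump("l2");     // l2 will be entered with [I]
        b.push("J");
        b.gotoLabel("l1");  // l1 will be entered with [I, J]; code is now dead
        b.label("l2");      // adopts [I] rather than merging with [I, J]
        b.push("J");
        b.label("l1");      // fall-through [I, J] matches the recorded state
        System.out.println(b.state());
    }
}
```

Supporting a later backward jump to a label first seen in dead code would need the kind of iterative analysis mentioned above, which a single pass cannot provide.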
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>>>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I started playing with MVT-based vectors recently and want to
>>>>>>> share my experience so far.
>>>>>>>
>>>>>>> For the first step, I decided to avoid specific treatment of
>>>>>>> super-longs (Long2/4/8) in the JVM we have in Panama and just
>>>>>>> mark them as VCCs for now:
>>>>>>>   @DeriveValueType
>>>>>>>   final class Long2 { public final long lo, hi; }
>>>>>>>
>>>>>>> As a test case I chose the sum of array elements:
>>>>>>>
>>>>>>>   long sum(QLong2[] va) {
>>>>>>>     QLong2 v = QLong2.default; // (0,0)
>>>>>>>     for (int i = 0; i < va.length; i++) {
>>>>>>>       QLong2 vi = va[i];
>>>>>>>       v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>>>>>     }
>>>>>>>     return v.lo + v.hi;
>>>>>>>   }
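
The Q-type notation above is not valid standard Java, so here is a plain-Java analogue of the same test case, offered as a sketch under assumptions rather than the actual test from the webrev. Long2 is an ordinary final class here (no @DeriveValueType), and the class and method names are hypothetical. The second method shows by hand what a fully scalarized accumulator would amount to: the two components live in locals and nothing is allocated inside the loop.

```java
// Hypothetical plain-Java analogue of the MVT test case.
class Long2SumSketch {
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // Same shape as the test case: a fresh accumulator object per iteration.
    static long sum(Long2[] va) {
        Long2 v = new Long2(0, 0);
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    // Hand-scalarized version: the accumulator is split into two locals.
    static long sumScalarized(Long2[] va) {
        long lo = 0, hi = 0;
        for (int i = 0; i < va.length; i++) {
            lo += va[i].lo;
            hi += va[i].hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4) };
        System.out.println(sum(va));
        System.out.println(sumScalarized(va));
    }
}
```

Both methods compute the same result; the difference is only in whether the loop-carried accumulator needs a heap object.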
>>>>>>>
>>>>>>> Here are the changes:
>>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>>>>>
>>>>>>>
>>>>>>> SUMMARY & RESULTS
>>>>>>>
>>>>>>> I hit some bugs along the way and had to extend the code for
>>>>>>> Q-types in some places to get decent code shape generated. (I'll
>>>>>>> shepherd the fixes in the repo separately.) I ended up with the
>>>>>>> following generated code:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>>
>>>>>>>
>>>>>>> Unfortunately, there are still 2 allocations left in the
>>>>>>> generated code (out of 5 initially). C2 can't scalarize the
>>>>>>> accumulator (v) inside the loop and has to store the updated
>>>>>>> components (lo & hi) into a heap-allocated buffer on every iteration
>>>>>>> over the array (generated code [1], C2 EA output [2]).
>>>>>>>
>>>>>>>
>>>>>>> GORY DETAILS
>>>>>>>
>>>>>>> hotspot/src/share/vm/opto/type.cpp
>>>>>>>
>>>>>>> * Type::get_typeflow_type() doesn't distinguish between
>>>>>>> TypeValueTypePtr & TypeValueType
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>>>>>
>>>>>>>
>>>>>>> * introduced some LF intrinsics, e.g. LOOP and
>>>>>>> SELECT_ALTERNATIVE;
>>>>>>>
>>>>>>> * had to work around control-flow-related bugs in the
>>>>>>> bytecode library (jdk.experimental.bytecode): broken JVM state
>>>>>>> after goto_() (see FIXME marks in the code);
>>>>>>>
>>>>>>> * fixed 2-slot types support: wrong local indexes were used
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>>>>>
>>>>>>>
>>>>>>> * MHI.unboxResultHandle misses the unboxing step for Q-types
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>>>>>
>>>>>>>
>>>>>>> * had to add a specialization for arraylength on Q-typed
>>>>>>> arrays, because C2 can't handle arraylength on Object[] when a Q[]
>>>>>>> instance is on the stack;
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>>>>>>>
>>>>>>>
>>>>>>> * changes required to work around the goto_() bug
>>>>>>>
>>>>>>> * fixed CP patching support
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>>>>>>
>>>>>>>
>>>>>>> * fixed CP patching support
>>>>>>>
>>>>>>>
>>>>>>> hotspot/src/share/vm/memory/metachunk.hpp
>>>>>>> hotspot/src/share/vm/oops/method.hpp
>>>>>>> Fix optimized JVM build.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>> [1]
>>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>>
>>>>>>>
>>>>>>> ;; B7: # B8 <- B9 top-of-loop
>>>>>>>
>>>>>>> mov %r9,0x10(%rsp)
>>>>>>> mov 0x8(%rsp),%r11d
>>>>>>>
>>>>>>> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>>>>>>>
>>>>>>> mov %r11d,0x8(%rsp)
>>>>>>> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
>>>>>>> ...
>>>>>>> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>>>>>>>
>>>>>>> ;; B9: # B7 B10 <- B8
>>>>>>>
>>>>>>> mov %rax,%r9
>>>>>>> mov 0x10(%rsp),%r11
>>>>>>> mov 0x10(%r11),%r10
>>>>>>> mov 0x18(%r11),%r11
>>>>>>> movslq %ebp,%r8
>>>>>>> shl $0x4,%r8
>>>>>>> mov (%rsp),%rcx
>>>>>>> mov 0x18(%rcx,%r8,1),%rax
>>>>>>> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
>>>>>>> add %r11,%rax ;*ladd
>>>>>>> mov %rax,0x18(%r9)
>>>>>>> add %r10,%r8 ;*ladd
>>>>>>> mov %r8,0x10(%r9)
>>>>>>> inc %ebp ;*iadd
>>>>>>> cmp 0x8(%rsp),%ebp
>>>>>>> jl <B7> ;*if_icmpge
>>>>>>>
>>>>>>> [2] Pruned EA output (full output in the log)
>>>>>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>>>>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F 1579F [ 1259 1264 532 1525 ]] 1247 Allocate
>>>>>>> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
>>>>>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>>
>>>>>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
>>>>>>> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
>>>>>>> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>>
>>>>>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
>>>>>>> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>>>>>>>
>>>>>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
>>>>>>> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
>>>>>>> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>>
>>>>>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324 532 1525 ]] 307 Allocate
>>>>>>> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
>>>>>>> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>>
>>>>>>> Scalar 1153 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> ++++ Eliminated: 1136 Allocate
>>>>>>> Scalar 1041 Allocate ...
>>>>>>> ++++ Eliminated: 1041 Allocate
>>>>>>> Scalar 650 CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>> ++++ Eliminated: 633 Allocate
>>>>>>
>>>>
>>
More information about the valhalla-dev
mailing list