MVT-based vectors: first stab
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Jun 28 16:44:42 UTC 2017
Yes, the patch fixes the bug I hit in GWT & loop intrinsics. Thanks!
Best regards,
Vladimir Ivanov
On 6/28/17 7:07 PM, Maurizio Cimadamore wrote:
> Could you please try this patch and tell me if that helps for the
> liveness trouble?
>
> Cheers
> Maurizio
>
>
> On 28/06/17 16:20, Vladimir Ivanov wrote:
>>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 6/28/17 6:00 PM, Maurizio Cimadamore wrote:
>>>
>>>
>>> On 28/06/17 15:36, Vladimir Ivanov wrote:
>>>>
>>>>
>>>> On 6/28/17 5:08 PM, Maurizio Cimadamore wrote:
>>>>> Great work Vlad; I have two questions before I dive into more detail:
>>>>>
>>>>> * why did you need to handle arrayLength() specially? Since an
>>>>> array is a reference, shouldn't it work already?
>>>>
>>>> It's a workaround for C2 crashes I observed when working with
>>>> arraylength method handle on Q-typed arrays. The problem for the
>>>> compiler is that the implementation uses
>>>> ArrayAccessor.lengthL(Object[]), so on bytecode level there's a
>>>> conflict between declared & actual type (DVT[] doesn't extend
>>>> Object[]).
>>>>
>>>> IMO the proper fix would be to have "erased" version for such cases
>>>> (or replace all primitive array specializations with a single
>>>> "erased" version):
>>>>
>>>>   static int lengthL(java.lang.Object);
>>>>     0: aload_0
>>>>     1: arraylength
>>>>     2: ireturn
>>>>
>>>> But it can't be expressed in Java language and requires bytecode
>>>> generation.
>>> OK - in other words, we need to specialize array length, because of
>>> the missing subtyping edge. Ugh.
>>
>> Yes. Moreover, I've just checked and the verifier complains about
>> non-array types on stack for arraylength instruction.
>>
>> Caused by: java.lang.VerifyError: Bad type on operand stack in arraylength
>> Exception Details:
>>   Location:
>>     Test.length(Ljava/lang/Object;)I @1: arraylength
>>   Reason:
>>     Invalid type: 'java/lang/Object' (current frame, stack[0])
>>   Current Frame:
>>     bci: @1
>>     flags: { }
>>     locals: { 'java/lang/Object' }
>>     stack: { 'java/lang/Object' }
>>
>> So, it seems longer term the only alternative to specialization is
>> Unsafe.
>>
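A plain-Java analogue of the subtyping problem discussed above, offered as a sketch and not as the MVT implementation: primitive arrays such as int[] do not extend Object[] either, so a helper declared on Object[] cannot accept them. The class name and the helpers erasedLength and lengthViaHandle are made up for this illustration; lengthL only mirrors the shape of the ArrayAccessor method named in the thread. java.lang.reflect.Array.getLength and MethodHandles.arrayLength (Java 9+) are standard JDK APIs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Array;

class ErasedLengthSketch {
    // Shaped like the ArrayAccessor.lengthL(Object[]) helper named in the thread:
    // the declared parameter type only admits reference arrays.
    static int lengthL(Object[] a) {
        return a.length;
    }

    // An "erased" length that accepts any array. Java source cannot apply
    // .length to an Object-typed value, so this goes through reflection.
    static int erasedLength(Object array) {
        return Array.getLength(array);
    }

    // A length handle specialized to a single array class, here int[].
    static int lengthViaHandle(int[] a) {
        MethodHandle h = MethodHandles.arrayLength(int[].class); // type: (int[])int
        try {
            return (int) h.invokeExact(a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[4];
        String[] strings = new String[2];
        System.out.println(lengthL(strings));
        // lengthL(ints) does not compile: int[] is not a subtype of Object[].
        System.out.println(erasedLength(ints));
        System.out.println(erasedLength(strings));
        System.out.println(lengthViaHandle(ints));
    }
}
```

The reflective version trades the missing subtyping edge for a runtime type check, which is why the thread considers per-type specialization or Unsafe instead.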
>>>>> * Can you explain to me exactly what you were trying to achieve with
>>>>> state.push/state.pop in the intrinsic generation? Typically the
>>>>> state is updated through opcodes - so, assuming you are generating
>>>>> bytecodes that make sense, you should not need this - unless there
>>>>> are bugs in the code builder (which is possible!)
>>>>
>>>> The following sequence usually hits the bug in the code builder:
>>>>   builder.goto_("l1");
>>>>   builder.label("l2");
>>>>
>>>> goto_() doesn't wipe the state, but the state at l2 doesn't relate
>>>> to the state when jumping to l1. It usually manifests as stack depth
>>>> or type mismatches when trying to merge the states in builder.label().
>>> I see - that's a well-known issue - the builder currently doesn't
>>> handle reachability. In your case, the builder should realize that
>>> code after 'goto' is dead, so it should just try to merge an empty
>>> state with the state of the code that is jumping to "l2" (of course
>>> this assumes that there's no other subsequent backward jump to "l2"
>>> otherwise the simple stackmap analysis implemented here would fail
>>> and you'll need some fix point calculation a la ASM).
>>
>> Yeah, backward jumps cause issues for one-pass solutions.
>>
>> Best regards,
>> Vladimir Ivanov
>>
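The reachability handling described above can be sketched with a toy one-pass tracker. This is a hypothetical model that tracks only stack depth; it is not the jdk.experimental.bytecode API, and every name in it (ReachabilitySketch, ifJump, and so on) is an assumption made for the illustration. After an unconditional jump the state is marked dead, and a label reached from dead code adopts the state recorded by earlier forward jumps instead of merging with the stale one.

```java
import java.util.HashMap;
import java.util.Map;

class ReachabilitySketch {
    // Stack depth recorded for each not-yet-bound label by forward jumps.
    private final Map<String, Integer> pending = new HashMap<>();
    private int depth = 0;
    private boolean reachable = true;

    int depth() { return depth; }
    boolean reachable() { return reachable; }

    void push() { if (reachable) depth++; }

    void pop() {
        if (!reachable) return; // dead code does not affect the state
        if (depth == 0) throw new IllegalStateException("stack underflow");
        depth--;
    }

    // Conditional jump: consumes one operand, then falls through.
    void ifJump(String label) {
        pop();
        if (reachable) record(label);
    }

    // Unconditional jump: everything up to the next label is dead.
    void goto_(String label) {
        if (reachable) record(label);
        reachable = false;
    }

    void label(String label) {
        Integer incoming = pending.remove(label);
        if (!reachable) {
            if (incoming == null) {
                // Only a later backward jump can reach this label.
                throw new UnsupportedOperationException(
                    label + " needs a fix-point pass, not a single pass");
            }
            depth = incoming; // adopt the jump state, ignore the dead one
            reachable = true;
        } else if (incoming != null && incoming != depth) {
            throw new IllegalStateException("stack depth mismatch at " + label);
        }
    }

    private void record(String label) {
        Integer previous = pending.putIfAbsent(label, depth);
        if (previous != null && previous != depth) {
            throw new IllegalStateException("conflicting jumps to " + label);
        }
    }

    public static void main(String[] args) {
        ReachabilitySketch b = new ReachabilitySketch();
        b.push();           // condition value
        b.ifJump("l2");     // records depth 0 for l2
        b.push(); b.push(); // then-branch leaves two values
        b.goto_("l1");      // records depth 2 for l1; what follows is dead
        b.label("l2");      // dead fall-through: adopt recorded depth 0
        b.push(); b.push(); // else-branch also leaves two values
        b.label("l1");      // live fall-through agrees with recorded depth 2
        System.out.println(b.depth());
    }
}
```

A label that is bound in dead code with no recorded forward jump is rejected here, which is the backward-jump case the thread says needs a fix-point computation.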
>>>>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I started playing with MVT-based vectors recently and want to
>>>>>> share my experience so far.
>>>>>>
>>>>>> For the first step, I decided to avoid specific treatment of
>>>>>> super-longs (Long2/4/8) in the JVM we have in Panama and just mark
>>>>>> them as VCCs for now:
>>>>>> @DeriveValueType
>>>>>> final class Long2 { public final long lo, hi; }
>>>>>>
>>>>>> As a test case I chose sum of array elements:
>>>>>>
>>>>>>   long sum(QLong2[] va) {
>>>>>>     QLong2 v = QLong2.default; // (0,0)
>>>>>>     for (int i = 0; i < va.length; i++) {
>>>>>>       QLong2 vi = va[i];
>>>>>>       v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>>>>     }
>>>>>>     return v.lo + v.hi;
>>>>>>   }
>>>>>>
>>>>>> Here are the changes:
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>>>>
>>>>>> SUMMARY & RESULTS
>>>>>>
>>>>>> I hit some bugs along the way and had to extend the code for
>>>>>> Q-types in some places to get decent code shape generated. (I'll
>>>>>> shepherd the fixes in the repo separately.) I ended up with the
>>>>>> following generated code:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>
>>>>>>
>>>>>> Unfortunately, there are still 2 allocations left in the generated
>>>>>> code (out of 5 initially). C2 can't scalarize the accumulator (v)
>>>>>> inside the loop and has to store the updated components (lo & hi) in
>>>>>> an allocated heap buffer on every iteration over the array
>>>>>> (generated code [1], C2 EA output [2]).
>>>>>>
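To make the scalarization goal concrete, here is a source-level sketch in ordinary Java. It assumes a plain identity class Long2 in place of the value-capable class from the post, and the names ScalarizedSumSketch, sumBoxed and sumScalarized are invented for the illustration. The second method is only a hand-written picture of what a fully scalarized accumulator would amount to; it makes no claim about the code C2 actually emits.

```java
class ScalarizedSumSketch {
    // Ordinary identity class standing in for the value-capable Long2.
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // Same shape as sum(QLong2[]) in the post: the accumulator is replaced
    // by a newly constructed object on every iteration.
    static long sumBoxed(Long2[] va) {
        Long2 v = new Long2(0, 0);
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    // Hand-scalarized variant: the accumulator components live in two
    // locals, so the loop body needs no allocation.
    static long sumScalarized(Long2[] va) {
        long lo = 0, hi = 0;
        for (int i = 0; i < va.length; i++) {
            lo += va[i].lo;
            hi += va[i].hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4) };
        System.out.println(sumBoxed(va));
        System.out.println(sumScalarized(va));
    }
}
```

Both methods compute the same result; the difference is only in whether the loop-carried state is an object or a pair of longs.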
>>>>>>
>>>>>> GORY DETAILS
>>>>>>
>>>>>> hotspot/src/share/vm/opto/type.cpp
>>>>>>
>>>>>> * Type::get_typeflow_type() doesn't distinguish between
>>>>>> TypeValueTypePtr & TypeValueType
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>>>>
>>>>>>
>>>>>> * introduced some LF intrinsics, e.g. LOOP and
>>>>>> SELECT_ALTERNATIVE;
>>>>>>
>>>>>> * had to work around control-flow-related bugs in the bytecode
>>>>>> library (jdk.experimental.bytecode): broken JVM state after
>>>>>> goto_() (see FIXME marks in the code);
>>>>>>
>>>>>> * fixed 2-slot types support: wrong local indexes were used
>>>>>>
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>>>>
>>>>>>
>>>>>> * MHI.unboxResultHandle misses unboxing step for Q-types
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>>>>
>>>>>> * had to add a specialization for arraylength on Q-typed
>>>>>> array, because C2 can't handle arraylength Object[] when Q[]
>>>>>> instance is on stack;
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>>>>>>
>>>>>>
>>>>>> * changes required to work around the goto_() bug
>>>>>>
>>>>>> * fixed CP patching support
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>>>>>
>>>>>>
>>>>>> * fixed CP patching support
>>>>>>
>>>>>>
>>>>>> hotspot/src/share/vm/memory/metachunk.hpp
>>>>>> hotspot/src/share/vm/oops/method.hpp
>>>>>> Fix optimized JVM build.
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> [1]
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>
>>>>>>
>>>>>> ;; B7: # B8 <- B9 top-of-loop
>>>>>>
>>>>>> mov %r9,0x10(%rsp)
>>>>>> mov 0x8(%rsp),%r11d
>>>>>>
>>>>>> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>>>>>>
>>>>>> mov %r11d,0x8(%rsp)
>>>>>> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
>>>>>> ...
>>>>>> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>>>>>>
>>>>>> ;; B9: # B7 B10 <- B8
>>>>>>
>>>>>> mov %rax,%r9
>>>>>> mov 0x10(%rsp),%r11
>>>>>> mov 0x10(%r11),%r10
>>>>>> mov 0x18(%r11),%r11
>>>>>> movslq %ebp,%r8
>>>>>> shl $0x4,%r8
>>>>>> mov (%rsp),%rcx
>>>>>> mov 0x18(%rcx,%r8,1),%rax
>>>>>> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
>>>>>> add %r11,%rax ;*ladd
>>>>>> mov %rax,0x18(%r9)
>>>>>> add %r10,%r8 ;*ladd
>>>>>> mov %r8,0x10(%r9)
>>>>>> inc %ebp ;*iadd
>>>>>> cmp 0x8(%rsp),%ebp
>>>>>> jl <B7> ;*if_icmpge
>>>>>>
>>>>>> [2] Pruned EA output (full output in the log)
>>>>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>>>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F
>>>>>> 1579F [ 1259 1264 532 1525 ]] 1247 Allocate
>>>>>> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
>>>>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264 CheckCastPP
>>>>>> ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
>>>>>> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
>>>>>> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
>>>>>> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
>>>>>> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
>>>>>> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319
>>>>>> 324 532 1525 ]] 307 Allocate
>>>>>> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
>>>>>> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> Scalar 1153 CheckCastPP ... #valuetype*
>>>>>> valhalla/vector/Long2$Value:NotNull
>>>>>> ++++ Eliminated: 1136 Allocate
>>>>>> Scalar 1041 Allocate ...
>>>>>> ++++ Eliminated: 1041 Allocate
>>>>>> Scalar 650 CheckCastPP ... #valuetype*
>>>>>> valhalla/vector/Long2$Value:NotNull
>>>>>> ++++ Eliminated: 633 Allocate
>>>>>
>>>
>