MVT-based vectors: first stab
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Jun 28 15:20:50 UTC 2017
Best regards,
Vladimir Ivanov
On 6/28/17 6:00 PM, Maurizio Cimadamore wrote:
>
>
> On 28/06/17 15:36, Vladimir Ivanov wrote:
>>
>>
>> On 6/28/17 5:08 PM, Maurizio Cimadamore wrote:
>>> Great work, Vlad; I have two questions before I dive into more details:
>>>
>>> * why did you need to handle arrayLength() specially? Since an array
>>> is a reference, shouldn't it work already?
>>
>> It's a workaround for C2 crashes I observed when working with the
>> arraylength method handle on Q-typed arrays. The problem for the
>> compiler is that the implementation uses
>> ArrayAccessor.lengthL(Object[]), so at the bytecode level there's a
>> conflict between the declared & actual types (DVT[] doesn't extend
>> Object[]).
>>
>> IMO the proper fix would be to have an "erased" version for such cases
>> (or replace all primitive array specializations with a single "erased"
>> version):
>>
>>   static int lengthL(java.lang.Object);
>>     0: aload_0
>>     1: arraylength
>>     2: ireturn
>>
>> But it can't be expressed in the Java language and requires bytecode
>> generation.
> OK - in other words, we need to specialize array length, because of the
> missing subtyping edge. Ugh.
Yes. Moreover, I've just checked, and the verifier complains about
non-array types on the stack for the arraylength instruction:
Caused by: java.lang.VerifyError: Bad type on operand stack in arraylength
Exception Details:
  Location:
    Test.length(Ljava/lang/Object;)I @1: arraylength
  Reason:
    Invalid type: 'java/lang/Object' (current frame, stack[0])
  Current Frame:
    bci: @1
    flags: { }
    locals: { 'java/lang/Object' }
    stack: { 'java/lang/Object' }
So it seems that, longer term, the only alternative to specialization is Unsafe.
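
For ordinary arrays (not the Q-typed arrays of the MVT prototype), the trade-off can be sketched in plain Java. java.lang.reflect.Array.getLength(Object) plays the role of an "erased" length, while MethodHandles.arrayLength(Class) (JDK 9+) yields a handle specialized to one array class. ArrayLengthSketch and its method names are made up for illustration; the sketch says nothing about whether either path handles DVT[].

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Array;

// Hypothetical helper, not part of the webrev.
final class ArrayLengthSketch {
    // "Erased" variant: accepts any array typed as Object.
    // Throws IllegalArgumentException if the argument is not an array.
    static int erasedLength(Object array) {
        return Array.getLength(array);
    }

    // Specialized variant: one method handle per concrete array class.
    static int specializedLength(Object array) {
        MethodHandle length = MethodHandles.arrayLength(array.getClass());
        try {
            // invoke() adapts the Object argument to the handle's array type.
            return (int) length.invoke(array);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The reflective variant keeps a single entry point at the cost of going through reflection; the method-handle variant needs one handle per array class, which is the specialization discussed above.
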
>>> * Can you explain to me exactly what you were trying to achieve with
>>> state.push/state.pop in the intrinsic generation? Typically the state
>>> is updated through opcodes - so, assuming you are generating
>>> bytecodes that make sense, you should not need this - unless there
>>> are bugs in the code builder (which is possible!)
>>
>> The following sequence usually hits the bug in the code builder:
>>   builder.goto_("l1");
>>   builder.label("l2");
>>
>> goto_() doesn't wipe the state, but the state at l2 doesn't relate to
>> the state when jumping to l1. It usually manifests as stack depth or
>> type mismatches when trying to merge the states in builder.label().
> I see - that's a well-known issue - the builder currently doesn't handle
> reachability. In your case, the builder should realize that code after
> 'goto' is dead, so it should just try to merge an empty state with the
> state of the code that is jumping to "l2" (of course, this assumes that
> there's no other subsequent backward jump to "l2"; otherwise the simple
> stackmap analysis implemented here would fail and you'll need some
> fixpoint calculation a la ASM).
Yeah, backward jumps cause issues for one-pass solutions.
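
The reachability handling described above can be sketched with a toy one-pass tracker. This is a hypothetical model, not the jdk.experimental.bytecode API: it tracks only operand stack types as strings, treats the state after goto_() as dead, and at a label adopts the state recorded by an earlier forward jump. Backward jumps are deliberately not supported, which is where a fixpoint computation would be needed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; not the jdk.experimental.bytecode API.
final class ReachabilitySketch {
    // Operand stack types at the current position; null means "unreachable".
    private List<String> stack = new ArrayList<>();
    // Stack states recorded by forward jumps, keyed by target label.
    private final Map<String, List<String>> pending = new HashMap<>();

    void push(String type) { alive().add(type); }

    void pop() { alive().remove(alive().size() - 1); }

    // Conditional jump: consumes one value; the fall-through stays reachable.
    void ifeq(String label) {
        pop();
        record(label);
    }

    // Unconditional jump: everything after it is dead until the next label.
    void goto_(String label) {
        record(label);
        stack = null;
    }

    void label(String label) {
        List<String> incoming = pending.remove(label);
        if (stack == null) {
            if (incoming == null) {
                throw new IllegalStateException(label + " has no recorded forward jump");
            }
            stack = new ArrayList<>(incoming); // adopt the jump's state
        } else if (incoming != null && !incoming.equals(stack)) {
            throw new IllegalStateException("stack mismatch at " + label);
        }
    }

    // Returns a copy of the current stack, or null if the position is dead.
    List<String> currentStack() {
        return stack == null ? null : new ArrayList<>(stack);
    }

    private List<String> alive() {
        if (stack == null) throw new IllegalStateException("unreachable code");
        return stack;
    }

    private void record(String label) {
        List<String> snapshot = new ArrayList<>(alive());
        List<String> previous = pending.putIfAbsent(label, snapshot);
        if (previous != null && !previous.equals(snapshot)) {
            throw new IllegalStateException("stack mismatch at " + label);
        }
    }
}
```

With this model, a goto_("l1") followed directly by label("l2") no longer tries to merge the pre-goto stack into "l2"; the label simply picks up whatever an earlier forward jump to "l2" recorded.
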
Best regards,
Vladimir Ivanov
>>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>>> Hi,
>>>>
>>>> I started playing with MVT-based vectors recently and want to share
>>>> my experience so far.
>>>>
>>>> For the first step, I decided to avoid specific treatment of
>>>> super-longs (Long2/4/8) in the JVM we have in Panama and just mark
>>>> them as VCCs for now:
>>>>   @DeriveValueType
>>>>   final class Long2 { public final long lo, hi; }
>>>>
>>>> As a test case, I chose the sum of array elements:
>>>>
>>>>   long sum(QLong2[] va) {
>>>>     QLong2 v = QLong2.default; // (0,0)
>>>>     for (int i = 0; i < va.length; i++) {
>>>>       QLong2 vi = va[i];
>>>>       v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>>     }
>>>>     return v.lo + v.hi;
>>>>   }
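
For comparison, the same loop can be written against an ordinary reference class in plain Java. BoxedLong2 and BoxedSumSketch are hypothetical stand-ins for the derived value type; unlike a Q-type, every new BoxedLong2(...) here is a heap allocation unless the JIT manages to remove it.

```java
// Hypothetical reference-class stand-in for the Long2 value type.
final class BoxedLong2 {
    final long lo, hi;

    BoxedLong2(long lo, long hi) {
        this.lo = lo;
        this.hi = hi;
    }
}

final class BoxedSumSketch {
    // Same shape as the Q-typed loop: a fresh accumulator per iteration.
    static long sum(BoxedLong2[] va) {
        BoxedLong2 v = new BoxedLong2(0, 0);
        for (int i = 0; i < va.length; i++) {
            BoxedLong2 vi = va[i];
            v = new BoxedLong2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }
}
```
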
>>>>
>>>> Here are the changes:
>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>>
>>>> SUMMARY & RESULTS
>>>>
>>>> I hit some bugs along the way and had to extend the code for Q-types
>>>> in some places to get a decent code shape generated. (I'll shepherd
>>>> the fixes into the repo separately.) I ended up with the following
>>>> generated code:
>>>>
>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>
>>>>
>>>> Unfortunately, there are still 2 allocations left in the generated
>>>> code (out of 5 initially). C2 can't scalarize the accumulator (v)
>>>> inside the loop and has to put the updated components (lo & hi) in an
>>>> allocated heap buffer on every iteration over the array (generated
>>>> code [1], C2 EA output [2]).
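
What full scalarization of the accumulator would amount to can be approximated by hand: keep lo and hi in two long locals and never materialize the intermediate vector. The sketch below assumes a flat long[] with interleaved lo/hi pairs as a stand-in for the Q-typed array; that layout and the class name are assumptions for illustration, not taken from the webrev.

```java
// Hypothetical hand-scalarized version of the sum loop.
final class ScalarizedSumSketch {
    // Assumed layout: flat[2*i] is element i's lo, flat[2*i + 1] is its hi.
    static long sum(long[] flat) {
        if (flat.length % 2 != 0) {
            throw new IllegalArgumentException("expected interleaved lo/hi pairs");
        }
        long lo = 0, hi = 0; // the accumulator lives in two locals only
        for (int i = 0; i < flat.length; i += 2) {
            lo += flat[i];
            hi += flat[i + 1];
        }
        return lo + hi;
    }
}
```
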
>>>>
>>>>
>>>> GORY DETAILS
>>>>
>>>> hotspot/src/share/vm/opto/type.cpp
>>>>
>>>> * Type::get_typeflow_type() doesn't distinguish between
>>>> TypeValueTypePtr & TypeValueType
>>>>
>>>>
>>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>>
>>>> * introduced some LF intrinsics, e.g. LOOP and SELECT_ALTERNATIVE;
>>>>
>>>> * had to work around control flow-related bugs in the bytecode
>>>> library (jdk.experimental.bytecode): broken JVM state after goto_()
>>>> (see FIXME marks in the code);
>>>>
>>>> * fixed 2-slot types support: wrong local indexes were used
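
On the 2-slot point: the JVM gives long and double values two consecutive local variable slots, so a parameter's local index is not simply its position in the signature. A minimal sketch of the index computation for a static method follows; LocalSlotsSketch is a hypothetical helper, not code from LambdaFormBuilder.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper illustrating local variable slot assignment.
final class LocalSlotsSketch {
    // Returns the first local slot of each parameter of a static method.
    static int[] slotIndexes(MethodType type) {
        int[] slots = new int[type.parameterCount()];
        int next = 0;
        for (int i = 0; i < slots.length; i++) {
            slots[i] = next;
            Class<?> p = type.parameterType(i);
            // long and double occupy two slots; everything else occupies one.
            next += (p == long.class || p == double.class) ? 2 : 1;
        }
        return slots;
    }
}
```

For an instance method the receiver occupies slot 0, so every index would shift by one.
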
>>>>
>>>>
>>>>
>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>>
>>>> * MHI.unboxResultHandle misses the unboxing step for Q-types
>>>>
>>>>
>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>>
>>>> * had to add a specialization for arraylength on Q-typed arrays,
>>>> because C2 can't handle arraylength on Object[] when a Q[] instance
>>>> is on the stack;
>>>>
>>>>
>>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>>>>
>>>>
>>>> * changes required to work around the goto_() bug
>>>>
>>>> * fixed CP patching support
>>>>
>>>>
>>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>>>
>>>>
>>>> * fixed CP patching support
>>>>
>>>>
>>>> hotspot/src/share/vm/memory/metachunk.hpp
>>>> hotspot/src/share/vm/oops/method.hpp
>>>>
>>>> * fixed optimized JVM build
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1]
>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>
>>>>
>>>> ;; B7: # B8 <- B9 top-of-loop
>>>>
>>>> mov %r9,0x10(%rsp)
>>>> mov 0x8(%rsp),%r11d
>>>>
>>>> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>>>>
>>>> mov %r11d,0x8(%rsp)
>>>> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
>>>> ...
>>>> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>>>>
>>>> ;; B9: # B7 B10 <- B8
>>>>
>>>> mov %rax,%r9
>>>> mov 0x10(%rsp),%r11
>>>> mov 0x10(%r11),%r10
>>>> mov 0x18(%r11),%r11
>>>> movslq %ebp,%r8
>>>> shl $0x4,%r8
>>>> mov (%rsp),%rcx
>>>> mov 0x18(%rcx,%r8,1),%rax
>>>> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
>>>> add %r11,%rax ;*ladd
>>>> mov %rax,0x18(%r9)
>>>> add %r10,%r8 ;*ladd
>>>> mov %r8,0x10(%r9)
>>>> inc %ebp ;*iadd
>>>> cmp 0x8(%rsp),%ebp
>>>> jl <B7> ;*if_icmpge
>>>>
>>>> [2] Pruned EA output (full output in the log)
>>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F
>>>> 1579F [ 1259 1264 532 1525 ]] 1247 Allocate
>>>> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
>>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264 CheckCastPP
>>>> ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>
>>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
>>>> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
>>>> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>
>>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
>>>> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>>>>
>>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
>>>> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
>>>> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>
>>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324
>>>> 532 1525 ]] 307 Allocate
>>>> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
>>>> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>
>>>> Scalar 1153 CheckCastPP ... #valuetype*
>>>> valhalla/vector/Long2$Value:NotNull
>>>> ++++ Eliminated: 1136 Allocate
>>>> Scalar 1041 Allocate ...
>>>> ++++ Eliminated: 1041 Allocate
>>>> Scalar 650 CheckCastPP ... #valuetype*
>>>> valhalla/vector/Long2$Value:NotNull
>>>> ++++ Eliminated: 633 Allocate
>>>
>
More information about the valhalla-dev mailing list