MVT-based vectors: first stab
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Jul 3 12:41:42 UTC 2017
Ok,
I will file other issues as needed and start submitting reviews for your
changes.
Thanks
Maurizio
On 03/07/17 12:46, Vladimir Ivanov wrote:
>> I was playing with some tests and your JDK patches seem to fix at
>> least 2 problems I've encountered when testing:
>
> Glad to hear that. Thanks for testing!
>
> I assume those are corresponding bugs I filed:
>
>> * issues with double slots in lambda forms (wrong indices being
>> generated by LambdaFormBuilder)
>
> https://bugs.openjdk.java.net/browse/JDK-8183131
> core-libs/j.l.i [MVT] 2-slot type (long & double) support is broken
> in LambdaFormBuilder
>
>> * issues with (un)boxing of method handles leading to bogus method
>> call results
> https://bugs.openjdk.java.net/browse/JDK-8183132
> core-libs/j.l.i [MVT] MHI.unboxResultHandle misses unboxing step for
> Q-types
>
>> Overall, I think your patches make the situation a lot better -
>> what's the plan to check this code in? Do you want to make other
>> iterations at it? Or can we start the process?
>
> On the JDK side, the changes are pretty sound in their current shape. I
> planned to shepherd the fixes into the repo, but I have neither done
> extensive testing nor finalized the actual fixes. Feel free to take care
> of them.
>
> My near-term plans are: (1) implement missing LF intrinsics and (2)
> look into box elimination issues on the JIT-compiler side.
>
> Best regards,
> Vladimir Ivanov
>
>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>> Hi,
>>>
>>> I started playing with MVT-based vectors recently and want to share
>>> my experience so far.
>>>
>>> For the first step, I decided to avoid specific treatment of
>>> super-longs (Long2/4/8) in the JVM we have in Panama and just mark
>>> them as VCCs for now:
>>> @DeriveValueType
>>> final class Long2 { public final long lo, hi; }
>>>
>>> As a test case I chose sum of array elements:
>>>
>>> long sum(QLong2[] va) {
>>>   QLong2 v = QLong2.default; // (0,0)
>>>   for (int i = 0; i < va.length; i++) {
>>>     QLong2 vi = va[i];
>>>     v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>   }
>>>   return v.lo + v.hi;
>>> }
>>>
>>> Here are the changes:
>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>
>>> SUMMARY & RESULTS
>>>
>>> I hit some bugs along the way and had to extend the code for Q-types
>>> in some places to get decent code shape generated. (I'll shepherd
>>> the fixes in the repo separately.) I ended up with the following
>>> generated code:
>>>
>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>
>>>
>>> Unfortunately, there are still 2 allocations left in the generated
>>> code (out of 5 initially). C2 can't scalarize the accumulator (v)
>>> inside the loop and has to put the updated components (lo & hi) in an
>>> allocated heap buffer on every iteration over the array (generated
>>> code [1], C2 EA output [2]).
>>>
>>>
>>> GORY DETAILS
>>>
>>> hotspot/src/share/vm/opto/type.cpp
>>>
>>> * Type::get_typeflow_type() doesn't distinguish between
>>> TypeValueTypePtr & TypeValueType
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>
>>> * introduced some LF intrinsics, e.g. LOOP and SELECT_ALTERNATIVE;
>>>
>>> * had to work around control flow-related bugs in the bytecode
>>> library (jdk.experimental.bytecode): broken JVM state after goto_()
>>> (see FIXME marks in the code);
>>>
>>> * fixed 2-slot types support: wrong local indexes were used
>>>
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>
>>> * MHI.unboxResultHandle misses unboxing step for Q-types
>>>
>>>
>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>
>>> * had to add a specialization for arraylength on a Q-typed array,
>>> because C2 can't handle arraylength Object[] when a Q[] instance is
>>> on the stack;
>>>
>>>
>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>>>
>>>
>>> * changes required to work around the goto_() bug
>>>
>>> * fixed CP patching support
>>>
>>>
>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>>
>>>
>>> * fixed CP patching support
>>>
>>>
>>> hotspot/src/share/vm/memory/metachunk.hpp
>>> hotspot/src/share/vm/oops/method.hpp
>>> Fix optimized JVM build.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1]
>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>
>>>
>>> ;; B7: # B8 <- B9 top-of-loop
>>>
>>> mov %r9,0x10(%rsp)
>>> mov 0x8(%rsp),%r11d
>>>
>>> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>>>
>>> mov %r11d,0x8(%rsp)
>>> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
>>> ...
>>> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>>>
>>> ;; B9: # B7 B10 <- B8
>>>
>>> mov %rax,%r9
>>> mov 0x10(%rsp),%r11
>>> mov 0x10(%r11),%r10
>>> mov 0x18(%r11),%r11
>>> movslq %ebp,%r8
>>> shl $0x4,%r8
>>> mov (%rsp),%rcx
>>> mov 0x18(%rcx,%r8,1),%rax
>>> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
>>> add %r11,%rax ;*ladd
>>> mov %rax,0x18(%r9)
>>> add %r10,%r8 ;*ladd
>>> mov %r8,0x10(%r9)
>>> inc %ebp ;*iadd
>>> cmp 0x8(%rsp),%ebp
>>> jl <B7> ;*if_icmpge
>>>
>>> [2] Pruned EA output (full output in the log)
>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F
>>> 1579F [ 1259 1264 532 1525 ]] 1247 Allocate
>>> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264 CheckCastPP
>>> ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
>>> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
>>> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
>>> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>>>
>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
>>> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
>>> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319 324
>>> 532 1525 ]] 307 Allocate
>>> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
>>> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>
>>> Scalar 1153 CheckCastPP ... #valuetype*
>>> valhalla/vector/Long2$Value:NotNull
>>> ++++ Eliminated: 1136 Allocate
>>> Scalar 1041 Allocate ...
>>> ++++ Eliminated: 1041 Allocate
>>> Scalar 650 CheckCastPP ... #valuetype*
>>> valhalla/vector/Long2$Value:NotNull
>>> ++++ Eliminated: 633 Allocate
>>
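
For readers without an MVT build, the sum-of-elements test case quoted above can be approximated in plain Java. This is only a sketch under stated assumptions: Long2 here is an ordinary final class rather than a value-capable class, QLong2 and QLong2.default are replaced by regular object construction, and the names Long2SumSketch, sumBoxed and sumScalarized are made up for illustration. The second method shows, by hand, the shape that scalarizing the accumulator would ideally give, with lo and hi kept in locals instead of a heap buffer.

```java
// Plain-Java stand-in for the MVT test case; all names are illustrative only.
public class Long2SumSketch {

    /** Ordinary immutable class standing in for the value-capable class Long2. */
    static final class Long2 {
        final long lo, hi;

        Long2(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }
    }

    /** Same shape as the quoted sum(): a new Long2 accumulator is built on every iteration. */
    static long sumBoxed(Long2[] va) {
        Long2 v = new Long2(0, 0); // plays the role of QLong2.default
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    /** Hand-scalarized variant: the accumulator lives in two long locals, so the loop body allocates nothing. */
    static long sumScalarized(Long2[] va) {
        long lo = 0, hi = 0;
        for (int i = 0; i < va.length; i++) {
            lo += va[i].lo;
            hi += va[i].hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 10), new Long2(2, 20), new Long2(3, 30) };
        System.out.println(sumBoxed(va));      // prints 66
        System.out.println(sumScalarized(va)); // prints 66
    }
}
```

Both methods return the same value; they differ only in whether the source-level loop body creates a new accumulator object on each iteration.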