MVT-based vectors: first stab
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Jul 5 14:39:43 UTC 2017
All JDK changes have been reviewed and pushed - thanks for the reviews.
We still need to submit changes for HS (the issue with type.cpp is
preventing Vlad's test from running correctly with JIT enabled). Can
somebody from the VM team please look into that?
Cheers
Maurizio
On 03/07/17 22:01, Vladimir Ivanov wrote:
> Thanks a lot, Maurizio!
>
> Best regards,
> Vladimir Ivanov
>
> On 7/3/17 5:29 PM, Maurizio Cimadamore wrote:
>> All reviews have been submitted - once the code is in, I plan to push
>> the tests I have which were stressing some of the issues fixed by
>> these patches.
>>
>> Cheers
>> Maurizio
>>
>>
>> On 03/07/17 13:41, Maurizio Cimadamore wrote:
>>> Ok,
>>> I will file other issues as needed and start submitting reviews for
>>> your changes.
>>>
>>> Thanks
>>> Maurizio
>>>
>>>
>>> On 03/07/17 12:46, Vladimir Ivanov wrote:
>>>>> I was playing with some tests and your JDK patches seem to fix at
>>>>> least 2 problems I've encountered when testing:
>>>>
>>>> Glad to hear that. Thanks for testing!
>>>>
>>>> I assume those correspond to the bugs I filed:
>>>>
>>>>> * issues with double slots in lambda forms (wrong indices being
>>>>> generated by LambdaFormBuilder)
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8183131
>>>> core-libs/j.l.i [MVT] 2-slot type (long & double) support is
>>>> broken in LambdaFormBuilder
>>>>
>>>>> * issues with (un)boxing of method handles leading to bogus method
>>>>> call results
>>>> https://bugs.openjdk.java.net/browse/JDK-8183132
>>>> core-libs/j.l.i [MVT] MHI.unboxResultHandle misses unboxing step
>>>> for Q-types
>>>>
>>>>> Overall, I think your patches make the situation a lot better -
>>>>> what's the plan to check this code in? Do you want to make other
>>>>> iterations at it? Or can we start the process?
>>>>
>>>> On the JDK side, the changes are pretty sound in the current shape.
>>>> I planned to shepherd the fixes into the repo, but I have neither
>>>> done extensive testing nor finalized the actual fixes. Feel free
>>>> to take care of them.
>>>>
>>>> My near-term plans are: (1) implement missing LF intrinsics and (2)
>>>> look into box elimination issues on JIT-compiler side.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>>> On 28/06/17 14:17, Vladimir Ivanov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I started playing with MVT-based vectors recently and want to
>>>>>> share my experience so far.
>>>>>>
>>>>>> For the first step, I decided to avoid specific treatment of
>>>>>> super-longs (Long2/4/8) in the JVM we have in Panama and just
>>>>>> mark them as VCCs for now:
>>>>>> @DeriveValueType
>>>>>> final class Long2 { public final long lo, hi; }
>>>>>>
>>>>>> As a test case I chose sum of array elements:
>>>>>>
>>>>>> long sum(QLong2[] va) {
>>>>>>     QLong2 v = QLong2.default; // (0,0)
>>>>>>     for (int i = 0; i < va.length; i++) {
>>>>>>         QLong2 vi = va[i];
>>>>>>         v = QLong2(v.lo + vi.lo, v.hi + vi.hi);
>>>>>>     }
>>>>>>     return v.lo + v.hi;
>>>>>> }
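
For readers without an MVT build, the test case above can be approximated in plain Java. This is only a sketch under stated assumptions: an ordinary immutable class stands in for the Q-type, `new Long2(0, 0)` stands in for `QLong2.default`, and the class name `SumArraySketch` is made up for illustration.

```java
final class SumArraySketch {
    // Ordinary immutable class standing in for the Q-typed Long2 value
    // (assumption: no value-type support is available).
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // Mirrors sum(QLong2[]): a fresh accumulator is built on every iteration.
    static long sum(Long2[] va) {
        Long2 v = new Long2(0, 0); // stands in for QLong2.default
        for (int i = 0; i < va.length; i++) {
            Long2 vi = va[i];
            v = new Long2(v.lo + vi.lo, v.hi + vi.hi);
        }
        return v.lo + v.hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4), new Long2(5, 6) };
        System.out.println(sum(va)); // prints 21
    }
}
```

The per-iteration `new Long2(...)` is the allocation the JIT is expected to remove.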
>>>>>>
>>>>>> Here are the changes:
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/webrev.00/
>>>>>>
>>>>>>
>>>>>> SUMMARY & RESULTS
>>>>>>
>>>>>> I hit some bugs along the way and had to extend the code for
>>>>>> Q-types in some places to get decent code shape generated. (I'll
>>>>>> shepherd the fixes in the repo separately.) I ended up with the
>>>>>> following generated code:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>
>>>>>>
>>>>>> Unfortunately, there are still 2 allocations left in the
>>>>>> generated code (out of 5 initially). C2 can't scalarize the
>>>>>> accumulator (v) inside the loop and has to store the updated
>>>>>> components (lo & hi) into a newly allocated heap buffer on every
>>>>>> iteration over the array (generated code [1], C2 EA output [2]).
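
To make "scalarizing the accumulator" concrete, here is a hand-written sketch of the loop shape one would hope to get once the accumulator no longer needs a heap buffer. It is an illustration in plain Java, not C2 output; the class `ScalarizedSumSketch` and its nested `Long2` stand-in are hypothetical.

```java
final class ScalarizedSumSketch {
    // Ordinary class standing in for the Q-typed Long2 value (assumption).
    static final class Long2 {
        final long lo, hi;
        Long2(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // Hand-scalarized accumulator: the two components live in locals,
    // so the loop body performs no allocation at all.
    static long sumScalarized(Long2[] va) {
        long lo = 0, hi = 0;
        for (int i = 0; i < va.length; i++) {
            lo += va[i].lo;
            hi += va[i].hi;
        }
        return lo + hi;
    }

    public static void main(String[] args) {
        Long2[] va = { new Long2(1, 2), new Long2(3, 4), new Long2(5, 6) };
        System.out.println(sumScalarized(va)); // prints 21
    }
}
```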
>>>>>>
>>>>>>
>>>>>> GORY DETAILS
>>>>>>
>>>>>> hotspot/src/share/vm/opto/type.cpp
>>>>>>
>>>>>> * Type::get_typeflow_type() doesn't distinguish between
>>>>>> TypeValueTypePtr & TypeValueType
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/LambdaFormBuilder.java
>>>>>>
>>>>>>
>>>>>> * introduced some LF intrinsics, e.g. LOOP and
>>>>>> SELECT_ALTERNATIVE;
>>>>>>
>>>>>> * had to work around control-flow-related bugs in the bytecode
>>>>>> library (jdk.experimental.bytecode): broken JVM state after
>>>>>> goto_() (see FIXME marks in the code);
>>>>>>
>>>>>> * fixed 2-slot types support: wrong local indexes were used
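
As background on the 2-slot issue: long and double each occupy two local-variable slots, so a local's index is not simply its position in the parameter list. The sketch below shows the index computation for the parameters of a static method. It is not the LambdaFormBuilder code; `LocalSlotSketch` and its methods are hypothetical names, and only the standard `java.lang.invoke.MethodType` API is used.

```java
import java.lang.invoke.MethodType;
import java.util.Arrays;

final class LocalSlotSketch {
    // long and double take two local-variable slots, everything else one.
    static int slotSize(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    // Local index of each parameter, assuming a static method (no receiver in slot 0).
    static int[] localIndexes(MethodType mt) {
        int[] indexes = new int[mt.parameterCount()];
        int next = 0;
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = next;
            next += slotSize(mt.parameterType(i));
        }
        return indexes;
    }

    public static void main(String[] args) {
        MethodType mt = MethodType.methodType(
                void.class, int.class, long.class, double.class, Object.class);
        System.out.println(Arrays.toString(localIndexes(mt))); // prints [0, 1, 3, 5]
    }
}
```

Using the parameter position directly (0, 1, 2, 3) would address the second half of the long instead of the double, which is the kind of wrong index the fix refers to.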
>>>>>>
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
>>>>>>
>>>>>>
>>>>>> * MHI.unboxResultHandle misses unboxing step for Q-types
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/ValueType.java
>>>>>>
>>>>>>
>>>>>> * had to add a specialization for arraylength on Q-typed
>>>>>> arrays, because C2 can't handle arraylength Object[] when a
>>>>>> Q[] instance is on the stack;
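
For context, the standard `MethodHandles.arrayLength(Class)` API (Java 9 and later) already yields a length handle specialized to a given array class. The sketch below uses it on an ordinary `long[]`; it does not reproduce the Q-typed specialization from the patch, and `ArrayLengthSketch` is a hypothetical name.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

final class ArrayLengthSketch {
    // Handle of type (long[])int, specialized to the exact array class.
    private static final MethodHandle LONG_ARRAY_LENGTH =
            MethodHandles.arrayLength(long[].class);

    static int length(long[] a) {
        try {
            return (int) LONG_ARRAY_LENGTH.invokeExact(a);
        } catch (Throwable t) {
            // invokeExact declares Throwable; wrap it so callers stay simple.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(length(new long[4])); // prints 4
    }
}
```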
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/bytecode/TypedCodeBuilder.java:
>>>>>>
>>>>>>
>>>>>> * changes required to work around the goto_() bug
>>>>>>
>>>>>> * fixed CP patching support
>>>>>>
>>>>>>
>>>>>> jdk/src/java.base/share/classes/jdk/experimental/value/MethodHandleBuilder.java
>>>>>>
>>>>>>
>>>>>> * fixed CP patching support
>>>>>>
>>>>>>
>>>>>> hotspot/src/share/vm/memory/metachunk.hpp
>>>>>> hotspot/src/share/vm/oops/method.hpp
>>>>>> Fix optimized JVM build.
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> [1]
>>>>>> http://cr.openjdk.java.net/~vlivanov/valhalla/vectors.mvt/sumArrayL2.log
>>>>>>
>>>>>>
>>>>>> ;; B7: # B8 <- B9 top-of-loop
>>>>>>
>>>>>> mov %r9,0x10(%rsp)
>>>>>> mov 0x8(%rsp),%r11d
>>>>>>
>>>>>> ;; B8: # B16 B9 <- B6 B7 Loop: B8-B7 inner
>>>>>>
>>>>>> mov %r11d,0x8(%rsp)
>>>>>> movabs $0x7c0060420,%rsi ; {metadata('valhalla/vector/Long2$Value')}
>>>>>> ...
>>>>>> callq 0x0000000117579820 ; {runtime_call _new_instance_Java}
>>>>>>
>>>>>> ;; B9: # B7 B10 <- B8
>>>>>>
>>>>>> mov %rax,%r9
>>>>>> mov 0x10(%rsp),%r11
>>>>>> mov 0x10(%r11),%r10
>>>>>> mov 0x18(%r11),%r11
>>>>>> movslq %ebp,%r8
>>>>>> shl $0x4,%r8
>>>>>> mov (%rsp),%rcx
>>>>>> mov 0x18(%rcx,%r8,1),%rax
>>>>>> mov 0x10(%rcx,%r8,1),%r8 ;*vaload
>>>>>> add %r11,%rax ;*ladd
>>>>>> mov %rax,0x18(%r9)
>>>>>> add %r10,%r8 ;*ladd
>>>>>> mov %r8,0x10(%r9)
>>>>>> inc %ebp ;*iadd
>>>>>> cmp 0x8(%rsp),%ebp
>>>>>> jl <B7> ;*if_icmpge
>>>>>>
>>>>>> [2] Pruned EA output (full output in the log)
>>>>>> ======== Connection graph for valhalla.vector.VectorTest::sumArrayL2
>>>>>> JavaObject NoEscape(NoEscape) NSR [ 1267F 1265F 756F 754F 1581F
>>>>>> 1579F [ 1259 1264 532 1525 ]] 1247 Allocate
>>>>>> LocalVar [ 1247P [ 1264 ]] 1259 Proj ... #5
>>>>>> LocalVar [ 1259 1247P [ 1267b 1265b 532 1525 ]] 1264
>>>>>> CheckCastPP ... #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ 1154F [ 1148 1153 ]] 1136 Allocate
>>>>>> LocalVar [ 1136P [ 1153 ]] 1148 Proj ... #5
>>>>>> LocalVar [ 1148 1136P [ 1154b ]] 1153 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ [ 1053 ]] 1041 Allocate
>>>>>> LocalVar [ 1041P [ ]] 1053 Proj ... #5
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) [ 653F 651F [ 645 650 ]] 633 Allocate
>>>>>> LocalVar [ 633P [ 650 ]] 645 Proj ... #5
>>>>>> LocalVar [ 645 633P [ 653b 651b ]] 650 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> JavaObject NoEscape(NoEscape) NSR [ 756F 754F 1581F 1579F [ 319
>>>>>> 324 532 1525 ]] 307 Allocate
>>>>>> LocalVar [ 307P [ 324 ]] 319 Proj ... #5
>>>>>> LocalVar [ 319 307P [ 532 1525 ]] 324 CheckCastPP ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 324 1264 1247P 307P [ 756b 754b ]] 532 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>> LocalVar [ 1264 324 1247P 307P [ 1581b 1579b ]] 1525 Phi ...
>>>>>> #valuetype* valhalla/vector/Long2$Value:NotNull
>>>>>>
>>>>>> Scalar 1153 CheckCastPP ... #valuetype*
>>>>>> valhalla/vector/Long2$Value:NotNull
>>>>>> ++++ Eliminated: 1136 Allocate
>>>>>> Scalar 1041 Allocate ...
>>>>>> ++++ Eliminated: 1041 Allocate
>>>>>> Scalar 650 CheckCastPP ... #valuetype*
>>>>>> valhalla/vector/Long2$Value:NotNull
>>>>>> ++++ Eliminated: 633 Allocate