RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions

Mon Mar 7 22:27:28 UTC 2016

I filed JDK-8151409[1] to cover the reordering of the methods. For 
sanity it would probably make sense to do it separately from the removal 
of the nds checks as described in JDK-8151003[2].

Thanks for the reviews!

Cheers,
Mikael

[1] https://bugs.openjdk.java.net/browse/JDK-8151409
[2] https://bugs.openjdk.java.net/browse/JDK-8151003

On 3/7/2016 2:05 PM, Vladimir Kozlov wrote:
> Okay, please, file RFE then. You can push what you have now.
>
> Thanks,
> Vladimir
>
> On 3/7/16 1:21 PM, Mikael Vidstedt wrote:
>>
>> I agree with Mike that it would make sense to split the inserts and the
>> extracts, but can I please suggest that we do the rearranging as a
>> follow-up change?
>>
>> Cheers,
>> Mikael
>>
>> On 3/7/2016 1:17 PM, Berg, Michael C wrote:
>>> If we are going to do that, we might as well break the inserts and the
>>> extracts up sorting by size of operation in both files.
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Monday, March 07, 2016 12:16 PM
>>> To: Mikael Vidstedt; Berg, Michael C;
>>> hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and
>>> vinsert match actual instructions
>>>
>>> Changes looks good.
>>>
>>> In assembler_x86.cpp can you group v*128*, v*64*, v32* in separate 3
>>> groups as you have them in assembler_x86.hpp?
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 3/7/16 11:32 AM, Mikael Vidstedt wrote:
>>>> New webrev:
>>>>
>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/
>>>>
>>>> Unfortunately I think I messed up the incremental webrev, so in order
>>>> to not cause confusion I'm not including it here, sorry.
>>>>
>>>> Summary:
>>>>
>>>> I added pseudo instructions for inserting/extracting _low and _high
>>>> parts of the vector registers. Note that it only applies for the cases
>>>> where there is a clear low and high to speak of - that is, in the
>>>> cases where the instruction operates on the high or low *half* of a
>>>> register.
>>>> For instructions like vinsert32x4 (128bit copy to/from a 512 bit
>>>> register) there are four different sources/targets, so high and low
>>>> are not applicable, so there are no pseudo instructions for them.
>>>>
>>>> The macroAssembler methods now take a uint8_t as well, this was
>>>> accidentally left out in the last webrev.
>>>>
>>>> I kept the nds->is_valid() checks for now, cleaning that up is covered
>>>> by JDK-8151003[1].
>>>>
>>>> I also updated a couple of comments.
>>>>
>>>> Cheers,
>>>> Mikael
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8151003
>>>>
>>>>
>>>>
>>>> On 3/3/2016 1:16 PM, Vladimir Ivanov wrote:
>>>>> On 3/3/16 7:08 AM, Berg, Michael C wrote:
>>>>>> Vladimir (K), just for the time being as the problem isn't just
>>>>>> confined to these instructions (the nds issue). I have assigned the
>>>>>> bug below to myself and will take a holistic view over the issue in
>>>>>> its full context.
>>>>>>
>>>>>> The instructions modified in the webrev, like in the documentation
>>>>>> that exists regarding their definitions, are all programmable via
>>>>>> what is loosely labeled as the imm8 field in the formal
>>>>>> documentation.  I think we should leave them that way. The onus of
>>>>>> these changes was to make instructions look more like their ISA
>>>>>> manual definitions.  I think Vladimir Ivanov was saying, and please
>>>>>> chime in Vladimir if I do not interpret correctly, wasn't high/low,
>>>>>> it was leaving a signature like what we had in place in the macro
>>>>>> assembler, and invoking the precise names there.  I don't think that
>>>>>> is needed though, as the macro assembler's job is to interpret a
>>>>>> meaning and do a mapping.
>>>>> I'm all for the proposed change in Assembler.
>>>>>
>>>>> My point is that vmovdqu/vinserti128h/vextracti128h(...) are more
>>>>> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the
>>>>> code. So, keeping original functions in the MacroAssembler, but
>>>>> migrating them to new Assembler versions looks reasonable.
>>>>>
>>>>> But I can live with both variaints.
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>>> Regards,
>>>>>> Michael
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>> Sent: Wednesday, March 02, 2016 3:32 PM
>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>> Cc: Berg, Michael C
>>>>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and
>>>>>> vinsert match actual instructions
>>>>>>
>>>>>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote:
>>>>>>> After discussing with Vladimir off-list we agreed that changing the
>>>>>>> type
>>>>>> It was Vladimir Ivanov.
>>>>>>
>>>>>>> of the immediate (imm8) argument to uint8_t is both clearer, has
>>>>>>> the potential to catch incorrect uses of the functions, and also
>>>>>>> makes the asserts more straightforward. In addition to that
>>>>>>> Vladimir noted that I had accidentally included newline in the
>>>>>>> assert messages.
>>>>>>>
>>>>>>> New webrev:
>>>>>>>
>>>>>>> Full:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev
>>>>>>> /
>>>>>> I agree with Vladimir I. that we should have macroassembler
>>>>>> instructions vinserti128high, vinserti128low, etc. instead of
>>>>>> passing imm8. It is more informative.
>>>>>>
>>>>>> Also why we add new nds->is_valid() checks into assembler
>>>>>> instructions?
>>>>>> We are going to remove them:
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151003
>>>>>>
>>>>>> I know that Mikael had a discussion about this with Michael. So I
>>>>>> would like to see arguments here. Michael?
>>>>>>
>>>>>> Current code always pass correct registers and x86 Manual requires
>>>>>> to have valid registers.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>>> Incremental from webrev.01:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/w
>>>>>>> ebr
>>>>>>> ev/
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Mikael
>>>>>>>
>>>>>>> On 2016-03-02 13:12, Mikael Vidstedt wrote:
>>>>>>>> Updated webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webre
>>>>>>>> v/
>>>>>>>>
>>>>>>>> Incremental from webrev.00:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/
>>>>>>>> web
>>>>>>>> rev/
>>>>>>>>
>>>>>>>> Comments below...
>>>>>>>>
>>>>>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote:
>>>>>>>>> Nice cleanup, Mikael!
>>>>>>>>>
>>>>>>>>> src/cpu/x86/vm/assembler_x86.hpp:
>>>>>>>>>
>>>>>>>>> Outdated comments:
>>>>>>>>>       // Copy low 128bit into high 128bit of YMM registers.
>>>>>>>>>
>>>>>>>>>       // Load/store high 128bit of YMM registers which does not
>>>>>>>>> destroy other half.
>>>>>>>>>
>>>>>>>>>      // Copy low 256bit into high 256bit of ZMM registers.
>>>>>>>> Updated, thanks for catching!
>>>>>>>>
>>>>>>>>> src/cpu/x86/vm/assembler_x86.cpp:
>>>>>>>>>
>>>>>>>>> !   emit_int8(imm8 & 0x01);
>>>>>>>>>
>>>>>>>>> Maybe additionally assert valid imm8 range?
>>>>>>>> Good idea, I had added asserts earlier but removed them. I added
>>>>>>>> them back again!
>>>>>>>>
>>>>>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler?
>>>>>>>>> They look clearer in some contextes:
>>>>>>>>>
>>>>>>>>> -      __ vextractf128h(Address(rsp, base_addr+n*16),
>>>>>>>>> as_XMMRegister(n));
>>>>>>>>> +      __ vextractf128(Address(rsp, base_addr+n*16),
>>>>>>>>> as_XMMRegister(n), 1);
>>>>>>>> Can I suggest that we try to live without them for a while and see
>>>>>>>> how much we miss them? I think having it there may actually be
>>>>>>>> more confusing in many cases :)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Mikael
>>>>>>>>
>>>>>>>>> Otherwise, looks good.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote:
>>>>>>>>>> Please review the following change which updates the various
>>>>>>>>>> vextract* and vinsert* methods in assembler_x86 &
>>>>>>>>>> macroAssembler_x86 to better match the real HW instructions,
>>>>>>>>>> which also has the benefit of providing the full
>>>>>>>>>> functionality/flexibility of the instructions where earlier only
>>>>>>>>>> some specific modes were supported. Put differently, with this
>>>>>>>>>> change it's much easier to correlate the methods to the Intel
>>>>>>>>>> manual and understand what they actually do.
>>>>>>>>>>
>>>>>>>>>> Specifically, the vinsert* family of instructions take three
>>>>>>>>>> registers and an immediate which decide how the bits should be
>>>>>>>>>> shuffled around, but without this change the method only allowed
>>>>>>>>>> two of the registers to be specified, and the immediate was
>>>>>>>>>> hard-coded to 0x01.
>>>>>>>>>>
>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/web
>>>>>>>>>> rev
>>>>>>>>>> /
>>>>>>>>>>
>>>>>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and
>>>>>>>>>> test the change!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Mikael
>>>>>>>>>>
>>