CR for RFR 8153998

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Apr 14 23:26:59 UTC 2016


On 4/14/16 3:35 PM, Christian Thalinger wrote:
>
>> On Apr 14, 2016, at 8:44 AM, Berg, Michael C <michael.c.berg at intel.com> wrote:
>>
>> Christian,
>> There is, but I would have to anchor it via a GPR register. Instead I am treating it like nop emits (yet another side effect), with some guidance for placement.
>
> That’s unfortunate but I understand.  I’m fine with it then.

You can try to generate the mask restore on loop exit in the mach instruction for the CountedLoopEnd node with predicate(n->has_vect_mask_set()). The has_vect_mask_set flag should be set in the CountedLoopEnd node after
creation of the CreateMaskI node. Note that CreateMaskI will only be generated for a clean counted post loop with one vector iteration, so there should not be any other branches there.
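
A minimal sketch of the idea above, written as a toy C++ model rather than HotSpot code. ToyLoopEnd, mark_vect_mask_set() and emit_loop_end() are hypothetical names invented for illustration, and the emitted strings are placeholders, not real instruction encodings. It only shows the shape of the suggestion: the flag is set once the mask-creating node exists, and the loop-end emit appends the restore on the exit path only when that flag is set.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a CountedLoopEnd-like node (not the HotSpot class).
struct ToyLoopEnd {
    bool vect_mask_set = false;

    // Would be called right after the mask-creating node is built for this loop.
    void mark_vect_mask_set() { vect_mask_set = true; }

    // Mirrors the predicate proposed in the mail.
    bool has_vect_mask_set() const { return vect_mask_set; }
};

// Emit placeholder "instructions" for the loop end. The restore is appended
// after the branch, i.e. on the fall-through loop exit, only if the flag is set.
std::vector<std::string> emit_loop_end(const ToyLoopEnd& n) {
    std::vector<std::string> code;
    code.push_back("jcc loop_head");      // placeholder for the loop-end branch
    if (n.has_vect_mask_set()) {
        code.push_back("restoremsk");     // placeholder for the k1 restore
    }
    return code;
}
```

In this model an ordinary loop end produces only the branch, while a flagged post loop produces the branch followed by the restore, so no separate side-effect node is needed to carry the restore.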

Vladimir

>
>> Regards,
>> Michael
>> *From:* Christian Thalinger [mailto:christian.thalinger at oracle.com]
>> *Sent:* Thursday, April 14, 2016 11:20 AM
>> *To:* Berg, Michael C <michael.c.berg at intel.com>
>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>> *Subject:* Re: CR for RFR 8153998
>>
>>     On Apr 13, 2016, at 11:35 AM, Berg, Michael C <michael.c.berg at intel.com> wrote:
>>     See below for context.
>>     Regards,
>>     Michael
>>     *From:* Christian Thalinger [mailto:christian.thalinger at oracle.com]
>>     *Sent:* Wednesday, April 13, 2016 2:08 PM
>>     *To:* Berg, Michael C <michael.c.berg at intel.com>
>>     *Cc:* hotspot-compiler-dev at openjdk.java.net
>>     *Subject:* Re: CR for RFR 8153998
>>
>>         On Apr 12, 2016, at 8:26 PM, Berg, Michael C <michael.c.berg at intel.com> wrote:
>>         Hi Folks,
>>
>>         I would like to contribute Programmable SIMD as implemented on multi-versioned post loops. See https://bugs.openjdk.java.net/browse/JDK-8151573 for the first half of the implementation.
>>         This component delivers mask-programmed post loops which execute in a single iteration, in place of the fixup scalar loops which used to take many iterations to complete the work for user loops.
>>         Currently I have enabled this optimization for x86 only, specifically for machines with masked data predication implemented as per fully enabled EVEX targets.  It delivers up to 2x
>>         performance and has been modeled over a large number of loop lengths and forms of loops.
>>         This code was tested as follows (see the JBS entry below):
>>
>>         Bug-id: https://bugs.openjdk.java.net/browse/JDK-8153998
>>
>>         webrev:
>>         http://cr.openjdk.java.net/~mcberg/8153998/webrev.01a/
>>
>>     +//------------------------------MachMskNode-----------------------------------
>>     +// Machine function Msk Node
>>     +class MachMskNode : public MachIdealNode {
>>
>>     Does “Msk” mean mask?  Then we should call it MachMaskNode.
>>     <MCB> Ok, that’s easy enough.
>>     Also, I don’t quite understand why we have:
>>
>>     +instruct set_mask(rRegI dst, rRegI src) %{
>>     +  predicate(VM_Version::supports_avx512vl());
>>     +  match(Set dst (MaskCreateI src));
>>     +  effect(TEMP dst);
>>     +  format %{ "createmsk   $dst, $src" %}
>>     +  ins_encode %{
>>     +    __ createmsk($dst$$Register, $src$$Register);
>>     +  %}
>>
>>     but:
>>
>>     +  void MachMskNode::emit(CodeBuffer &cbuf, PhaseRegAlloc*) const {
>>     +    MacroAssembler _masm(&cbuf);
>>     +    __ restoremsk();
>>     +  }
>>
>>     The reason: currently k registers (mask registers) are not allocated, meaning we have to treat their usage as side effects.
>>     The set_mask rule takes our remaining iterations as an index and provides a mask in k1 that is applicable to its post loop.
>>     The subsequent restore places the default value back into k1.  The set_mask rule is posed in such a way that it ensures that the side effect value will survive optimization.
>>     The restore is purely a side effect with no produced definition in rule space, as mask registers are not formal definitions.
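
The set-then-restore sequence described above can be sketched in portable C++ as a rough illustration, not as the code in the webrev. MaskState, create_mask, masked_add and add_arrays are hypothetical names, the 16-lane width and the all-ones default mask are assumptions, and a plain integer stands in for the k1 register. The main loop runs with the default mask, the post loop computes a mask from the remaining iteration count, performs one masked iteration, and then puts the default value back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int kLanes = 16;                    // assumed: 16 int lanes per vector
constexpr std::uint32_t kAllLanes = 0xFFFFu;  // assumed default mask: all lanes on

// Toy stand-in for the unallocated mask register k1.
struct MaskState {
    std::uint32_t k1 = kAllLanes;
};

// Analogue of the set_mask rule: a mask with the low `remaining` bits set.
inline std::uint32_t create_mask(int remaining) {
    assert(remaining >= 0 && remaining <= kLanes);
    return (1u << remaining) - 1u;
}

// One vector-width iteration that only touches lanes enabled in the mask.
inline void masked_add(int* dst, const int* a, const int* b, std::uint32_t mask) {
    for (int lane = 0; lane < kLanes; ++lane) {
        if (mask & (1u << lane)) dst[lane] = a[lane] + b[lane];
    }
}

inline void add_arrays(MaskState& st, int* dst, const int* a, const int* b,
                       std::size_t n) {
    std::size_t i = 0;
    // Main vector loop: full-width iterations under the default mask.
    for (; i + kLanes <= n; i += kLanes) {
        masked_add(dst + i, a + i, b + i, st.k1);
    }
    // Post loop: a single masked iteration replaces the scalar fixup loop.
    if (i < n) {
        st.k1 = create_mask(static_cast<int>(n - i));  // program the mask
        masked_add(dst + i, a + i, b + i, st.k1);
        st.k1 = kAllLanes;                             // restore the default
    }
}
```

For example, with n = 20 the main loop handles elements 0 to 15, the post loop handles elements 16 to 19 under a mask of 0xF, and k1 holds 0xFFFF again on exit.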
>>
>> Hmm.  So, there is no way we can have a RestoreMaskINode?
>>
>>         Thanks,
>>         Michael
>

