CR for RFR 8153998

Christian Thalinger christian.thalinger at oracle.com
Thu Apr 14 22:35:03 UTC 2016


> On Apr 14, 2016, at 8:44 AM, Berg, Michael C <michael.c.berg at intel.com> wrote:
> 
> Christian,
>  
> There is, but I would have to anchor it via a GPR register. Instead, I am treating it like nop emits (yet another side effect), with some guidance for placement.

That’s unfortunate but I understand.  I’m fine with it then.

>  
> Regards,
> Michael
>  
>  
> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] 
> Sent: Thursday, April 14, 2016 11:20 AM
> To: Berg, Michael C <michael.c.berg at intel.com>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: CR for RFR 8153998
>  
>  
> On Apr 13, 2016, at 11:35 AM, Berg, Michael C <michael.c.berg at intel.com> wrote:
>  
> See below for context.
>  
> Regards,
> Michael
>  
> From: Christian Thalinger [mailto:christian.thalinger at oracle.com]
> Sent: Wednesday, April 13, 2016 2:08 PM
> To: Berg, Michael C <michael.c.berg at intel.com>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: CR for RFR 8153998
>  
>  
> On Apr 12, 2016, at 8:26 PM, Berg, Michael C <michael.c.berg at intel.com> wrote:
>  
> Hi Folks,
> 
> I would like to contribute Programmable SIMD as implemented on multi-versioned post loops.  See https://bugs.openjdk.java.net/browse/JDK-8151573 for the first half of the implementation.
> This component delivers mask-programmed post loops, which execute in a single iteration, in place of the scalar fixup loops that used to take many iterations to complete the remaining work of user loops.
> Currently I have enabled this optimization for x86 only, specifically for machines with masked data predication implemented as per fully enabled EVEX targets.  It delivers up to 2x performance and has been modeled over a large number of loop lengths and forms of loops.
>  
> This code was tested as follows (see jbs entry below):
> 
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8153998
> 
> webrev:
> http://cr.openjdk.java.net/~mcberg/8153998/webrev.01a/
>  
> +//------------------------------MachMskNode-----------------------------------
> +// Machine function Msk Node
> +class MachMskNode : public MachIdealNode {
> Does “Msk” mean mask?  Then we should call it MachMaskNode.
>  
> <MCB> Ok, that’s easy enough.
>  
> Also, I don’t quite understand why we have:
> +instruct set_mask(rRegI dst, rRegI src) %{
> +  predicate(VM_Version::supports_avx512vl());
> +  match(Set dst (MaskCreateI src));
> +  effect(TEMP dst);
> +  format %{ "createmsk   $dst, $src" %}
> +  ins_encode %{
> +    __ createmsk($dst$$Register, $src$$Register);
> +  %}
> but:
> +  void MachMskNode::emit(CodeBuffer &cbuf, PhaseRegAlloc*) const {
> +    MacroAssembler _masm(&cbuf);
> +    __ restoremsk();
> +  }
>  
> The reason: Currently k registers or mask registers are not allocated, meaning we have to treat their usage as side effects.
> The case with set_mask is to take our remaining iterations as an index and provide a mask used in k1 that is applicable to its post loop.
> The subsequent restore places the default value back into k1.  The set_mask rule is posed in such a way that it ensures that the side effect value will survive optimization.
> The restore is fully a side effect with no produced definition in rule space as mask registers are not formal definitions.
>  
> Hmm.  So, there is no way we can have a RestoreMaskINode?
>  
> Thanks,
> Michael
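
For readers following the thread, the set_mask / restore pairing described in the quoted explanation can be modeled roughly as below. This is only a scalar C++ sketch, not HotSpot code: the names Mask16, kDefaultMask, create_mask, masked_add and post_loop are hypothetical, and both the 16-lane width and the all-ones default value for k1 are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a 16-lane mask register: one bit per 32-bit lane.
using Mask16 = std::uint16_t;

// Assumed default value of k1: all lanes enabled.
constexpr Mask16 kDefaultMask = 0xFFFF;

// Turn the number of remaining iterations into a lane mask
// (lowest `remaining` bits set), in the spirit of set_mask/createmsk.
constexpr Mask16 create_mask(unsigned remaining) {
  return remaining >= 16 ? kDefaultMask
                         : static_cast<Mask16>((1u << remaining) - 1u);
}

// One masked "vector" add: every lane is visited, but only lanes whose
// mask bit is set are read and written.
inline void masked_add(int* dst, const int* a, const int* b, Mask16 k) {
  for (unsigned lane = 0; lane < 16; ++lane) {
    if (k & (1u << lane)) {
      dst[lane] = a[lane] + b[lane];
    }
  }
}

// Post loop: a single masked pass replaces up to 15 scalar fixup
// iterations. Returns the value left in the modeled k1 afterwards.
inline Mask16 post_loop(int* dst, const int* a, const int* b,
                        unsigned remaining) {
  Mask16 k1 = create_mask(remaining);  // program the mask (side effect on k1)
  masked_add(dst, a, b, k1);           // single masked iteration
  k1 = kDefaultMask;                   // restore the default (also a side effect)
  return k1;
}
```

In this model the mask is an ordinary variable, so the compiler tracks it for free. The point made in the thread is that k registers are not allocated, which is why both the set and the restore have to be handled as side effects.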
