RFR(L): 8074981 (Integer/FP scalar reduction optimization )

Wed Apr 1 17:40:33 UTC 2015

The push failed testing on Sparc with 64-bit fastdebug VM.
Michael, could you look what could go wrong there?
I will try to reproduce it on sparc.

hotspot/test/compiler/codegen/7100757/Test7100757.java

#  Internal Error 
(/opt/jprt/T/P1/031135.vkozlov/s/hotspot/src/share/vm/opto/superword.cpp:1742), 
pid=9157, tid=23
#  assert(_stk.length() == 0) failed: stk is empty
#

Stack: [0x0007fffeccc00000,0x0007fffeccd00000],  sp=0x0007fffecccf6780, 
  free space=985k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
C=native code)
V  [libjvm.so+0x15e78d4]  void VMError::report_and_die()+0x6c4
V  [libjvm.so+0xa762d0]  void report_vm_error(const char*,int,const 
char*,const char*)+0x70
V  [libjvm.so+0x14acd8c]  bool SuperWord::construct_bb()+0x4c
V  [libjvm.so+0x14a2704]  void SuperWord::SLP_extract()+0xc
V  [libjvm.so+0x10d8b6c]  void 
PhaseIdealLoop::build_and_optimize(bool,bool)+0x1374
V  [libjvm.so+0x9d9c70]  void Compile::Optimize()+0x24c8
V  [libjvm.so+0x9d0fdc]  Compile::Compile #Nvariant 
1(ciEnv*,C2Compiler*,ciMethod*,int,bool,bool,bool)+0x12fc
V  [libjvm.so+0x87595c]  void 
C2Compiler::compile_method(ciEnv*,ciMethod*,int)+0xf4

Current CompileTask:
C2:   1177  156       4       Test7100757::test (274 bytes)

Thanks,
Vladimir

On 3/31/15 5:37 PM, Vladimir Kozlov wrote:
>
> On 3/31/15 5:36 PM, Christian Thalinger wrote:
>>
>>> On Mar 30, 2015, at 7:50 PM, Vladimir Kozlov
>>> <vladimir.kozlov at oracle.com> wrote:
>>>
>>> On 3/30/15 7:20 PM, Berg, Michael C wrote:
>>>> Almost, it's more than that, there are missing components in long
>>>> support in AVX2, so we only allow what superword can currently
>>>> process safely and bypass the question of long support for
>>>> reductions until AVX3, where support is complete enough to allow
>>>> those forms of reductions.
>>>
>>> Okay.
>>>
>>>> Nils was the initial reviewer and sponsor, so Nils can you make
>>>> another pass and comment on the current webrev for the review.
>>>
>>> Nils is out for few days. Christian looked on this too, let him do
>>> second review.
>>
>> The only comment I have is this opening brace should be one the same
>> line:
>>
>> + void SuperWord::packset_sort(int n)
>> + {
>
> I will fix it before push.
>
> Thanks,
> Vladimir
>
>>
>> Otherwise this looks good.  Thanks.
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> Thanks,
>>>>
>>>> -Michael
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Monday, March 30, 2015 6:47 PM
>>>> To: Berg, Michael C
>>>> Cc: 'hotspot-compiler-dev at openjdk.java.net'
>>>> Subject: Re: RFR(L): 8074981 (Integer/FP scalar reduction
>>>> optimization )
>>>>
>>>> Here is updated webrev which addressed these and other issues:
>>>>
>>>> http://cr.openjdk.java.net/~kvn/8074981/webrev.01/
>>>>
>>>> Michael, I noticed that .ad file does not have matched instructions
>>>> for AddReductionVL. I assume it is because there is no avx3 yet. Right?
>>>>
>>>> Otherwise this look good to me. You need second review from an other
>>>> Reviewer since changes are big.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 3/25/15 4:00 PM, Vladimir Kozlov wrote:
>>>>> Please, ignore previous email. I screwed up Michael's email address.
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> I have few major concerns which you need to address.
>>>>>
>>>>> Adding new field _attr to Node class should be avoided - it will
>>>>> increase significantly memory footprint of graph and not be used
>>>>> frequently (vectorization is rare case).
>>>>>
>>>>> NodeFlags has only 16 bits and you used 2. And I don't see how
>>>>> Flag_is_loop_carried_dep is used.
>>>>>
>>>>> All above goes to one question: why mark_reductions() is executed in
>>>>> loopopts before each unroll and not during superword processing?
>>>>> If you do mark_reductions() in superword you can use VectorSet to
>>>>> indicate nodes which are reduction nodes.
>>>>>
>>>>> And the same for _attr. Why to store alignment in Node and not use
>>>>> _node_info in packset_eval()?
>>>>>
>>>>> Small note. Instead of:
>>>>> +        Node *defNode = n->in(len - 1);
>>>>> use:
>>>>> +        Node *defNode = n->in(LoopNode::LoopBackControl);
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>
>>>>> On 3/25/15 1:09 PM, Berg, Michael C wrote:
>>>>>> Christian/Nils: Any additional comments for the review, if not
>>>>>> Thursday I will upload the final webrev with the requested change.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>> *From:*Berg, Michael C
>>>>>> *Sent:* Thursday, March 19, 2015 5:55 PM
>>>>>> *To:* Christian Thalinger
>>>>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>>>>> *Subject:* RE: RFR(L): 8074981 (Integer/FP scalar reduction
>>>>>> optimization )
>>>>>>
>>>>>> Christian, yes we could rely on the base class definitions instead,
>>>>>> since we are not augmenting arguments.
>>>>>>
>>>>>> I will remove the file changes after the review concludes in case
>>>>>> there are any other modifications.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>> *From:*Christian Thalinger [mailto:christian.thalinger at oracle.com]
>>>>>> *Sent:* Thursday, March 19, 2015 3:52 PM
>>>>>> *To:* Berg, Michael C
>>>>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>>>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>>>> *Subject:* Re: RFR(L): 8074981 (Integer/FP scalar reduction
>>>>>> optimization )
>>>>>>
>>>>>>      On Mar 19, 2015, at 3:23 PM, Berg, Michael C
>>>>>>      <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>>
>>>>>> wrote:
>>>>>>
>>>>>>      I have updated the webrev contents after some feedback(with
>>>>>> no code
>>>>>>      changes), and Vladimir has placed it in location everyone can
>>>>>> access.
>>>>>>      Anyone should be able to apply the patch or review the code from
>>>>>>      this info:
>>>>>>
>>>>>>      http://cr.openjdk.java.net/~kvn/8074981/webrev.00/
>>>>>>
>>>>>> src/cpu/x86/vm/macroAssembler_x86.hpp:
>>>>>>
>>>>>> Why do we need these methods?  MacroAssembler extends Assembler.
>>>>>>
>>>>>>
>>>>>>      this replaces the JBS version of the webrev files for 8074981.
>>>>>>
>>>>>>      Thanks,
>>>>>>
>>>>>>      -Michael
>>>>>>
>>>>>>      -----Original Message-----
>>>>>>      From: Filipp Zhinkin [mailto:filipp.zhinkin at gmail.com]
>>>>>>      Sent: Thursday, March 19, 2015 12:55 AM
>>>>>>      To: Berg, Michael C
>>>>>>      Cc: hotspot-compiler-dev at openjdk.java.net
>>>>>>      <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>>>>      Subject: Re: RFR(L): 8074981 (Integer/FP scalar reduction
>>>>>> optimization )
>>>>>>
>>>>>>      Michael,
>>>>>>
>>>>>>      I've got it, thank you for explanation.
>>>>>>
>>>>>>      Regards,
>>>>>>      Filipp.
>>>>>>
>>>>>>      On Wed, Mar 18, 2015 at 5:53 PM, Berg, Michael C
>>>>>>      <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>>
>>>>>> wrote:
>>>>>>
>>>>>>          Filipp, for large iteration loops, if I am taking your
>>>>>> meaning
>>>>>>          correctly, you could not do that without splitting the
>>>>>> loop and
>>>>>>          re-architecting it into a loop nest pair to manage the
>>>>>> reduction
>>>>>>          components.  Seems like the overhead from that scenario
>>>>>> could
>>>>>>          create cost issues where reductions could actually hamper
>>>>>>          performance in small vector expressions.  Right now we never
>>>>>>          degrade and generally benefit with the implementation as it
>>>>>>          stands with the reductions stitched into the vector unit
>>>>>>          computations directly.
>>>>>>
>>>>>>          Regarding sub/div/etc:
>>>>>>          For now we have waived off on non-commuting operations
>>>>>> like sub
>>>>>>          and div, they would have to be very strictly managed via
>>>>>>          pack-set placement.  But the answer is yes we could
>>>>>> support them.
>>>>>>
>>>>>>          Thanks,
>>>>>>          -Michael
>>>>>>
>>>>>>          -----Original Message-----
>>>>>>          From: Filipp Zhinkin [mailto:filipp.zhinkin at gmail.com]
>>>>>>          Sent: Wednesday, March 18, 2015 2:20 AM
>>>>>>          To: Berg, Michael C
>>>>>>          Cc: hotspot-compiler-dev at openjdk.java.net
>>>>>>          <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>>>>          Subject: Re: RFR(L): 8074981 (Integer/FP scalar reduction
>>>>>>          optimization
>>>>>>          )
>>>>>>
>>>>>>          Hi Michael,
>>>>>>
>>>>>>          thank you for contributing such a great improvement!
>>>>>>
>>>>>>          Sorry if my question is silly, but I'm curious wouldn't
>>>>>> it be
>>>>>>          better to replace integer scalar reduction variable with a
>>>>>>          vector "Rv" in loop's prologue, compile loop's body as a
>>>>>> regular
>>>>>>          vectorized addition/multiplication, and reduce "Rv" to a
>>>>>> scalar
>>>>>>          in loop's epilogue?
>>>>>>
>>>>>>          Why you didn't add SubReduction* nodes?
>>>>>>
>>>>>>          Best regards,
>>>>>>          Filipp.
>>>>>>
>>>>>>
>>>>>>          On Tue, Mar 17, 2015 at 12:40 AM, Berg, Michael C
>>>>>>          <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>>
>>>>>> wrote:
>>>>>>
>>>>>>              Hi All,
>>>>>>
>>>>>>
>>>>>>
>>>>>>              We would like to contribute the Integer/FP scalar
>>>>>> reduction
>>>>>>              optimization from Intel.
>>>>>>
>>>>>>              The contribution is referenced as Bug ID 8074981 as a
>>>>>>              performance
>>>>>>              enhancement.
>>>>>>
>>>>>>
>>>>>>
>>>>>>              Please review this patch:
>>>>>>
>>>>>>              Bug-id: https://bugs.openjdk.java.net/browse/JDK-8074981
>>>>>>
>>>>>>              webrev:
>>>>>>
>>>>>> https://bugs.openjdk.java.net/secure/attachment/26101/webrev.zip
>>>>>>
>>>>>>
>>>>>>
>>>>>>              The optimization achieves as much as 2.3x on integer
>>>>>>              reductions and
>>>>>>              supports float and double precision optimizations
>>>>>>
>>>>>>              which also have significant optimization uplift an obey
>>>>>>              strict fp
>>>>>>              constraints.
>>>>>>
>>>>>>
>>>>>>
>>>>>>              Nils Eliasson has offered to sponsor this patch.
>>>>>>
>>>>>>
>>>>>>
>>>>>>              Thanks,
>>>>>>
>>>>>>
>>>>>>
>>>>>>              -Michael
>>>>>>
>>