RFR(L): 8074981 (Integer/FP scalar reduction optimization )
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Mar 25 23:00:44 UTC 2015
Please, ignore previous email. I screwed up Michael's email address.
Hi Michael,
I have few major concerns which you need to address.
Adding new field _attr to Node class should be avoided - it will
increase significantly memory footprint of graph and not be used
frequently (vectorization is rare case).
NodeFlags has only 16 bits and you used 2. And I don't see how
Flag_is_loop_carried_dep is used.
All above goes to one question: why mark_reductions() is executed in
loopopts before each unroll and not during superword processing?
If you do mark_reductions() in superword you can use VectorSet to
indicate nodes which are reduction nodes.
And the same for _attr. Why to store alignment in Node and not use
_node_info in packset_eval()?
Small note. Instead of:
+ Node *defNode = n->in(len - 1);
use:
+ Node *defNode = n->in(LoopNode::LoopBackControl);
Thanks,
Vladimir
On 3/25/15 1:09 PM, Berg, Michael C wrote:
> Christian/Nils: Any additional comments for the review, if not Thursday
> I will upload the final webrev with the requested change.
>
> Thanks,
>
> -Michael
>
> *From:*Berg, Michael C
> *Sent:* Thursday, March 19, 2015 5:55 PM
> *To:* Christian Thalinger
> *Cc:* hotspot-compiler-dev at openjdk.java.net
> *Subject:* RE: RFR(L): 8074981 (Integer/FP scalar reduction optimization )
>
> Christian, yes we could rely on the base class definitions instead,
> since we are not augmenting arguments.
>
> I will remove the file changes after the review concludes in case there
> are any other modifications.
>
> Thanks,
>
> -Michael
>
> *From:*Christian Thalinger [mailto:christian.thalinger at oracle.com]
> *Sent:* Thursday, March 19, 2015 3:52 PM
> *To:* Berg, Michael C
> *Cc:* hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> *Subject:* Re: RFR(L): 8074981 (Integer/FP scalar reduction optimization )
>
> On Mar 19, 2015, at 3:23 PM, Berg, Michael C
> <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>> wrote:
>
> I have updated the webrev contents after some feedback(with no code
> changes), and Vladimir has placed it in location everyone can access.
> Anyone should be able to apply the patch or review the code from
> this info:
>
> http://cr.openjdk.java.net/~kvn/8074981/webrev.00/
>
> src/cpu/x86/vm/macroAssembler_x86.hpp:
>
> Why do we need these methods? MacroAssembler extends Assembler.
>
>
> this replaces the JBS version of the webrev files for 8074981.
>
> Thanks,
>
> -Michael
>
> -----Original Message-----
> From: Filipp Zhinkin [mailto:filipp.zhinkin at gmail.com]
> Sent: Thursday, March 19, 2015 12:55 AM
> To: Berg, Michael C
> Cc: hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR(L): 8074981 (Integer/FP scalar reduction optimization )
>
> Michael,
>
> I've got it, thank you for explanation.
>
> Regards,
> Filipp.
>
> On Wed, Mar 18, 2015 at 5:53 PM, Berg, Michael C
> <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>> wrote:
>
> Filipp, for large iteration loops, if I am taking your meaning
> correctly, you could not do that without splitting the loop and
> re-architecting it into a loop nest pair to manage the reduction
> components. Seems like the overhead from that scenario could
> create cost issues where reductions could actually hamper
> performance in small vector expressions. Right now we never
> degrade and generally benefit with the implementation as it
> stands with the reductions stitched into the vector unit
> computations directly.
>
> Regarding sub/div/etc:
> For now we have waived off on non-commuting operations like sub
> and div, they would have to be very strictly managed via
> pack-set placement. But the answer is yes we could support them.
>
> Thanks,
> -Michael
>
> -----Original Message-----
> From: Filipp Zhinkin [mailto:filipp.zhinkin at gmail.com]
> Sent: Wednesday, March 18, 2015 2:20 AM
> To: Berg, Michael C
> Cc: hotspot-compiler-dev at openjdk.java.net
> <mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR(L): 8074981 (Integer/FP scalar reduction
> optimization
> )
>
> Hi Michael,
>
> thank you for contributing such a great improvement!
>
> Sorry if my question is silly, but I'm curious wouldn't it be
> better to replace integer scalar reduction variable with a
> vector "Rv" in loop's prologue, compile loop's body as a regular
> vectorized addition/multiplication, and reduce "Rv" to a scalar
> in loop's epilogue?
>
> Why you didn't add SubReduction* nodes?
>
> Best regards,
> Filipp.
>
>
> On Tue, Mar 17, 2015 at 12:40 AM, Berg, Michael C
> <michael.c.berg at intel.com <mailto:michael.c.berg at intel.com>> wrote:
>
> Hi All,
>
>
>
> We would like to contribute the Integer/FP scalar reduction
> optimization from Intel.
>
> The contribution is referenced as Bug ID 8074981 as a
> performance
> enhancement.
>
>
>
> Please review this patch:
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8074981
>
> webrev:
> https://bugs.openjdk.java.net/secure/attachment/26101/webrev.zip
>
>
>
> The optimization achieves as much as 2.3x on integer
> reductions and
> supports float and double precision optimizations
>
> which also have significant optimization uplift an obey
> strict fp
> constraints.
>
>
>
> Nils Eliasson has offered to sponsor this patch.
>
>
>
> Thanks,
>
>
>
> -Michael
>
More information about the hotspot-compiler-dev
mailing list