JIT code generation for Long/Integer.compare
Vladimir Kozlov
vladimir.kozlov at oracle.com
Sat Sep 26 02:35:02 UTC 2015
By pattern matching I mean an ideal transformation in the ideal graph.
See, for example, is_x2logic() in cfgnode.cpp and the other transformations
for Phi nodes.
You will still need a new CmpI3 node and changes in the .ad files.
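To make the idea concrete, here is a rough Java-level sketch of the source shape such a transformation would be looking for. This is not HotSpot code, and the class and method names are made up for illustration. The compare() method has the same shape as Integer.compare in the JDK; as far as I understand it, after parsing the nested conditional shows up as a Phi merging the constants -1, 0 and 1, which is the kind of Phi pattern an ideal transformation could recognize. The sketch only checks that a test of the three-way result against zero is equivalent to the direct comparison.

```java
// Illustrative sketch only; names are hypothetical and this is not part of the patch.
class CompareFoldSketch {
    // Same shape as java.lang.Integer.compare: a three-way result of -1, 0 or 1.
    static int compare(int x, int y) {
        return (x < y) ? -1 : ((x == y) ? 0 : 1);
    }

    // Source shape to recognize: a three-way compare whose result is tested against zero.
    static boolean viaCompare(int a, int b) {
        return compare(a, b) < 0;
    }

    // What the folded form is equivalent to: a single signed comparison.
    static boolean direct(int a, int b) {
        return a < b;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                if (viaCompare(a, b) != direct(a, b)) {
                    throw new AssertionError("mismatch for " + a + ", " + b);
                }
            }
        }
        System.out.println("compare(a, b) < 0 agrees with a < b on all samples");
    }
}
```

The same reasoning applies to the other tests against zero (<=, ==, !=, >=, >), which is why a matcher done this way would cover similar hand-written code and not only calls to compare.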
Thanks,
Vladimir
On 9/26/15 3:03 AM, Ian Rogers wrote:
> Thanks for the feedback! To summarize, we have a patch here that
> implements Long/Integer.compare using intrinsics, and there is a
> counter-proposal that this would be better done by pattern matching. A
> limitation of the patch is that it only implements the CmpI3 intrinsic
> for x86_64; for CmpL3 it piggybacks on the bytecode implementation
> already present for all architectures. Implementing CmpI3 isn't
> challenging given it is just a minor tweak to CmpL3.
>
> Could you give more details on how you would expect the pattern matching
> approach to work? For example, a convenient place to do this would be in
> the ad file, but this would require porting. It could be done as a
> simplification, but there aren't matchers of comparable scope, or am I
> missing this? The matcher would also have to be sufficiently general to
> handle variants in the bool nodes and constants, which would necessitate
> multiple matchers if done thoroughly in the ad file. I don't disagree
> that pattern matching is a more generic solution and would avoid the
> CmpI3 node, but the scope of the patch required for that seems
> substantial, no?
>
> Thanks,
> Ian
>
>
> On Thu, Sep 24, 2015 at 6:30 PM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>
> I agree with Vitaly. It is better to use pattern matching (and
> generate CmpI3 and CmpL3 nodes) in subnode.cpp because the Java code is
> very simple and we may match other cases too.
>
> Also, the suggested changes are not complete since a CmpI3
> implementation was not added for the other platforms.
>
> Thanks,
> Vladimir
>
> On 9/25/15 7:54 AM, Vitaly Davidovich wrote:
>
> I must admit it's a bit strange seeing this implemented via
> intrinsic - is this not possible via normal JIT optimizations? There's
> nothing really "intrinsic" about the code. I get that it's easier
> implementation-wise to latch on to a well known method, but what about
> similar code used without calling compare?
>
> sent from my phone
>
> On Sep 24, 2015 7:11 PM, "Ian Rogers" <irogers at google.com> wrote:
>
> Agreed. The attached patch eliminates the cmpl3_flag enc_class and
> implements both cmpl3 and cmpi3 as you suggest.
>
> Thanks,
> Ian
>
> On Thu, Sep 24, 2015 at 12:51 PM, Christian Thalinger
> <christian.thalinger at oracle.com> wrote:
>
> One comment about the .ad change: please don’t introduce new
> enc_class methods; use ins_encode %{ %} and MacroAssembler
> instructions instead, like this one:
>
>   ins_encode %{
>     Register Rp = $p$$Register;
>     Register Rq = $q$$Register;
>     Register Ry = $y$$Register;
>     Label done;
>     __ cmpl(Rp, Rq);
>     __ jccb(Assembler::less, done);
>     __ xorl(Ry, Ry);
>     __ bind(done);
>   %}
>
> Should be less painful too :-)
>
> On Sep 24, 2015, at 8:45 AM, Ian Rogers <irogers at google.com> wrote:
>
> Below is a patch to add JIT code generation for Long/Integer.compare.
> It has been reviewed internally by rasbold at google.com. I'd like to
> open a bug for this, get it reviewed, etc. but I lack a JBS account.
> I'd appreciate help in getting this reviewed and merged.
>
> Thanks,
> Ian Rogers
>
> Support JIT code generation for Long/Integer.compare as intrinsics
> that fold with branches on their result.
>
> Introduce a CmpI3 ideal node mirroring the CmpL3 node, that implements
> Integer.compare. Allow this to fold with a CmpI node. Spot
> Long/Integer.compare as CmpL3 and CmpI3 nodes. Add a CmpI3
> implementation for x86-64. On a micro-benchmark loop of:
>   for (int i = 0; i < x.length; i++) {
>     if (compare(x[i], y[i]) < 0) {
>       count++;
>     }
>   }
> Int speed up averages 1.18x, long speed up averages 2.76x, over 30
> runs of arrays sized at 5,000,000 elements. This can be improved with
> work on instruction selection.
> Raw data:
> Int before: 23129us, 99.5% range: 19935us - 26046us
> Int after: 19557us, 99.5% range: 16972us - 26072us
> Long before: 26935us, 99.5% range: 25776us - 29323us
> Long after: 9749us, 99.5% range: 8850us - 11968us
>
> <cmpi3-jdk9-tdiff.patch>
>
>
>
>