8226721: Missing intrinsics for Math.ceil, floor, rint
Bhateja, Jatin
jatin.bhateja at intel.com
Wed Sep 11 04:47:41 UTC 2019
Hi Bernard,
Your suggestion to add corresponding support in C1 and Template Interpreter looks good.
Though, instead of going the stub route, which incurs an additional runtime penalty, can't we emit the instruction sequence directly?
I will add the support for rounding primitives in C1 and Interpreter in a separate patch.
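For reference, here is a minimal Java sketch of the semantics that any directly emitted sequence for these rounding primitives would have to preserve. The class name, the `RMODE_*` constants, and the `round` helper are illustrative assumptions and are not taken from the patch; the constant values simply mirror the rounding-control encoding of the SSE4.1 ROUNDSD immediate (0 = nearest-even, 1 = toward negative infinity, 2 = toward positive infinity).

```java
final class RoundingSketch {
    // Hypothetical names; values follow bits 1:0 of the ROUNDSD imm8 operand.
    public static final int RMODE_RINT = 0;   // round to nearest, ties to even
    public static final int RMODE_FLOOR = 1;  // round toward negative infinity
    public static final int RMODE_CEIL = 2;   // round toward positive infinity

    // Dispatches to the library method that each rounding mode must match.
    public static double round(double x, int rmode) {
        switch (rmode) {
            case RMODE_RINT:  return Math.rint(x);
            case RMODE_FLOOR: return Math.floor(x);
            case RMODE_CEIL:  return Math.ceil(x);
            default:
                throw new IllegalArgumentException("unknown rounding mode: " + rmode);
        }
    }

    public static void main(String[] args) {
        System.out.println(round(2.5, RMODE_RINT));        // 2.0 (tie goes to even)
        System.out.println(round(3.5, RMODE_RINT));        // 4.0
        System.out.println(round(-0.5, RMODE_FLOOR));      // -1.0
        System.out.println(round(-0.5, RMODE_CEIL));       // -0.0 (sign is preserved)
        System.out.println(round(Double.NaN, RMODE_CEIL)); // NaN
    }
}
```

The corner cases printed in `main` (ties, negative zero, NaN) are the ones an interpreter or C1 implementation would most likely need to be checked against.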
Kindly let me know if there are any comments on the C2-side changes pertinent to this patch.
Thanks,
Jatin
> -----Original Message-----
> From: B. Blaser <bsrbnd at gmail.com>
> Sent: Monday, September 9, 2019 6:02 PM
> To: Bhateja, Jatin <jatin.bhateja at intel.com>
> Cc: Doug Simon <doug.simon at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: 8226721: Missing intrinsics for Math.ceil, floor, rint
>
> Hi Jatin and Doug,
>
> I think we could probably also add a stub routine and call it from the
> interpreter along with C1?
> I experimented with something like below for fp min/max intrinsics which
> showed a great gain in both standard and reduction scenarios.
>
> In 'stubGenerator_x86_64.cpp':
>
> address generate_libmMinD() {
>   StubCodeMark mark(this, "StubRoutines", "libmMinD");
>
>   address start = __ pc();
>
>   const XMMRegister a = xmm0;
>   const XMMRegister b = xmm1;
>   const XMMRegister atmp = xmm2;
>   const XMMRegister btmp = xmm3;
>   const XMMRegister tmp = xmm4;
>   const XMMRegister dst = xmm0;
>
>   __ enter(); // required for proper stackwalking of RuntimeStub frame
>
>   int vector_len = Assembler::AVX_128bit;
>   __ blendvpd(atmp, a, b, a, vector_len);
>   __ blendvpd(btmp, b, a, a, vector_len);
>   __ vminsd(tmp, atmp, btmp);
>   __ cmppd(btmp, atmp, atmp, Assembler::_false, vector_len);
>   __ blendvpd(dst, tmp, atmp, btmp, vector_len);
>
>   __ leave(); // required for proper stackwalking of RuntimeStub frame
>   __ ret(0);
>
>   return start;
> }
>
> In 'templateInterpreterGenerator_x86_64.cpp':
>
> address TemplateInterpreterGenerator::generate_minD_entry(AbstractInterpreter::MethodKind kind) {
>   address entry = __ pc();
>
>   __ movdbl(xmm0, Address(rsp, wordSize));
>   __ movdbl(xmm1, Address(rsp, 3 * wordSize));
>
>   __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::minD())));
>
>   __ pop(rax);
>   __ mov(rsp, r13);
>   __ jmp(rax);
>
>   return entry;
> }
>
> And in 'c1_LIRGenerator_x86.cpp':
>
> void LIRGenerator::do_MinDIntrinsic(Intrinsic* x) {
>   LIRItem value(x->argument_at(0), this);
>   value.set_destroys_register();
>
>   LIR_Opr calc_result = rlock_result(x);
>   LIR_Opr result_reg = result_register_for(x->type());
>
>   CallingConvention* cc = NULL;
>
>   LIRItem value1(x->argument_at(1), this);
>   value1.set_destroys_register();
>
>   BasicTypeList signature(2);
>   signature.append(T_DOUBLE);
>   signature.append(T_DOUBLE);
>   cc = frame_map()->c_calling_convention(&signature);
>   value.load_item_force(cc->at(0));
>   value1.load_item_force(cc->at(1));
>
>   __ call_runtime_leaf(StubRoutines::minD(), getThreadTemp(), result_reg, cc->args());
>
>   __ move(result_reg, calc_result);
> }
>
> If the comments are encouraging, I'll probably post an RFR for something like
> this soon, along with Graal support for fp min/max, unless someone else is
> already working on it?
>
> Thanks,
> Bernard
>
> On Wed, 4 Sep 2019 at 14:18, Bhateja, Jatin <jatin.bhateja at intel.com> wrote:
> >
> > Hi Doug,
> >
> > Thanks for sharing the link.
> > As suggested, I will open a follow-up issue for Graal support for these
> > intrinsics and work on it.
> >
> > Regards,
> > Jatin
> >
> > > -----Original Message-----
> > > From: Doug Simon <doug.simon at oracle.com>
> > > Sent: Tuesday, September 3, 2019 5:31 PM
> > > To: Bhateja, Jatin <jatin.bhateja at intel.com>
> > > Cc: hotspot-compiler-dev at openjdk.java.net
> > > Subject: Re: 8226721: Missing intrinsics for Math.ceil, floor, rint
> > >
> > > Hi Jatin,
> > >
> > > It would be great to see these intrinsics applied to Graal as well,
> > > either in this issue or a follow up issue.
> > >
> > > As an example of how to do this, you can look at
> > > https://github.com/oracle/graal/pull/1171
> > >
> > > -Doug
> > >
> > > > > On 3 Sep 2019, at 11:41, Bhateja, Jatin <jatin.bhateja at intel.com> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > Please find a patch with the following changes:-
> > > > 1) Intrinsification of Math.ceil/floor/rint.
> > > > 2) Auto-vectorizer handling.
> > > >
> > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8226721
> > > > Webrev: http://cr.openjdk.java.net/~jbhateja/8226721/webrev.05
> > > >
> > > > Kindly review it.
> > > >
> > > > Best Regards,
> > > > Jatin
> >
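
As an illustration of the auto-vectorizer handling listed in the original patch description, the following is a minimal sketch of the kind of loop such handling is aimed at: a simple counted loop over arrays with no cross-iteration dependency, where each element is rounded independently. The class and method names are hypothetical, and whether C2 actually vectorizes these loops depends on the CPU features and JVM flags in use.

```java
import java.util.Arrays;

final class VectorizableRoundingSketch {
    // Each iteration touches only src[i] and dst[i], so iterations are independent.
    public static void floorAll(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.floor(src[i]);
        }
    }

    public static void ceilAll(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.ceil(src[i]);
        }
    }

    public static void rintAll(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.rint(src[i]);
        }
    }

    public static void main(String[] args) {
        double[] src = {-1.5, -0.5, 0.5, 1.5, 2.5};
        double[] dst = new double[src.length];
        floorAll(src, dst);
        System.out.println(Arrays.toString(dst)); // [-2.0, -1.0, 0.0, 1.0, 2.0]
    }
}
```

A scalar and a vectorized version of these loops must produce identical results, so a loop like this is also a convenient correctness check for the intrinsics.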