8226721: Missing intrinsics for Math.ceil, floor, rint
B. Blaser
bsrbnd at gmail.com
Mon Sep 9 12:31:45 UTC 2019
Hi Jatin and Doug,
I think we could also add a stub routine and call it from the
interpreter as well as from C1.
I experimented with something like the code below for the fp min/max
intrinsics, which showed a large gain in both the standard and
reduction scenarios.
In 'stubGenerator_x86_64.cpp':
  address generate_libmMinD() {
    StubCodeMark mark(this, "StubRoutines", "libmMinD");
    address start = __ pc();

    const XMMRegister a = xmm0;
    const XMMRegister b = xmm1;
    const XMMRegister atmp = xmm2;
    const XMMRegister btmp = xmm3;
    const XMMRegister tmp = xmm4;
    const XMMRegister dst = xmm0;

    __ enter(); // required for proper stackwalking of RuntimeStub frame

    int vector_len = Assembler::AVX_128bit;
    __ blendvpd(atmp, a, b, a, vector_len);
    __ blendvpd(btmp, b, a, a, vector_len);
    __ vminsd(tmp, atmp, btmp);
    __ cmppd(btmp, atmp, atmp, Assembler::_false, vector_len);
    __ blendvpd(dst, tmp, atmp, btmp, vector_len);

    __ leave(); // required for proper stackwalking of RuntimeStub frame
    __ ret(0);

    return start;
  }
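For reference, the corner cases that a plain min instruction does not
cover on its own (NaN propagation and -0.0 ordering before +0.0, as
specified for Math.min) can be modelled in scalar C++. This is only a
sketch of the results the stub is expected to produce, not of its
instruction sequence; java_min_model and check_java_min_model are
hypothetical names that exist nowhere in HotSpot.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Scalar model of the Math.min(double, double) contract:
//  - if either argument is NaN, the result is NaN;
//  - -0.0 is treated as smaller than +0.0;
//  - otherwise the numerically smaller value is returned.
// Hypothetical helper for illustration, not a HotSpot function.
double java_min_model(double a, double b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    if (a == 0.0 && b == 0.0) {
        // Both are zeros (of either sign): prefer the negative one.
        return std::signbit(a) ? a : b;
    }
    return (a < b) ? a : b;
}

// Small self-check exercising the corner cases listed above.
void check_java_min_model() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(std::isnan(java_min_model(nan, 1.0)));
    assert(std::isnan(java_min_model(1.0, nan)));
    assert(std::signbit(java_min_model(0.0, -0.0)));
    assert(std::signbit(java_min_model(-0.0, 0.0)));
    assert(java_min_model(1.0, 2.0) == 1.0);
}
```

A model like this could serve as the oracle in a test that compares the
stub's output against the expected value for each corner case.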
In 'templateInterpreterGenerator_x86_64.cpp':
address TemplateInterpreterGenerator::generate_minD_entry(AbstractInterpreter::MethodKind kind) {
  address entry = __ pc();

  __ movdbl(xmm0, Address(rsp, wordSize));
  __ movdbl(xmm1, Address(rsp, 3 * wordSize));
  __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::minD())));

  __ pop(rax);
  __ mov(rsp, r13);
  __ jmp(rax);

  return entry;
}
And in 'c1_LIRGenerator_x86.cpp':
void LIRGenerator::do_MinDIntrinsic(Intrinsic* x) {
  LIRItem value(x->argument_at(0), this);
  value.set_destroys_register();

  LIR_Opr calc_result = rlock_result(x);
  LIR_Opr result_reg = result_register_for(x->type());

  CallingConvention* cc = NULL;

  LIRItem value1(x->argument_at(1), this);
  value1.set_destroys_register();

  BasicTypeList signature(2);
  signature.append(T_DOUBLE);
  signature.append(T_DOUBLE);
  cc = frame_map()->c_calling_convention(&signature);
  value.load_item_force(cc->at(0));
  value1.load_item_force(cc->at(1));

  __ call_runtime_leaf(StubRoutines::minD(), getThreadTemp(),
                       result_reg, cc->args());
  __ move(result_reg, calc_result);
}
If the comments are encouraging, I'll probably post an RFR for something
like this soon, along with Graal support for fp min/max, unless someone
else is already working on it.
Thanks,
Bernard
On Wed, 4 Sep 2019 at 14:18, Bhateja, Jatin <jatin.bhateja at intel.com> wrote:
>
> Hi Doug,
>
> Thanks for sharing the link.
> As suggested, I will open a follow-up issue for Graal support for these intrinsics and work on it.
>
> Regards,
> Jatin
>
> > -----Original Message-----
> > From: Doug Simon <doug.simon at oracle.com>
> > Sent: Tuesday, September 3, 2019 5:31 PM
> > To: Bhateja, Jatin <jatin.bhateja at intel.com>
> > Cc: hotspot-compiler-dev at openjdk.java.net
> > Subject: Re: 8226721: Missing intrinsics for Math.ceil, floor, rint
> >
> > Hi Jatin,
> >
> > It would be great to see these intrinsics applied to Graal as well, either in this
> > issue or a follow up issue.
> >
> > As an example of how to do this, you can look at
> > https://github.com/oracle/graal/pull/1171
> >
> > -Doug
> >
> > > On 3 Sep 2019, at 11:41, Bhateja, Jatin <jatin.bhateja at intel.com> wrote:
> > >
> > > Hi All,
> > >
> > > Please find a patch with the following changes:-
> > > 1) Intrinsification for Math.ceil/floor/rint.
> > > 2) Auto-vectorizer handling.
> > >
> > > JBS: https://bugs.openjdk.java.net/browse/JDK-8226721
> > > Webrev: http://cr.openjdk.java.net/~jbhateja/8226721/webrev.05
> > >
> > > Kindly review it.
> > >
> > > Best Regards,
> > > Jatin
>