[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request

B. Blaser bsrbnd at gmail.com
Tue Feb 19 15:16:16 UTC 2019


On Tue, 19 Feb 2019 at 11:03, Andrew Dinn <adinn at redhat.com> wrote:
>
> Andrew is suggesting a different approach altogether. Instead of either
> using the intrinsic all the time or not using it all the time he is
> suggesting using it when the branch prediction data suggests it will be
> effective.

So, I did some experiments and it seems that the most important branch
is the last one.
If the data is well balanced, as in the initial benchmark I posted,
branch prediction is poor and the intrinsic is 2x faster; in all other
cases it would be better to avoid it:

bool LibraryCallKit::inline_fp_min_max(vmIntrinsics::ID id) {
  printf("##### inline_fp_min_max\n");  // debug trace

  ciMethod *c = callee();
  ciMethodData *d = c->method_data();

  int index = 0;
  int taken = 0;
  int not_taken = 0;
  int invocations = d->invocation_count();

  if (invocations > 0) {
    // Walk the callee's profile. After the loop, taken/not_taken hold the
    // counts of the last branch, which is the one that matters here.
    for (ciProfileData *p = d->first_data(); d->is_valid(p); p = d->next_data(p)) {
      if (p->is_BranchData()) {
        index++;
        taken = ((ciBranchData*)p)->taken();
        not_taken = ((ciBranchData*)p)->not_taken();
        printf("##### calls %dx: jmp %d taken %dx not %dx\n", invocations, index, taken, not_taken);
      }
    }

    // Normalized |taken - not_taken|: close to 0 means a well-balanced,
    // hard-to-predict branch.
    double balance = (((double)taken) - ((double)not_taken)) / ((double)invocations);
    balance = balance < 0 ? -balance : balance;
    if (balance > 0.2) {
      // The branch is biased enough to be predicted well: skip the intrinsic.
      printf("##### good predictions: %f\n", balance);
      return false;
    }
  }
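
For experimenting with the threshold outside HotSpot, the decision above can be isolated in a small standalone sketch. The names branch_balance, use_intrinsic and kBalanceThreshold are hypothetical and not part of the patch; only the formula and the 0.2 threshold are taken from the code above, and treating a missing profile as "use the intrinsic" is an assumption that mirrors the fall-through when invocations is 0.

```cpp
#include <cmath>

// Hypothetical helper (not a HotSpot API): normalized |taken - not_taken|.
// A value near 0 means the branch goes both ways about equally often.
static double branch_balance(int taken, int not_taken, int invocations) {
  if (invocations <= 0) {
    return 0.0;  // assumption: no profile data is treated as balanced
  }
  return std::fabs((double)taken - (double)not_taken) / (double)invocations;
}

// Threshold taken from the experiment above.
static const double kBalanceThreshold = 0.2;

// Returns true when the last branch looks poorly predictable, i.e. when
// the intrinsic is expected to pay off.
static bool use_intrinsic(int taken, int not_taken, int invocations) {
  return branch_balance(taken, not_taken, invocations) <= kBalanceThreshold;
}
```

With 1000 invocations, a 500/500 split gives a balance of 0 and keeps the intrinsic, while a 900/100 split gives 0.8 and falls back to the branchy code.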

Comments?

Thanks,
Bernard


More information about the hotspot-compiler-dev mailing list