[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Feb 26 20:50:02 UTC 2019
Hi Bernard,
Unfortunately HotSpot does not collect profiling information per call site.
Currently calee's data is collected from all call sites. For example, you can have 2 calls with same *swapped* arguments
and you will get your balance == 0.
I would assume if you have multiply call sites in application the profiling data will be "flat" - balance value will be
near 0.
We did experiment few years ago to collect data per call site but we got negative result - no significant improvement in
code quality in cost of a lot more memory used for MethodData.
But your case (and cmove) is special since MDO is small for these methods. The only downside is to implement that you
will need a lot more complex changes in HotSpot.
https://bugs.openjdk.java.net/browse/JDK-8015416
Note, your code is correct in a sense that it use the same profiling data as it currently used for inlining these
methods. So I don't want to reject changes from the start.
The only simple solution I can think of is to use intrinsic only for hottest call site. Compare invocation_count with
caller's call site count to find if it is most used site which contributed the most profiling data.
Thanks,
Vladimir
On 2/26/19 3:16 AM, B. Blaser wrote:
> On Mon, 25 Feb 2019 at 19:42, Andrew Haley <aph at redhat.com> wrote:
>>
>> On 2/25/19 4:09 PM, B. Blaser wrote:
>>> Here it is:
>>>
>>> http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.05/
>>>
>>> I've just added the interesting use-case of inserting elements in a search tree.
>>> The insertion is absolutely unpredictable with random elements whereas
>>> fully predictable with sorted elements.
>>>
>>> Benchmarking this example is hard as it'd compute the average
>>> insertion time in a search tree which highly depends on its
>>> topography.
>>> That's why I decided to add it to the test-case doing some manual
>>> measurement using 'System.nanoTime()'.
>>>
>>> This experiment revealed at no surprise that the intrinsic is really
>>> worth being used only in the non-predictable case which the heuristic
>>> is for.
>>>
>>> Any Reviewer feedback and hopefully approval would be welcome
>>> (hotspot:tier1 is still OK on x86_64 xeon)!
>>
>> 6612 #ifdef AMD64
>>
>> Why? It's the same for every out-of-order processor with speculation.
>>
>> And the code *really* needs a comment to say what you're doing.
>
> Thanks for your comments, I've updated the patch accordingly here under.
>
> As it affects shared code, should I already push it to jdk/submit?
> Is the following procedure up-to-date?
>
> https://wiki.openjdk.java.net/display/Build/Submit+Repo
>
> Thanks,
> Bernard
>
>
> diff --git a/src/hotspot/share/opto/library_call.cpp
> b/src/hotspot/share/opto/library_call.cpp
> --- a/src/hotspot/share/opto/library_call.cpp
> +++ b/src/hotspot/share/opto/library_call.cpp
> @@ -6609,6 +6609,31 @@
>
> //------------------------------inline_fp_min_max------------------------------
> bool LibraryCallKit::inline_fp_min_max(vmIntrinsics::ID id) {
> + // The intrinsic should be used only when the API branches aren't
> predictable,
> + // the last one performing the most important comparison. The
> following heuristic
> + // uses the branch statistics to eventually bail out if necessary.
> +
> + ciMethod *c = callee();
> + ciMethodData *d = c->method_data();
> +
> + int taken = 0, not_taken = 0;
> + int invocations = d->invocation_count();
> +
> + if (invocations > 0) {
> + for (ciProfileData *p = d->first_data(); d->is_valid(p); p =
> d->next_data(p)) {
> + if (p->is_BranchData()) {
> + taken = ((ciBranchData*)p)->taken();
> + not_taken = ((ciBranchData*)p)->not_taken();
> + }
> + }
> +
> + double balance = (((double)taken) - ((double)not_taken)) /
> ((double)invocations);
> + balance = balance < 0 ? -balance : balance;
> + if ( balance > 0.2 ) {
> + return false;
> + }
> + }
> +
> Node *a = NULL;
> Node *b = NULL;
> Node *n = NULL;
>
More information about the hotspot-compiler-dev
mailing list