Intrinsics for Math.min and max

Vitaly Davidovich vitalyd at gmail.com
Wed Apr 2 21:42:49 UTC 2014


A compiler's branch model will always be far simpler than the actual
hardware; it's a cat-and-mouse game keeping up with h/w changes, different
vendors, murky details, constraints on how much memory (and, to a lesser
degree, CPU) the compiler can spend recording things, etc.  Sure, you can
improve it, but if all you're really tracking is branch frequencies, it
seems hard to get beyond the highly likely/unlikely cases.

I'd be surprised if pipelines get longer unless branch prediction improves
significantly alongside; e.g. Haswell's pipeline is about the same length
as its Ivy Bridge and Sandy Bridge predecessors.  Additional execution
units are definitely coming; that's already happening with current hardware.

Bit twiddling vs cmov is hard to guess at.  With register renaming (and
register files seem to be growing) and more execution units, it's possible
that bit twiddling can end up faster, but sure, it's not a rule.
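To make the trade-off concrete, here's a minimal sketch (my illustration, not HotSpot's actual intrinsic) of a branch-free integer min built from the classic sign-mask trick:

```java
public class BranchlessMin {
    // Branch-free min for ints. (a - b) >> 31 is all ones when a < b and
    // all zeros otherwise, so the mask selects either (a - b) or 0.
    // Caveat: only valid when (a - b) does not overflow.
    static int min(int a, int b) {
        return b + ((a - b) & ((a - b) >> 31));
    }

    public static void main(String[] args) {
        System.out.println(min(3, 7));   // 3
        System.out.println(min(-2, 5));  // -2
    }
}
```

Whether this beats a cmov (or a well-predicted branch) on a given chip is exactly the kind of thing that's hard to guess at in advance.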

In some cases one may be able to, e.g., pre-process the data somehow so
that branches are either eliminated or better predicted.  Of course there
will always be cases where that's not possible.  Writing an optimizer is a
hard problem, that's for sure :).
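As a hypothetical illustration of such pre-processing (the threshold and data here are made up), sorting the input turns a data-dependent branch into two long, trivially predicted runs:

```java
import java.util.Arrays;
import java.util.Random;

public class SortedBranch {
    // Summing only elements >= 128: on many CPUs this loop runs markedly
    // faster over sorted data, because the branch becomes a long run of
    // not-taken followed by a long run of taken -- easy for the predictor.
    static long sumAboveThreshold(int[] data) {
        long sum = 0;
        for (int v : data) {
            if (v >= 128) {  // hard to predict on random data
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1 << 16, 0, 256).toArray();
        long unsorted = sumAboveThreshold(data);
        Arrays.sort(data);               // the pre-processing step
        long sorted = sumAboveThreshold(data);
        System.out.println(unsorted == sorted);  // same sum either way
    }
}
```

The result is identical; only the branch behavior (and hence the running time) changes.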

Sent from my phone
On Apr 2, 2014 4:53 PM, "Martin Grajcar" <maaartinus at gmail.com> wrote:

> On Wed, Apr 2, 2014 at 4:50 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>> I don't think it's reasonably possible to model hardware branch
>> prediction in software.  As you say, details are murky, the hardware
>> changes and advances, etc.
>>
> Actually, every time the compiler decides between cmov and a branch, it
> does use a branch prediction model. It's just that the currently employed
> model is ignorant of everything but the branching probability. Capturing
> all the details is somewhere between too laborious and impossible, but
> improving it a bit should be doable.
>
>> So clearly there will be cases where cmov will yield a speedup over
>> branches.  The question is whether the compiler can ascertain that with
>> very good precision; otherwise the code is penalized and doesn't take
>> advantage of any advances in hardware branch prediction.
>>
> That's true, but when improved HW comes, the JIT can be improved as well.
> And improved HW may also mean a bigger misprediction penalty (due to
> longer pipelines, or to more work wasted as more execution units are
> provided).
>
>> By removing branches in Java I meant rewriting the code/algorithm to not
>> branch (or to branch more predictably); this is akin to what people do in
>> other languages/domains (e.g. C or C++ on hardware with weak predictors).
>> Granted, the level of control in Java is limited to some degree, but
>> certainly avoiding branches or making them more predictable is feasible?
>>
> Branches do get avoided in real code, e.g. in
> http://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/math/IntMath.java?r=90a75c93246f0a403fc20275cb80d60e79978933#375.
> But in many cases the bit fiddling can't be as efficient as cmov.
>
> I can't imagine how to make a branch more predictable. Usually you have no
> control over the data processed, right?
>
>