optimizing acmp in L-World

Sat Sep 1 17:26:55 UTC 2018

Hi Tobias,

I've checked performance here.

Cool, I like that patch. It provides better performance now.

I have to say that the always-locked mark word pattern with the old acmp 
provides performance quite close to the before-Valhalla world.
It is even better than the klass bit test (I didn't expect that and will 
look into the reason later).
You can find all results here: 
http://cr.openjdk.java.net/~skuksenko/valhalla/acmp1/acmp_charts_0831.png

On 08/31/2018 06:04 AM, Tobias Hartmann wrote:
>>> I've found yet another benchmarking pitfall here. Typically JMH executes all subbenchmarks in
>>> separate VMs, which means that when measuring o1==o1 we have only that branch in the profile. If
>>> you want to measure full acmp performance (full meaning that all acmp branches are in the profile),
>>> you have to use yet another JMH option, "-wm BULK", which provides bulk warmup of all combinations
>>> before measurement.
> Yes, but that depends on what you want to measure. We should also have benchmarks for the case where
> C2 cuts off branches due to profile information suggesting that these are never taken.
>
>
If you run that benchmark ('Trivial') with the default settings or with 
"-wm INDI", you get the case where only one branch is exercised and all 
others are cut off by C2.
With "-wm BULK", still only one branch is exercised during measurement, 
but the other branches are NOT cut off by C2, because they were executed 
during warmup.
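
To make the difference concrete, here is a rough plain-Java sketch of the 
two warmup styles. It is not JMH and not the actual 'Trivial' benchmark; 
the class and method names (AcmpWarmupSketch, same, run) are made up for 
illustration, and a hand-rolled timing loop like this is only good for 
showing the idea, not for real numbers. The measured phase is identical in 
both modes (only o1==o1); the only thing that changes is which comparisons 
the JIT has seen in the profile beforehand.

```java
// Hypothetical illustration only: mimics "-wm INDI" vs "-wm BULK" by hand.
final class AcmpWarmupSketch {

    // The comparison under test; this `==` is the acmp being discussed.
    static boolean same(Object a, Object b) {
        return a == b;
    }

    // Cycles through the given operand pairs and counts how many compare equal.
    static int run(Object[][] pairs, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            Object[] p = pairs[i % pairs.length];
            if (same(p[0], p[1])) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Object o1 = new Object();
        Object o2 = new Object();

        // The single case that is measured: o1 == o1.
        Object[][] onlyEqual = { { o1, o1 } };
        // All combinations, used only for the BULK-like warmup.
        Object[][] allCases = { { o1, o1 }, { o1, o2 }, { o1, null }, { null, null } };

        boolean bulk = args.length > 0 && args[0].equals("BULK");

        // INDI-like: warm up with the measured case only, so the profile
        // contains a single branch. BULK-like: warm up with every case, so
        // no branch looks "never taken" to the compiler.
        run(bulk ? allCases : onlyEqual, 1_000_000);

        long start = System.nanoTime();
        int hits = run(onlyEqual, 50_000_000);
        long elapsedNs = System.nanoTime() - start;

        System.out.println((bulk ? "BULK" : "INDI") + " hits=" + hits + " ns=" + elapsedNs);
    }
}
```

Running it once with no argument and once with BULK as the argument gives 
the two profiles described above. For the real measurements I still use JMH 
with the "-wm" option.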