optimizing acmp in L-World
sergey.kuksenko at oracle.com
Sat Sep 1 17:26:55 UTC 2018
I've checked performance here.
Cool, I like that patch. It provides better performance now.
I have to say that the always-locked mark word pattern with the old acmp
provides performance that is quite close to the before-Valhalla world.
It is even better than the klass bit test (I didn't expect that and will
look into the reason later).
You can find all results here:
On 08/31/2018 06:04 AM, Tobias Hartmann wrote:
>>> I've found yet another benchmarking pitfall here. Typically JMH executes all subbenchmarks in
>>> separate VMs, which means that when measuring o1==o1 we have only that branch in the profile. If
>>> you want to measure full acmp performance (full meaning that all acmp branches are in the profile),
>>> you have to use yet another JMH option, "-wm BULK", which provides bulk warmup of all combinations
>>> before measurement.
> Yes, but that depends on what you want to measure. We should also have benchmarks for the case where
> C2 cuts off branches due to profile information suggesting that these are never taken.
If you execute that benchmark ('Trivial') by default or with "-wm INDI",
you'll get the case where only one branch is executed and all the others
are cut off by C2.
If you use "-wm BULK", still only one branch is executed during
measurement, but the other branches are NOT cut off by C2, because they
were executed during warmup.
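
To make the difference concrete, here is a rough hand-rolled sketch of the two warmup styles. It is not JMH and not the 'Trivial' benchmark itself; the class name WarmupModes and all method names are made up for illustration, and the timing it prints is not a reliable measurement. It only shows which comparisons the profile gets to see before the measured loop starts.

```java
public class WarmupModes {
    // Two hypothetical operands: A == A is the "same reference" case,
    // A == B is the "different references" case.
    static final Object A = new Object();
    static final Object B = new Object();

    // The comparison under test; == on references compiles to an
    // if_acmpeq/if_acmpne bytecode.
    static boolean same(Object x, Object y) {
        return x == y;
    }

    // Runs the comparison `iterations` times, returns how many were true.
    static int run(Object x, Object y, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (same(x, y)) {
                hits++;
            }
        }
        return hits;
    }

    // INDI-like warmup: only the case that will be measured is executed,
    // so the profile contains a single outcome of the comparison.
    static int warmupIndividual(int iterations) {
        return run(A, A, iterations);
    }

    // BULK-like warmup: every case is executed before measurement,
    // so both outcomes of the comparison show up in the profile.
    static int warmupBulk(int iterations) {
        return run(A, A, iterations) + run(A, B, iterations);
    }

    public static void main(String[] args) {
        // Pass "BULK" as the first argument to warm up all cases;
        // anything else (or nothing) warms up only the measured case.
        boolean bulk = args.length > 0 && args[0].equals("BULK");
        int warm = bulk ? warmupBulk(200_000) : warmupIndividual(200_000);

        long start = System.nanoTime();
        int hits = run(A, A, 50_000_000); // measured case: o1 == o1
        long elapsedNs = System.nanoTime() - start;

        System.out.println((bulk ? "BULK" : "INDI")
                + " warmupHits=" + warm
                + " hits=" + hits
                + " elapsedNs=" + elapsedNs);
    }
}
```

Each mode should be run in its own VM (once with no argument, once with BULK), the same way JMH forks per subbenchmark. For real numbers, use JMH with the "-wm" option as described above.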