Restrictions for lock coarsening?

Fri Jan 2 17:24:33 PST 2009

Hi,

I've created a microbenchmark for a fine-grained locking pattern to
see whether my workload can profit from HotSpot's lock coarsening.
The benchmark itself is single-threaded, so no contention occurs at
all; the real-world workload might have very little contention.

The benchmark loop itself looks like this:
  while (true) { for (int i = 0; i < 1000000; i++) { doItAndTakeTime(); } }
doItAndTakeTime() is synchronized in one run and guarded by a
ReentrantLock in another; for comparison I also benchmarked an
unsynchronized version.
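
A minimal sketch of such a benchmark is shown here to make the three variants concrete. It is not the original code: the class name LockBenchSketch, the counter increment standing in for the real work, the plain ReentrantLock used in place of the SunToolkit lock, and the fixed number of rounds (instead of while (true)) are all assumptions for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical reconstruction; every name here is illustrative only.
final class LockBenchSketch {
    private static final Object MONITOR = new Object();
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static long counter; // stand-in for the real work

    // Variant 1: no synchronization at all.
    static void doItPlain() {
        counter++;
    }

    // Variant 2: guarded by a monitor (synchronized block).
    static void doItMonitor() {
        synchronized (MONITOR) {
            counter++;
        }
    }

    // Variant 3: guarded by an explicit ReentrantLock.
    static void doItReentrantLock() {
        LOCK.lock();
        try {
            counter++;
        } finally {
            LOCK.unlock();
        }
    }

    // Each run method resets the counter, performs n calls,
    // and returns the resulting counter value.
    static long runPlain(int n) {
        counter = 0;
        for (int i = 0; i < n; i++) doItPlain();
        return counter;
    }

    static long runMonitor(int n) {
        counter = 0;
        for (int i = 0; i < n; i++) doItMonitor();
        return counter;
    }

    static long runReentrantLock(int n) {
        counter = 0;
        for (int i = 0; i < n; i++) doItReentrantLock();
        return counter;
    }

    public static void main(String[] args) {
        final int n = 1000000;
        // A few rounds so that later rounds measure JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            runPlain(n);
            long t1 = System.nanoTime();
            runReentrantLock(n);
            long t2 = System.nanoTime();
            runMonitor(n);
            long t3 = System.nanoTime();
            System.out.println("plain: " + (t1 - t0) / 1000000 + "ms"
                    + ", ReentrantLock: " + (t2 - t1) / 1000000 + "ms"
                    + ", monitor: " + (t3 - t2) / 1000000 + "ms");
        }
    }
}
```

A sketch like this could be started with something like "java -server LockBenchSketch"; the absolute numbers will differ by machine and JVM version.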

These were my results:
> Without synchronization:       48ms
> SunToolkit.awtLock/awtUnlock:  92ms
> Monitor:                      108ms

I hoped to see some improvement when using a monitor because of lock
coarsening, but with the server JVM on my Core 2 Duo, synchronization
using a monitor seems to be the slowest path.
I also didn't see any change when specifying -XX:-EliminateLocks or
when manually unrolling the loop.
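
To illustrate what manual unrolling means here, the sketch below puts several synchronized calls back to back in one loop body. The unroll factor of 4, the class name UnrollSketch, and the counter increment are assumptions for illustration, not the code that was measured.

```java
// Hypothetical illustration of manual loop unrolling around a
// synchronized method; names and the unroll factor are assumptions.
final class UnrollSketch {
    private static final Object MONITOR = new Object();
    private static long counter; // stand-in for the real work

    static void doIt() {
        synchronized (MONITOR) {
            counter++;
        }
    }

    // Original shape: one lock/unlock pair per loop iteration.
    static long runRolled(int n) {
        counter = 0;
        for (int i = 0; i < n; i++) {
            doIt();
        }
        return counter;
    }

    // Manually unrolled by 4: four adjacent synchronized regions on the
    // same monitor per iteration, plus a remainder loop for leftover calls.
    static long runUnrolledBy4(int n) {
        counter = 0;
        int i = 0;
        for (; i + 3 < n; i += 4) {
            doIt();
            doIt();
            doIt();
            doIt();
        }
        for (; i < n; i++) {
            doIt();
        }
        return counter;
    }
}
```

Both methods perform the same number of calls; only the shape of the loop body differs.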

PrintCompilation says that the method in question is 106 bytes, but I
guess that with all the called methods inlined it easily adds up to
~300 bytes of bytecode.

Any idea what could be the reason why lock coarsening does not seem to
be active for this simple case?

Thank you in advance, Clemens