JNI-performance - Is it really that fast?

Tue Mar 25 23:22:40 UTC 2008

On 2008-3-25, at 7:13 PM, Clemens Eisserer wrote:
> Hi Dave,
>
> Thanks a lot for answering in such detail. Congratulations on the
> BiasedLocking work, it's really great to see such innovative features in
> the JVM :)

Mark Moir, Bill Scherer and I wrote something about it early on.  Ken
Russell and Dave Detlefs are responsible for the current
implementation.  You can find a link to their OOPSLA paper at
http://blogs.sun.com/dave/entry/biased_locking_in_hotspot

Dave

>
>
>> good choice when "synchronized" doesn't fit the bill, such as when you
>> might need timed waits, trylock, hand-over-hand "coupled" locking,
>> etc.   ReentrantLock also tends to be used in situations where the
>> programmer is sure multiple threads are actively coordinating their
>> operation, meaning that ReentrantLock would benefit little from biased
>> locking.  For most synchronization -- contended or uncontended --
>> you're better off with synchronized as you get the benefits of biased
>> locking, adaptive spinning, potential lock elision via escape
>> analysis, and in the future, hardware transactional lock elision
>> (http://blogs.sun.com/dave/entry/rock_style_transactional_memory_lock).
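
To make the distinction above concrete, here is a minimal sketch of the two styles side by side. The class and method names (LockChoiceSketch, tryIncrement, incrementWithinMillis) are hypothetical and do not come from the JDK or from this thread; only ReentrantLock.tryLock() and tryLock(long, TimeUnit) are real API. Plain mutual exclusion uses synchronized, while the non-blocking and timed acquisitions need ReentrantLock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration class; not taken from the JDK sources.
class LockChoiceSketch {
    private final Object monitor = new Object();
    private final ReentrantLock lock = new ReentrantLock();
    private long monitorCount; // guarded by monitor
    private long lockCount;    // guarded by lock

    // Plain mutual exclusion: synchronized is sufficient here.
    void incrementSynchronized() {
        synchronized (monitor) {
            monitorCount++;
        }
    }

    // Non-blocking attempt: returns false instead of waiting for the lock.
    boolean tryIncrement() {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            lockCount++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Timed wait: gives up after the given number of milliseconds.
    boolean incrementWithinMillis(long millis) {
        try {
            if (!lock.tryLock(millis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        try {
            lockCount++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    long monitorCount() {
        synchronized (monitor) {
            return monitorCount;
        }
    }

    long lockCount() {
        lock.lock();
        try {
            return lockCount;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockChoiceSketch s = new LockChoiceSketch();
        s.incrementSynchronized();
        System.out.println("tryLock acquired: " + s.tryIncrement());
        System.out.println("timed tryLock acquired: " + s.incrementWithinMillis(10));
        System.out.println("counts: " + s.monitorCount() + " / " + s.lockCount());
    }
}
```

Neither tryIncrement nor incrementWithinMillis can be expressed with a synchronized block, which is the kind of case where ReentrantLock is the natural choice.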
>
> I was asking because I did some benchmarking and (on my dual-core
> machine, with an obscure microbenchmark) grabbing the AWTLock for
> a 1x1 rectangle takes almost as much time as the whole Xlib processing
> plus JNI overhead.
>
> The code looks like:
> SunToolkit.awtLock();
> long xgc = validate(sg2d); //simple, pure Java method
> XFillRect(sg2d.surfaceData.getNativeOps(), .....); // native method
> SunToolkit.awtUnlock();
>
> 10 million 1x1 rects:
> 600 ms with the native method commented out
> 850 ms with locking commented out
> 1400 ms with locking + native method
>
> The numbers include the whole code path from Graphics.fillRect() up to
> X11Renderer.fillRect.
>
> As you can see, locking (at least on my machine) is almost as expensive
> as the JNI downcall and the real work together. I used the
> server compiler.
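
A measurement of this kind can be roughly approximated without AWT. The sketch below is not the benchmark described above: the class LockOverheadSketch, the work() stand-in for validate() plus the native call, and the iteration counts are all assumptions made for illustration. It times the same loop with no lock, with a ReentrantLock, and with synchronized, all uncontended and on a single thread.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.IntToLongFunction;

// Hypothetical micro-benchmark sketch; results depend heavily on JVM, flags and hardware.
class LockOverheadSketch {
    // Static fields so the lock objects escape and are less likely to be optimized away.
    static final ReentrantLock LOCK = new ReentrantLock();
    static final Object MONITOR = new Object();
    static volatile long sink; // keeps the JIT from discarding the loop results

    // Stand-in for the real work (validate() and the native call in the original code).
    static long work(long i) {
        return i ^ (i >>> 3);
    }

    static long runUnlocked(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += work(i);
        }
        return acc;
    }

    static long runWithReentrantLock(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            LOCK.lock();
            try {
                acc += work(i);
            } finally {
                LOCK.unlock();
            }
        }
        return acc;
    }

    static long runWithSynchronized(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            synchronized (MONITOR) {
                acc += work(i);
            }
        }
        return acc;
    }

    // Runs the body once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(IntToLongFunction body, int iterations) {
        long start = System.nanoTime();
        sink = body.applyAsLong(iterations);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        // Warm-up passes so the JIT compiles the loops before timing.
        for (int i = 0; i < 5; i++) {
            timeMillis(LockOverheadSketch::runUnlocked, n);
            timeMillis(LockOverheadSketch::runWithReentrantLock, n);
            timeMillis(LockOverheadSketch::runWithSynchronized, n);
        }
        System.out.println("no lock:       " + timeMillis(LockOverheadSketch::runUnlocked, n) + " ms");
        System.out.println("ReentrantLock: " + timeMillis(LockOverheadSketch::runWithReentrantLock, n) + " ms");
        System.out.println("synchronized:  " + timeMillis(LockOverheadSketch::runWithSynchronized, n) + " ms");
    }
}
```

Hand-rolled loops like this are easily distorted by JIT optimizations such as lock coarsening, so the numbers are only indicative; a harness such as JMH is usually a safer way to measure lock overhead.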
>
> The AWTLock was a Java monitor until JDK 5 (not 100% sure), but it
> suffered from contention because it was also used from native code and
> sometimes from multiple threads (though I guess it was not heavily
> contended in most cases).
> In JDK 6 it was replaced with a ReentrantLock; some features like
> tryLock() were used to implement the new OpenGL pipeline, and
> performance also improved.
>
>> In your case if the lock is ever shared -- that is, locked by multiple
>> threads during its lifetime -- then biased locking probably won't
>> provide the latency reduction benefit you're after.   The object will
>> likely become unbiased at some point.  I suspect that sharing will
>> ultimately occur in your case, but be infrequent, correct?
>
> Exactly, the most likely scenario is that there is one rendering thread
> which does millions of lock operations, plus a few calls from native code
> (currently they upcall from C to lock the ReentrantLock).
> It could also happen that there are two or more active rendering
> threads at the same time, but this is not really common and a fallback
> to unbiased would be totally ok.
> Wouldn't a BiasedLock be worth implementing, maybe with a way to
> control how quickly or how easily the lock becomes unbiased?
>
> However, this really doesn't have much priority for me ... I really should
> care about other things ... I somehow got trapped in this when deciding
> on the design of my XRender-Java2d pipeline. Sorry for all the traffic...
>
> Regards, Clemens