JNI-performance - Is it really that fast?

Wed Mar 26 03:26:40 UTC 2008

>

Following up on the other part of your email ...

> I was asking because I did some benchmarking and (on my dual-core
> machine, with an obscure microbenchmark) grabbing the AWTLock for
> a 1x1 rectangle takes almost as much time as the whole Xlib processing
> plus the JNI overhead.
>
> The code looks like:
> SunToolkit.awtLock();
> long xgc = validate(sg2d); //simple, pure Java method
> XFillRect(sg2d.surfaceData.getNativeOps(), .....); // native method
> SunToolkit.awtUnlock();
>
> 10 million 1x1 rects:
> 600ms native method commented out
> 850ms locking commented out.
> 1400ms locking+native method
>
> The numbers include the whole code path from Graphics.fillRect() up to
> X11Renderer.fillRect.
>
> As you can see, locking (at least on my machine) is almost as expensive
> as the JNI downcall and the real work together. I used the
> server compiler.
>
> The AWTLock was a Java monitor until JDK 5 (not 100% sure), but it
> suffered from contention because it was also used from native code and
> sometimes from multiple threads (though I guess it was not heavily
> contended in most cases).
> In JDK 6 it was replaced with a ReentrantLock; some features like
> tryLock() were used to implement the new OpenGL pipeline ...
> performance also improved.

I recall discussions with the 2d folks on that topic.
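
For anyone who wants to look at the locking part of such a measurement in
isolation, a rough sketch might look like the following. It is not the
AWT/Java2D code path: the class LockOverheadSketch, the work() stand-in and
the iteration count are all made up for illustration, and it only measures
an uncontended ReentrantLock around trivial work. As with any
microbenchmark, the JIT may optimize the unlocked loop heavily, so the
numbers should be read with caution.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch; not the AWT/Java2D code path.
class LockOverheadSketch {
    static final ReentrantLock LOCK = new ReentrantLock();
    static long counter;

    // Stand-in for the cheap pure-Java work done while holding the lock.
    static void work() {
        counter++;
    }

    // Times uncontended lock/work/unlock cycles; returns nanoseconds.
    static long timeLocked(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            LOCK.lock();
            try {
                work();
            } finally {
                LOCK.unlock();
            }
        }
        return System.nanoTime() - start;
    }

    // Same loop without any locking, as a baseline; returns nanoseconds.
    static long timeUnlocked(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        // Warm-up runs so that both loops are compiled before timing.
        timeLocked(n);
        timeUnlocked(n);
        long locked = timeLocked(n);
        long unlocked = timeUnlocked(n);
        System.out.printf("locked:   %d ms%n", locked / 1_000_000);
        System.out.printf("unlocked: %d ms%n", unlocked / 1_000_000);
    }
}
```

The difference between the two timings gives a rough idea of the
per-iteration cost of an uncontended lock()/unlock() pair on a given
machine and JVM.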

>
>
>> In your case if the lock is ever shared -- that is, locked by multiple
>> threads during its lifetime -- then biased locking probably won't
>> provide the latency reduction benefit you're after.   The object will
>> likely become unbiased at some point.  I suspect that sharing will
>> ultimately occur in your case, but be infrequent, correct?
>
> Exactly, the most likely scenario is that there is one rendering thread
> which takes the lock millions of times, plus a few calls from native code
> (currently they upcall from C to lock the ReentrantLock).
> It could also happen that there are two or more active rendering
> threads at the same time, but this is not really common and a fallback
> to unbiased would be totally ok.
> Wouldn't a BiasedLock be worth implementing, maybe with a way to
> control how quickly or how readily the lock becomes unbiased?

Unbiasing a lock, or more properly revoking its bias, can be  
exceptionally expensive.   In the extreme case it might require a full  
stop-the-world thread rendezvous to ensure there are no races between  
the revoking thread and the bias-holding thread.  (Ken's OOPSLA paper
describes some ways to mitigate that cost).   While it's not strictly
fundamental to biased locking, the mechanism today also makes good use
of the fact that we know that synchronization operations -- enter and
exit -- are typically balanced ("last locked is first unlocked"),
allowing the JVM to hide additional locking information on the
stack.   A BiasedLock in the vein of ReentrantLock wouldn't have that  
property, making the implementation a bit more complicated.    
Generally, though, BiasedLock seems like a bet in the wrong  
direction.   If you suspect that biased locking would be a win then  
for the most part you'd be better off with synchronized instead of yet  
another special construct.   (I worry too that things like BiasedLock  
don't necessarily age well. Biased locking is purely a response to  
processor-local CAS/cmpxchg latency.  As processor vendors have turned
their attention to the issue, the latency has shown relative
improvement, so today's good idea might be tomorrow's legacy baggage
that needs to be towed around in perpetuity).
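
To make the "balanced" point concrete, here is a minimal sketch of the
difference at the Java level. The class BalancedLockingSketch and its
fields are made up for illustration, and this says nothing about how the
JVM implements either construct internally. A synchronized block is
lexically scoped, so enter and exit always nest. java.util.concurrent
locks have no such restriction: they can be released in a different order
than they were acquired, and they offer tryLock().

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; names are not taken from any JDK code.
class BalancedLockingSketch {
    private final Object monitor = new Object();
    private final ReentrantLock first = new ReentrantLock();
    private final ReentrantLock second = new ReentrantLock();
    private int value;

    // Monitor enter/exit is tied to the block, so it is always balanced.
    int incrementSynchronized() {
        synchronized (monitor) {
            return ++value;
        }
    }

    // Explicit locks can be released out of order ("hand-over-hand"):
    // 'first' is acquired first and also released first.
    int incrementHandOverHand() {
        first.lock();
        try {
            second.lock();
        } finally {
            first.unlock();
        }
        try {
            return ++value;
        } finally {
            second.unlock();
        }
    }

    // tryLock() returns immediately instead of blocking if the lock is held.
    boolean tryIncrement() {
        if (!first.tryLock()) {
            return false;
        }
        try {
            value++;
            return true;
        } finally {
            first.unlock();
        }
    }

    int value() {
        return value;
    }

    public static void main(String[] args) {
        BalancedLockingSketch s = new BalancedLockingSketch();
        System.out.println(s.incrementSynchronized());
        System.out.println(s.incrementHandOverHand());
        System.out.println(s.tryIncrement());
    }
}
```

The three methods above mix different locks purely to show the API
shapes; real code would guard a given piece of state with one lock
consistently.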

Regards
Dave

>
>
> However, this really doesn't have much priority for me ... I really should
> care about other things ... I somehow got trapped in this when deciding
> on the design of my XRender-Java2D pipeline. Sorry for all the traffic...
>
> lg Clemens