Critical JNI and (Shenandoah) pinning questions

Mon Aug 19 19:52:27 UTC 2019

Hi Ioannis,

Ah, I see what you are proposing now. I filed the following bug, hopefully 
to spark some discussion.

https://bugs.openjdk.java.net/browse/JDK-8229895

Thanks,

-Zhengyu

On 8/19/19 12:05 PM, Ioannis Tsakpinis wrote:
> Hey Zhengyu,
> 
>> If there are no array parameters, there is no point to use
>> Get/SetXXXArrayCritical or native critical methods in the first place.
> 
> It's true that CriticalJNINatives were added as an efficient way to
> access Java arrays from JNI code. However, the overhead of JNI calls
> affects all methods, especially methods that accept or return only
> primitive values and whose JNI code does nothing but pass the arguments
> to another native function.
> 
> There are thousands of JNI functions in LWJGL and almost all are like
> that: they simply cast arguments to the appropriate type and pass them
> to a target native function. Libraries like JNR and other JNI binding
> generators also look the same.
> 
> The major benefit of using CriticalJNINatives for such functions is the
> removal of the first two standard JNI parameters: JNIEnv* and jclass.
> Normally that would only mean less register pressure, which may help in
> some cases. In practice though, native compilers are able to optimize
> away any argument shuffling and convert everything to a simple
> tail-call, i.e. a single jump instruction.
> 
> We go from this for standard JNI:
> 
> Java -> shuffle arguments -> JNI -> shuffle arguments -> native call
> 
> to this for critical JNI:
> 
> Java -> shuffle arguments -> JNI -> native call
> 
> Example code and assembly output: https://godbolt.org/z/qZRIi1
> 
> This has a measurable effect on JNI call overhead and becomes more
> important the simpler the target native function is. With Project Panama
> there is no JNI function and it should be possible to optimize the first
> argument shuffling too. Until then, this is the best we can do, unless
> there are opportunities to slim down the JNI wrapper even further for
> critical native methods (e.g. remove the safepoint polling if it's safe
> to do so).
> 
> To sum up, the motivation is reduced JNI overhead. My argument is that
> primitive-only functions could benefit from significant overhead
> reduction with CriticalJNINatives. However, the GC locking effect is a
> major and unnecessary disadvantage. Shenandoah does a perfect job here
> because it supports region pinning and there's no actual locking
> happening in primitive-only functions. Every other GC though will choke
> hard with applications that make heavy use of critical natives (such as
> typical LWJGL applications). So, two requests:
> 
> - PRIMARY: Skip check_needs_gc_for_critical_native() in primitive-only
> functions, regardless of GC algorithm and object-pinning support.
> 
> - BONUS: JNI call overhead is significantly higher (3-4ns) on Java 10+
> compared to Java 8 (with or without critical natives). I went through
> the timeline of sharedRuntime_x86_64.cpp but couldn't spot anything that
> would justify such a difference (thread-local handshakes maybe?). I was
> wondering if this is a performance regression that needs to be looked
> into.
> 
> Thank you,
> 
> - Ioannis
>
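
For readers following the thread, the two wrapper shapes described above
can be sketched roughly as follows. This is only an illustration, not code
from LWJGL or from the godbolt link: the class name org.example.Demo, the
method add, and the target function native_add are hypothetical, and the
typedefs are stand-ins so the snippet compiles without a JDK (real code
would include <jni.h> and use JNIEXPORT/JNICALL). The JavaCritical_ prefix
is the naming convention HotSpot looks up when CriticalJNINatives is
enabled.

```c
#include <stdint.h>

/* Stand-in types (assumption): real code would #include <jni.h> instead. */
typedef int32_t jint;
typedef struct JNIEnv_ JNIEnv;  /* opaque here; never dereferenced */
typedef void *jclass;

/* Hypothetical target native function that the wrappers forward to. */
static jint native_add(jint a, jint b) {
    return a + b;
}

/* Standard JNI entry point for a static native method
 * org.example.Demo.add(int, int): receives JNIEnv* and jclass first,
 * so the real arguments must be shifted before the target call. */
jint Java_org_example_Demo_add(JNIEnv *env, jclass clazz, jint a, jint b) {
    (void)env;    /* unused in a primitive-only wrapper */
    (void)clazz;
    return native_add(a, b);
}

/* Critical entry point for the same method: no JNIEnv*, no jclass.
 * The arguments already sit where native_add expects them, so a
 * compiler can typically reduce this to a tail call. */
jint JavaCritical_org_example_Demo_add(jint a, jint b) {
    return native_add(a, b);
}
```

As far as I understand, the standard Java_ entry point still has to be
present next to the critical one, since the critical variant is only used
as an optimization.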