RFR (XS) JNI Specification Issue: JDK-6681965 Strings and NULL termination

David Simms david.simms at oracle.com
Wed May 14 08:46:01 UTC 2014


Bug/Enhancement: https://bugs.openjdk.java.net/browse/JDK-6681965
Web review: http://cr.openjdk.java.net/~dsimms/jnispec/6681965/
HTML version: 
http://cr.openjdk.java.net/~dsimms/jnispec/6681965/JDK-6681965.html

There are a number of mail threads asking for clarification of the 
string operations, and whether their results should be NULL terminated. 
Functions affected:

  * const jchar * GetStringChars(JNIEnv *env, jstring string, jboolean
    *isCopy);
  * const char * GetStringUTFChars(JNIEnv *env, jstring string,
    jboolean *isCopy);
  * void GetStringRegion(JNIEnv *env, jstring str, jsize start, jsize
    len, jchar *buf);
  * void GetStringUTFRegion(JNIEnv *env, jstring str, jsize start,
    jsize len, char *buf);
  * const jchar * GetStringCritical(JNIEnv *env, jstring string,
    jboolean *isCopy);

There are two stories here: how a particular JVM implements Java 
Strings internally, and what the JNI Specification says. I believe 
these two things are separate:

  * From a HotSpot JVM perspective, null-terminating strings might be
    seen as "safety first".
  * There may be other JVM implementations that, for argument's sake,
    never garbage collect and always use direct heap pointers. Such
    implementations have no reason to null-terminate...
  * ...indeed HotSpot's "GetStringCritical()" implementation does no
    such thing.

Currently the JNI Specification does not mention null-termination, and 
given that we are dealing with Java Strings rather than C strings, that 
is because it's not required.

"GetStringUTFRegion()" as currently implemented by HotSpot has an issue: 
it is possible for it to null-terminate past the user-specified length 
(which is in itself a bit of an issue: it relates to the input, not the 
output, and the UTF-16 length != the UTF-8 encoded length).
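
To make the length mismatch concrete, here is a minimal sketch in plain C 
(no JNI headers needed) that counts how many bytes a run of UTF-16 code 
units occupies once encoded as the modified UTF-8 that JNI uses. The 
function name "modified_utf8_length" is hypothetical and only for 
illustration; it is not HotSpot's implementation, and uint16_t stands in 
for jchar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bytes needed to encode `len` UTF-16 code units as modified
 * UTF-8. No terminator is counted. Hypothetical helper for illustration.
 *
 *   U+0001..U+007F  -> 1 byte
 *   U+0000, U+0080..U+07FF -> 2 bytes (U+0000 becomes 0xC0 0x80)
 *   U+0800..U+FFFF  -> 3 bytes (each surrogate is encoded on its own)
 */
size_t modified_utf8_length(const uint16_t *units, size_t len)
{
    size_t bytes = 0;

    assert(units != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        uint16_t c = units[i];

        if (c >= 0x0001 && c <= 0x007F) {
            bytes += 1;
        } else if (c <= 0x07FF) {
            bytes += 2;   /* also covers U+0000 */
        } else {
            bytes += 3;
        }
    }
    return bytes;
}
```

So a "len" of 3 UTF-16 units can need anywhere from 3 to 9 bytes of 
output, which is why a length expressed in input units says little about 
where the output buffer ends.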

I'm not advocating removing the current null-termination where HotSpot 
supplies the buffers. In fact, I'm worried that doing so would break too 
much existing code. There is no harm in null-termination where the JVM 
has control over the buffer.

The proposed change clarifies NULL-termination, or rather the lack of 
it, for string operations. Additional text at the beginning of the 
"String Operations" section:

----------------------------------------------------------------------
String Operations

This specification makes no assumptions on how a JVM represents Java 
strings internally. Strings returned from these operations:

  * GetStringChars()
  * GetStringUTFChars()
  * GetStringRegion()
  * GetStringUTFRegion()
  * GetStringCritical()

are therefore not required to be NULL terminated. Programmers are 
expected to determine buffer capacity requirements via 
"GetStringLength()" or "GetStringUTFLength()".

----------------------------------------------------------------------
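
For callers who do want a C string, the pattern implied by the text 
above is to size the buffer from the reported length and add the 
terminator explicitly. The sketch below is plain C with no JNI headers; 
"copy_with_terminator" is a hypothetical helper, and in real JNI code 
"len" would typically come from GetStringUTFLength() and "src" from a 
buffer filled by GetStringUTFRegion() or returned by GetStringUTFChars().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap-allocated copy of the first `len` bytes of `src` with a
 * terminating '\0' appended by the caller's own code, rather than relying
 * on the producer of `src` to have terminated it.
 *
 * Hypothetical helper for illustration. The caller must free() the result.
 * Returns NULL if the allocation fails or len + 1 would overflow.
 */
char *copy_with_terminator(const char *src, size_t len)
{
    char *out;

    if (len == SIZE_MAX) {
        return NULL;            /* len + 1 would wrap around */
    }

    out = malloc(len + 1);      /* one extra byte for the terminator */
    if (out == NULL) {
        return NULL;
    }

    if (len > 0) {
        memcpy(out, src, len);  /* copy exactly len bytes, no more */
    }
    out[len] = '\0';            /* terminate explicitly */
    return out;
}
```

Because modified UTF-8 encodes U+0000 as the two bytes 0xC0 0x80, data 
in that encoding contains no embedded zero bytes, so strlen() on the 
result should match "len".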

/David Simms

