RFC 8328877: [JNI] The JNI Specification needs to address the limitations of integer UTF-8 String lengths

David Holmes david.holmes at oracle.com
Tue Aug 13 06:12:38 UTC 2024


Comment is sought on this proposed update to the JNI Specification:

https://bugs.openjdk.org/browse/JDK-8328877

The modified UTF-8 format used by the VM can produce byte sequences 
whose length exceeds the maximum value of an int, due to multi-byte 
encoding, but the JNI function GetStringUTFLength returns a jsize, which 
is (perhaps incorrectly) a jint, i.e. an int. As a result the current 
implementation returns a truncated version of the length of the 
sequence. To address this we propose to do two things in the JNI spec:

1. We deprecate GetStringUTFLength

+### GetStringUTFLength (Deprecated)
  
  `jsize GetStringUTFLength(JNIEnv *env, jstring string);`
  
  Returns the length in bytes of the modified UTF-8 representation of a string.
  
+As the capacity of a `jsize` variable is not sufficient to hold the length of
+all possible modified UTF-8 string representations (due to multi-byte encodings)
+this function is deprecated in favor of [`GetLargeStringUTFLength()`](#getlargestringutflength).
+If the modified UTF-8 representation of `string` has a length that exceeds the capacity
+of a `jsize` variable, then the length as of the last character that could be fully
+encoded without exceeding that capacity, is returned.

2. We add a new function GetLargeStringUTFLength

+### GetLargeStringUTFLength
+
+`jlong GetLargeStringUTFLength(JNIEnv *env, jstring string);`
+
+Returns the complete length in bytes of the modified UTF-8 representation of a string.
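
To illustrate the difference between the two functions, here is a rough 
sketch in plain C (no jni.h) of how the two lengths relate. It is not 
the HotSpot implementation. The helper names (mutf8_unit_size, 
mutf8_length_full, mutf8_length_capped, mutf8_worst_case_length) and the 
explicit cap parameter are made up for illustration, and the sketch 
assumes that a "character" means a single UTF-16 code unit, so each half 
of a surrogate pair is counted on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one UTF-16 code unit in modified UTF-8. */
static int mutf8_unit_size(uint16_t c) {
    if (c >= 0x0001 && c <= 0x007F) return 1; /* ASCII except NUL */
    if (c <= 0x07FF) return 2;                /* U+0000 and U+0080..U+07FF */
    return 3;                                 /* U+0800..U+FFFF, incl. each surrogate */
}

/* Hypothetical analogue of the proposed 64-bit length: the complete
 * byte length of the modified UTF-8 form. */
int64_t mutf8_length_full(const uint16_t *units, size_t count) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += mutf8_unit_size(units[i]);
    }
    return total;
}

/* Hypothetical analogue of the deprecated function's proposed wording:
 * stop at the last code unit that can be fully encoded without the
 * running length exceeding cap. Assumes cap >= 0; a real caller would
 * use INT32_MAX, a small cap just makes the behaviour easy to observe. */
int32_t mutf8_length_capped(const uint16_t *units, size_t count, int32_t cap) {
    int32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        int n = mutf8_unit_size(units[i]);
        if (total > cap - n) {
            break; /* encoding this unit would exceed the cap */
        }
        total += n;
    }
    return total;
}

/* Upper bound on the byte length for a string of unit_count code units:
 * every unit takes at most 3 bytes. */
int64_t mutf8_worst_case_length(int32_t unit_count) {
    return (int64_t)unit_count * 3;
}
```

For a string of INT32_MAX code units that each need three bytes, the 
upper bound is 3 * 2147483647 = 6442450941 bytes, which fits in a 64-bit 
result but not in a jsize. With the sample input {'A', 0x0000, 0x00E9, 
0x20AC} and a cap of 7, the capped helper stops before the three-byte 
0x20AC and reports 5, while the full helper reports 8.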

In addition we tweak the wording of GetStringUTFChars so that it:

a) refers to a byte sequence instead of a byte array (to avoid 
suggesting the returned sequence is limited by the capacity of a Java 
array); and

b) references the new GetLargeStringUTFLength function instead of the 
deprecated GetStringUTFLength

Note that GetStringUTFRegion is still using an int length so can't be 
used to obtain a giant region, but we don't expect this to be a 
practical concern.

The JNI version will also be bumped for this API addition.

Thanks,
David
