RFC 8328877: [JNI] The JNI Specification needs to address the limitations of integer UTF-8 String lengths
David Holmes
david.holmes at oracle.com
Tue Aug 13 06:12:38 UTC 2024
Comment is sought on this proposed update to the JNI Specification:
https://bugs.openjdk.org/browse/JDK-8328877
The modified UTF-8 format used by the VM can lead to UTF-8 sequences
whose length exceeds the maximum value of an int, due to multi-byte
encoding, but the JNI GetStringUTFLength returns a jsize, which is
(perhaps incorrectly) a jint, i.e. an int. As a result the current
implementation will return a truncated version of the length of the
sequence. To address this we propose to do two things in the JNI spec:
1. We deprecate GetStringUTFLength
+### GetStringUTFLength (Deprecated)
`jsize GetStringUTFLength(JNIEnv *env, jstring string);`
Returns the length in bytes of the modified UTF-8 representation of a string.
+As the capacity of a `jsize` variable is not sufficient to hold the length of
+all possible modified UTF-8 string representations (due to multi-byte encodings)
+this function is deprecated in favor of [`GetLargeStringUTFLength()`](#getlargestringutflength).
+If the modified UTF-8 representation of `string` has a length that exceeds the capacity
+of a `jsize` variable, then the length as of the last character that could be fully
+encoded without exceeding that capacity is returned.
2. We add a new function GetLargeStringUTFLength
+### GetLargeStringUTFLength
+
+`jlong GetLargeStringUTFLength(JNIEnv *env, jstring string);`
+
+Returns the complete length in bytes of the modified UTF-8 representation of a string.
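To make the two behaviours concrete, here is a minimal, self-contained sketch in C. It is only an illustration of the wording above, not the HotSpot implementation; the helper names (`mutf8_unit_len`, `mutf8_full_length`, `mutf8_capped_length`) are hypothetical, and the string is assumed to be given as an array of UTF-16 code units. It relies on the modified UTF-8 rules that U+0000 takes two bytes and that each surrogate is encoded on its own as three bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one UTF-16 code unit in modified UTF-8.
 * U+0001..U+007F take one byte; U+0000 and U+0080..U+07FF take two;
 * everything else (including each surrogate) takes three. */
int mutf8_unit_len(uint16_t c) {
    if (c >= 0x0001 && c <= 0x007F) return 1;
    if (c <= 0x07FF) return 2;  /* also covers U+0000 */
    return 3;
}

/* Complete length in bytes, accumulated in 64 bits. This plays the role
 * proposed for GetLargeStringUTFLength (hypothetical helper). */
int64_t mutf8_full_length(const uint16_t *units, size_t n) {
    assert(units != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += mutf8_unit_len(units[i]);
    }
    return total;
}

/* Length as of the last character that can be fully encoded within `cap`
 * bytes. With cap = INT32_MAX this mirrors the behaviour described for
 * the deprecated GetStringUTFLength (hypothetical helper). */
int64_t mutf8_capped_length(const uint16_t *units, size_t n, int64_t cap) {
    assert(units != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        int len = mutf8_unit_len(units[i]);
        if (total + len > cap) {
            break;  /* this character would not fit entirely */
        }
        total += len;
    }
    return total;
}
```

For scale: a string of 2^30 three-byte characters needs 3 * 2^30 = 3221225472 bytes, which is already larger than the int maximum of 2147483647, so only the 64-bit accumulator can report it.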
In addition we tweak the wording of GetStringUTFChars so that it:
a) refers to a byte sequence instead of a byte array (to avoid
suggesting the returned sequence is limited by the capacity of a Java
array); and
b) references the new GetLargeStringUTFLength function instead of the
deprecated GetStringUTFLength
Note that GetStringUTFRegion still takes an int length, so it can't be
used to obtain a giant region, but we don't expect this to be a
practical concern.
The JNI version will also be bumped for this API addition.
Thanks,
David