<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>I've received some internal feedback, which has been incorporated
into the CSR:</p>
<p><a class="moz-txt-link-freetext" href="https://bugs.openjdk.org/browse/JDK-8338709">https://bugs.openjdk.org/browse/JDK-8338709</a></p>
<p>Proposed name of the new function is now <font face="monospace">GetStringUTFLengthAsLong</font>.<br>
</p>
<p>David<br>
</p>
<div class="moz-cite-prefix">On 20/08/2024 8:34 am, David Holmes
wrote:<br>
</div>
<blockquote type="cite" cite="mid:b9c19cb0-f0d9-45a1-9e3c-e580375b0a35@oracle.com">
<p>Broadening the audience to hotspot-dev, as there was zero response on
hotspot-runtime-dev.</p>
<p>David<br>
</p>
<div class="moz-cite-prefix">On 13/08/2024 4:12 pm, David Holmes
wrote:<br>
</div>
<blockquote type="cite" cite="mid:a57fb7cd-e1ca-4895-abd3-cbd62924fd73@oracle.com">
<p>Comment is sought on this proposed update to the JNI
Specification:</p>
<p><a class="moz-txt-link-freetext" href="https://bugs.openjdk.org/browse/JDK-8328877" moz-do-not-send="true">https://bugs.openjdk.org/browse/JDK-8328877</a><br>
</p>
<p>The modified UTF-8 format used by the VM can lead to UTF-8
sequences whose length exceeds the maximum value of an int, due to
multi-byte encoding, but the JNI GetStringUTFLength returns a
jsize, which is (perhaps incorrectly) a jint, i.e. an int. As a
result the current implementation will return a truncated
value for the length of the sequence. To address this we
propose to do two things in the JNI spec:</p>
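<p>To make the overflow concrete, here is a minimal C sketch of how the byte length of a modified UTF-8 encoding can be computed from UTF-16 code units with a 64-bit accumulator. The function names are hypothetical and this is not the HotSpot implementation; it only assumes the standard modified UTF-8 rules (U+0000 takes two bytes, and each surrogate is encoded separately as three bytes).</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one UTF-16 code unit in modified UTF-8.
 * Hypothetical helper for illustration only. */
static int modified_utf8_unit_size(uint16_t c) {
    if (c >= 0x0001 && c <= 0x007F) return 1;  /* plain ASCII, except NUL */
    if (c <= 0x07FF) return 2;                 /* includes U+0000 as 0xC0 0x80 */
    return 3;                                  /* includes each surrogate half */
}

/* Full encoded length in bytes, accumulated in 64 bits so it cannot
 * wrap for any string whose code unit count fits in an int. */
int64_t modified_utf8_length(const uint16_t *chars, size_t count) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += modified_utf8_unit_size(chars[i]);
    }
    return total;
}

/* Worst case for a string of char_count code units: three bytes each. */
int64_t worst_case_length(int64_t char_count) {
    return char_count * 3;
}
```

<p>Under these rules, any string with more than 715,827,882 three-byte characters has an encoded length that no longer fits in a 32-bit signed int.</p>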
<p>1. We deprecate GetStringUTFLength</p>
<pre>+### GetStringUTFLength (Deprecated)
`jsize GetStringUTFLength(JNIEnv *env, jstring string);`
Returns the length in bytes of the modified UTF-8 representation of a string.
+As the capacity of a `jsize` variable is not sufficient to hold the length of
+all possible modified UTF-8 string representations (due to multi-byte encodings)
+this function is deprecated in favor of [`GetLargeStringUTFLength()`](#getlargestringutflength).
+If the modified UTF-8 representation of `string` has a length that exceeds the capacity
+of a `jsize` variable, then the length as of the last character that could be fully
+encoded without exceeding that capacity, is returned.
</pre>
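<p>One possible reading of the capped behaviour described in that text, as a C sketch: accumulate the per-character byte counts and stop at the first character whose encoding would no longer fit. The function name and the cap parameter are hypothetical (a real implementation would use the capacity of jsize), and the sketch assumes each UTF-16 code unit counts as one character.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one UTF-16 code unit in modified UTF-8.
 * Hypothetical helper for illustration only. */
static int modified_utf8_unit_size(uint16_t c) {
    if (c >= 0x0001 && c <= 0x007F) return 1;  /* plain ASCII, except NUL */
    if (c <= 0x07FF) return 2;                 /* includes U+0000 as 0xC0 0x80 */
    return 3;                                  /* includes each surrogate half */
}

/* Length as of the last character that can be fully encoded without
 * the running total exceeding cap. The cap parameter is an assumption
 * made so the behaviour can be exercised with small inputs. */
int64_t capped_modified_utf8_length(const uint16_t *chars, size_t count,
                                    int64_t cap) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        int size = modified_utf8_unit_size(chars[i]);
        /* Written as a subtraction so the check itself cannot overflow. */
        if (size > cap - total) {
            break;  /* this character would not fit completely */
        }
        total += size;
    }
    return total;
}
```

<p>With a cap of 5 and two three-byte characters, for example, the sketch reports 3 rather than a wrapped or partial count, so the result always ends on a character boundary.</p>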
<p>2. We add a new function GetLargeStringUTFLength</p>
<pre>+### GetLargeStringUTFLength
+
+`jlong GetLargeStringUTFLength(JNIEnv *env, jstring string);`
+
+Returns the complete length in bytes of the modified UTF-8 representation of a string.
</pre>
<p>In addition we tweak the wording of GetStringUTFChars so that
it:</p>
<p>a) refers to a byte sequence instead of a byte array (to
avoid suggesting the returned sequence is limited by the
capacity of a Java array); and</p>
<p>b) references the new GetLargeStringUTFLength function
instead of the deprecated GetStringUTFLength.</p>
<p>Note that GetStringUTFRegion is still using an int length so
can't be used to obtain a giant region, but we don't expect
this to be a practical concern.<br>
</p>
<p>The JNI version will also be bumped for this API addition.<br>
</p>
<p>Thanks,<br>
David<br>
</p>
<p><br>
</p>
</blockquote>
</blockquote>
</body>
</html>