RFC 8328877: [JNI] The JNI Specification needs to address the limitations of integer UTF-8 String lengths

Fri Aug 23 05:59:57 UTC 2024

Hi David,

I had a read through the CSR.

---

`In addition we tweak the wording of GetStringUTFChars so that it:
...
b) references the new GetStringUTFLengthAsLong function instead of the
Deprecated GetStringUTFLength`

Doesn't (b) refer to GetStringUTFRegion? GetStringUTFChars has no such
wording, nor a len argument.

---

I was initially surprised that we return a fake length from
GetStringUTFLength upon overflow instead of a clear error indicator like
-1. Now folks will work with potentially truncated strings. Typically those
are documents stored in string form, and truncation errors are not obvious.
But probably there is no better way:

Returning 0 would be an option - it would cause clearer and more immediate
data errors (missing document contents). But it can be confused with "have
no data" which can be a valid state.
Returning -1 is potentially dangerous and can lead to overflows.
Returning MAX_INT is not much better than returning the length up to the
last valid encoding; we would just get a weird character at the end of the
document.
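
To make the trade-off concrete, here is a minimal C sketch, not taken from the
CSR or from HotSpot. Both helper names are hypothetical. The first shows one
plausible way a -1 return can bite a caller that sizes a buffer as len + 1;
the second shows what "up to the last valid encoding" could mean for a byte
limit (MAX_INT in the real case, a tiny limit here), assuming standard UTF-8
continuation bytes of the form 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caller-side computation: size a buffer as "length + 1"
 * for the terminating NUL. If the reported length is -1, the conversion
 * to size_t wraps around and the result is 0, so a later copy into the
 * buffer would overflow it. */
size_t naive_buffer_size(int reported_len)
{
    return (size_t)reported_len + 1;
}

/* Hypothetical helper: return the largest prefix length <= limit that
 * does not cut a multi-byte UTF-8 character in half. */
size_t utf8_prefix_on_boundary(const unsigned char *buf, size_t len,
                               size_t limit)
{
    assert(buf != NULL);
    if (len <= limit) {
        return len;             /* everything fits, nothing to trim */
    }
    size_t end = limit;
    /* buf[end] is the first byte that would be dropped. If it is a
     * continuation byte (10xxxxxx), the cut is mid-character, so back
     * up until the first dropped byte starts a character. */
    while (end > 0 && (buf[end] & 0xC0) == 0x80) {
        end--;
    }
    return end;
}
```

For the five bytes 61 E2 82 AC 62 ("a", the euro sign, "b") and a limit of 3,
cutting blindly at the limit would leave a partial euro sign at the end, while
the boundary-aware helper stops after the "a".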

Cheers, Thomas