RFR: 8369564: Provide a MemorySegment API to read strings with known lengths [v2]

Wed Oct 29 17:42:38 UTC 2025

On Wed, 29 Oct 2025 16:37:55 GMT, Liam Miller-Cushon <cushon at openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1338:
>> 
>>> 1336:      *                access operation will occur
>>> 1337:      * @param length  byte length to be used for string conversion (not including any
>>> 1338:      *                null termination)
>> 
>> I think 'to be used for string conversion' is a bit too vague (used _how_?). I think a more descriptive text could be something like 'length in bytes of the string to read' (matching also the pattern of the existing 'offset in bytes').
>> 
>> Also, what happens if:
>> - The length _does_ include a null terminator
>> - The length is not a multiple of the byte size of a character in the given charset.
>> 
>> On that last note, I wonder if this shouldn't be the length in bytes, but the length in characters. Then we can compute the byte length from the charset. That will make it impossible to pass a length that is not a multiple of the character size.
>
> Thanks for taking a look, I wanted to respond briefly to this part and will review the rest of the comments later:
> 
>> I wonder if this shouldn't be the length in bytes, but the length in characters. Then we can compute the byte length from the charset
> 
> Part of the motivation here is to support efficiently reading binary formats, where I think it's more common to record the length of string data in bytes than in 16-bit code units of the string's UTF-16 encoding.

Discussed this with the team as well. For cases of native interop it seems more likely that you'd have e.g. an array of `wchar_t` on the native side, and you are tracking the length of that array, not the byte size.

A user can easily convert between the two length representations by multiplying or dividing by the right scalar, but if the length is specified in bytes, the API has an extra error case that we need to check and specify.
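
To make the conversion and the extra error case concrete, here is a minimal sketch. The class `StringLengths` and its methods are hypothetical helpers written for illustration; they are not part of `MemorySegment` or of the change under review. The sketch assumes that "length in characters" means the number of fixed-size code units of the charset (so for UTF-8 a "character" is a single byte), and it only covers a handful of charsets with a fixed code unit size.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not an existing JDK API.
final class StringLengths {

    // Size in bytes of one code unit for a few fixed-unit charsets.
    static int codeUnitSize(Charset charset) {
        switch (charset.name()) {
            case "US-ASCII":
            case "ISO-8859-1":
            case "UTF-8":
                return 1;
            case "UTF-16LE":
            case "UTF-16BE":
                return 2;
            case "UTF-32LE":
            case "UTF-32BE":
                return 4;
            default:
                throw new IllegalArgumentException("Unsupported charset: " + charset);
        }
    }

    // Code units -> bytes: a plain multiplication, no extra error case
    // beyond arithmetic overflow.
    static long byteLength(long codeUnits, Charset charset) {
        return Math.multiplyExact(codeUnits, (long) codeUnitSize(charset));
    }

    // Bytes -> code units: this direction has the extra error case,
    // because the byte length may not be a multiple of the unit size.
    static long codeUnitLength(long bytes, Charset charset) {
        int unit = codeUnitSize(charset);
        if (bytes % unit != 0) {
            throw new IllegalArgumentException(
                bytes + " bytes is not a multiple of the " + unit
                + "-byte code unit of " + charset);
        }
        return bytes / unit;
    }

    public static void main(String[] args) {
        // 5 UTF-16 code units occupy 10 bytes.
        System.out.println(byteLength(5, StandardCharsets.UTF_16LE));
        // 10 bytes hold 5 UTF-16 code units.
        System.out.println(codeUnitLength(10, StandardCharsets.UTF_16LE));
    }
}
```

With a length expressed in code units, only the multiplication is needed and a misaligned length cannot be expressed. With a length expressed in bytes, the misaligned case has to be either rejected, as in `codeUnitLength` above, or given some other specified behavior.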

Either way, we felt that it would be a good idea if you could send an email to `panama-dev` in which you describe your exact use case, before getting further into the code review. That would give others a chance to respond with their use cases as well.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2474373352