RFR: 8369564: Provide a MemorySegment API to read strings with known lengths [v2]
Liam Miller-Cushon
cushon at openjdk.org
Wed Oct 29 16:40:15 UTC 2025
On Wed, 29 Oct 2025 15:35:41 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:
>> Liam Miller-Cushon has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Consolidate duplicate code in read methods
>
> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1338:
>
>> 1336: * access operation will occur
>> 1337: * @param length byte length to be used for string conversion (not including any
>> 1338: * null termination)
>
> I think 'to be used for string conversion' is a bit too vague (used _how_?). I think a more descriptive text could be something like 'length in bytes of the string to read' (matching also the pattern of the existing 'offset in bytes').
>
> Also, what happens if:
> - The length _does_ include a null terminator
> - The length is not a multiple of the byte size of a character in the given charset.
>
> On that last note, I wonder if this shouldn't be the length in bytes, but the length in characters. Then we can compute the byte length from the charset. That will make it impossible to pass a length that is not a multiple of the character size.
Thanks for taking a look, I wanted to respond briefly to this part and will review the rest of the comments later:
> I wonder if this shouldn't be the length in bytes, but the length in characters. Then we can compute the byte length from the charset
Part of the motivation here is to support efficiently reading binary formats where I think it's more common to record the length of string data in bytes, than in 16-bit code units in the UTF-16 encoding of the line.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2474212568
More information about the core-libs-dev
mailing list