RFR: 8369564: Provide a MemorySegment API to read strings with known lengths [v11]

Liam Miller-Cushon cushon at openjdk.org
Fri Nov 21 14:57:43 UTC 2025


On Fri, 21 Nov 2025 14:30:53 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> Liam Miller-Cushon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Use Utils.checkNonNegativeArgument
>
> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1333:
> 
>> 1331:      * sequences with this charset's default replacement string. The {@link
>> 1332:      * java.nio.charset.CharsetDecoder} class should be used when more control
>> 1333:      * over the decoding process is required.
> 
> Should we say here, as you did for `copy` that this method ignores `\0` ?

I added:

> If the string contains any {@code '\0'} characters, they will be read as well.

I suppose it might also make sense to update those warnings in `setString` and `allocateFrom` to mention that if you want to avoid truncating null-terminated strings, `getString(long, Charset, long)` could be used instead of `getString(long)`. What do you think?

> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1341:
> 
>> 1339:      * @param length  length in bytes of the string to read
>> 1340:      * @return a Java string constructed from the bytes read from the given starting
>> 1341:      *         address reading the given length of bytes
> 
> Maybe:
> 
> 
> a Java string constructed from the bytes read from the given starting address up to the given length
> 
> 
> (that seems to match the existing `getString` a bit more, and avoids the reading/read repetition)

Done

> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2646:
> 
>> 2644:      * will appear truncated when read again.
>> 2645:      *
>> 2646:      * @param src      the Java string to be written into this segment
> 
> Suggestion:
> 
>      * @param src      the Java string to be written into the destination segment

Done

> src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2649:
> 
>> 2647:      * @param dstEncoding the charset used to {@linkplain Charset#newEncoder() encode}
>> 2648:      *                 the string bytes.
>> 2649:      * @param srcIndex the starting index of the source string
> 
> Do we feel like saying words like "index" is good enough to say what we mean? E.g. by index we mean something that is compatible with `length` and `charAt`. I wonder if using a qualifier e.g. `character index` might help? (open to suggestions here, I'm not 100% sure)

Thanks, I think 'character index' is probably better than just index, since there are both character and byte positions in this API. The docs for numChars also mentions characters. I don't have a better idea for avoiding confusion here.

> src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 157:
> 
>> 155: 
>> 156:     /**
>> 157:      * Converts a Java string into a C string using the provided charset,
> 
> I think using the term `C string` here is misleading. 
> 
> I suggest:
> 
> 
> Encodes a Java string using the provided charset and stores the resulting byte array into a memory segment.

Thanks, done

> src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 170:
> 
>> 168:      * will appear truncated when read again.
>> 169:      *
>> 170:      * @param str      the Java string to be converted into a C string
> 
> Suggestion:
> 
>      * @param str      the Java string to be encoded

Done

> src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 175:
> 
>> 173:      * @param srcIndex the starting index of the source string
>> 174:      * @param numChars the number of characters to be copied
>> 175:      * @return a new native segment containing the converted C string
> 
> Watch out for C string again

Done

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549972828
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549994262
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549973821
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549982045
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2549987338
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2550014277
PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2550015835


More information about the core-libs-dev mailing list