Re: MemorySegment APIs for reading and writing strings with known lengths

Mon Nov 10 14:24:57 UTC 2025

On 10/11/2025 14:20, Liam Miller-Cushon wrote:
>
>     I hope this comment was not in my doc?
>
>
> It's a parenthetical in the paragraph starting with "Finally, 
> ultimately, the user is probably the most happy with an API that 
> directly accepts the units in which they are already measuring their 
> string"
Apologies for the confusion, that was a leftover from a previous 
version. Removed now.
>
>     You mean the _byte size_ of the encoded string (rather than number
>     of code units?)
>
> Yes, exactly.
>
>     Something like this might be interesting. That said... if the
>     charset matches, then creating the segment view, then obtaining
>     its byte size is O(1) (e.g. no decoding). And if the charset
>     doesn't match, you'll need to decode anyway -- at which point I'm
>     not sure the array creation is really the bottleneck?
>
> Thanks, yes, MemorySegment.ofString seemingly solves the case where 
> the charset matches, so it's more a question of whether there are 
> performance gains to be had for the case where the charset doesn't 
> match. The benchmarking I've seen suggests a carefully optimized loop 
> over the string is outperforming getBytes(charset).length for that 
> case. I can do some more analysis and report back.

I believe you. My hunch here would be to separate this one out, as it 
has more to do with the Charset/String API than with memory segments.

E.g. you want an API like:

String::getNumBytes(Charset)

Whether this API exists or not seems orthogonal to the improvements 
described in the documents I shared.
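
To make the idea concrete, here is a minimal sketch of what such a 
method might compute for UTF-8 only, without allocating the encoded 
array. The class and method names (Utf8Length, utf8Length) are 
hypothetical and not an existing or proposed JDK API; this is also a 
plain loop, not the carefully optimized one mentioned above. The 
unpaired-surrogate branch assumes we want to match 
getBytes(StandardCharsets.UTF_8), which substitutes a one-byte '?'.

```java
// Hypothetical helper, for illustration only: not part of the JDK.
final class Utf8Length {
    private Utf8Length() {}

    // Returns the number of bytes the UTF-8 encoding of s would occupy,
    // without materializing the encoded byte array.
    static long utf8Length(String s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;              // ASCII
            } else if (c < 0x800) {
                bytes += 2;              // two-byte sequence
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4;              // valid surrogate pair: one supplementary code point
                i++;                     // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                bytes += 1;              // unpaired surrogate: assumed replaced by '?'
            } else {
                bytes += 3;              // rest of the BMP
            }
        }
        return bytes;
    }
}
```

For any string s, Utf8Length.utf8Length(s) should agree with 
s.getBytes(StandardCharsets.UTF_8).length; other charsets would need 
their own counting rules.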

Cheers
Maurizio
