[External] : Re: MemorySegment APIs for reading and writing strings with known lengths
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Nov 10 14:24:57 UTC 2025
On 10/11/2025 14:20, Liam Miller-Cushon wrote:
>
> I hope this comment was not in my doc?
>
>
> It's a parenthetical in the paragraph starting with "Finally,
> ultimately, the user is probably the most happy with an API that
> directly accepts the units in which they are already measuring their
> string"
Apologies for the confusion, that was a leftover from a previous
version. Removed now.
>
> You mean the _byte size_ of the encoded string (rather than number
> of code units?)
>
> Yes, exactly.
>
> Something like this might be interesting. That said... if the
> charset matches, then creating the segment view, then obtaining
> its byte size is O(1) (e.g. no decoding). And if the charset
> doesn't match, you'll need to decode anyway -- at which point I'm
> not sure the array creation is really the bottleneck?
>
> Thanks, yes, MemorySegment.ofString seemingly solves the case where
> the charset matches, so it's more a question of whether there are
> performance gains to be had for the case where the charset doesn't
> match. The benchmarking I've seen suggests a carefully optimized loop
> over the string is outperforming getBytes(charset).length for that
> case. I can do some more analysis and report back.
I believe you. My hunch here would be to separate this one out, as it
has more to do with the Charset/String API than it has to do with memory
segments?
E.g. you want an API like:
String::getNumBytes(Charset)
Whether this API exists or not seems orthogonal to the improvements
described in the documents I shared.
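
To make the comparison concrete, here is a minimal sketch of the kind of loop being discussed, for the UTF-8 case only. The helper name `utf8Length` is hypothetical (it is not a JDK method, and neither is the `String::getNumBytes(Charset)` suggested above), and the handling of unpaired surrogates assumes the replacement behaviour documented for `String.getBytes(Charset)`, where malformed input becomes a single `?` byte in UTF-8.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Length {
    // Hypothetical helper: counts the bytes that UTF-8 encoding of s would
    // produce, without allocating the intermediate byte[].
    static long utf8Length(String s) {
        long bytes = 0;
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte sequence
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < n
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4;                       // valid surrogate pair
                i++;                              // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                bytes += 1;                       // unpaired surrogate, assumed to encode as '?'
            } else {
                bytes += 3;                       // rest of the BMP
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String sample = args.length > 0 ? args[0] : "h\u00e9llo \u20ac \uD83D\uDE00";
        // Baseline: allocates and fills a byte[] just to read its length.
        int viaGetBytes = sample.getBytes(StandardCharsets.UTF_8).length;
        long viaLoop = utf8Length(sample);
        System.out.println("getBytes length: " + viaGetBytes);
        System.out.println("loop length:     " + viaLoop);
    }
}
```

This says nothing about relative performance, which would need a proper benchmark (for example with JMH); it only shows that the byte count can be obtained without materializing the encoded array.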
Cheers
Maurizio
More information about the panama-dev
mailing list