FFM - Optimizing string reads in MemorySegment

Fri Jul 18 17:26:24 UTC 2025

Hi,

I have a question about the performance of string reads in MemorySegment.

There was some discussion about reading strings with known lengths
in JDK-8333843, and the outcome was to update the getString javadoc to
recommend:

byte[] bytes = new byte[length];
MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length);
return new String(bytes, charset);

As I understand it, that's as efficient as what MemorySegment#getString
does under the hood. The main difference is that getString first has to
find the length of the null-terminated string, but after that it just uses
MemorySegment.copy and calls the String constructor.
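
To make the comparison concrete, here is a minimal sketch that puts the
javadoc pattern next to getString. It assumes JDK 22 or later (where the FFM
API is final). The class and method names are made up for illustration, and
the comments marking the two copies reflect my understanding rather than a
measurement.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import static java.lang.foreign.ValueLayout.JAVA_BYTE;

// Hypothetical demo class; assumes JDK 22+ (final FFM API).
class KnownLengthStringRead {

    // The pattern recommended in the getString javadoc for strings of known length.
    static String readKnownLength(MemorySegment segment, long offset, int length, Charset charset) {
        // Temporary buffer that only lives for the duration of this call.
        byte[] bytes = new byte[length];
        // First copy: segment memory into the temporary byte[].
        MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length);
        // Second copy: the constructor decodes into an array owned by the String.
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // allocateFrom writes the encoded bytes followed by a null terminator.
            MemorySegment segment = arena.allocateFrom("hello", StandardCharsets.UTF_8);

            // getString locates the terminator itself before copying.
            String viaGetString = segment.getString(0, StandardCharsets.UTF_8);

            // With a known length (5 bytes here) no scan is needed.
            String viaCopy = readKnownLength(segment, 0, 5, StandardCharsets.UTF_8);

            System.out.println(viaGetString.equals(viaCopy));
        }
    }
}
```

Both paths end in the same String constructor call, so the temporary byte[]
is the common cost regardless of how the length was obtained.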

Has avoiding the byte[] copy in the String constructor been considered?

In theory it might be possible for the runtime to inline the String
constructor, see that the byte[] doesn't escape, and skip the copy. In
practice, based on some benchmarking my colleague Louis Wasserman was
doing, that doesn't seem to be happening.
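
For context on why the constructor copies at all: String is immutable, so
the public constructors cannot keep a reference to a caller-owned array. The
sketch below (hypothetical class name, plain JDK only) shows the observable
effect. It does not measure anything and says nothing about what the JIT
does with the temporary array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing that the String does not alias the input array.
class StringConstructorCopy {

    // Builds a String through the public constructor, which copies/decodes the bytes.
    static String build(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = {'a', 'b', 'c'};
        String s = build(bytes);

        // Mutating the source array afterwards must not affect the String.
        bytes[0] = 'z';

        System.out.println(s);
    }
}
```

In the MemorySegment case the temporary byte[] is never visible to anyone
else, which is why the defensive copy is redundant there.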

Would it make sense to consider using
JavaLangAccess#uncheckedNewStringNoRepl here to avoid having to copy the
byte[]?

Thanks,
Liam