FFM - Optimizing string reads in MemorySegment
Per-Ake Minborg
per-ake.minborg at oracle.com
Mon Jul 21 15:02:22 UTC 2025
Hi Liam!
Thank you for your comments on MemorySegment::getString!
The intention was that the creation of the intermediate array should be elided, but as you point out, that is (at least sometimes) not the case. I've made an experimental patch that uses JavaLangAccess::uncheckedNewStringNoRepl in StringSupport, and it looks promising:
Base:
ToJavaStringTest.panama_readString    5  avgt  30  13.559 ± 0.246  ns/op
ToJavaStringTest.panama_readString   20  avgt  30  67.047 ± 7.541  ns/op  <- Slow
ToJavaStringTest.panama_readString  100  avgt  30  28.092 ± 0.547  ns/op
ToJavaStringTest.panama_readString  200  avgt  30  37.679 ± 0.645  ns/op
ToJavaStringTest.panama_readString  451  avgt  30  64.544 ± 1.096  ns/op
Experimental Patch:
ToJavaStringTest.panama_readString    5  avgt  30   8.847 ± 0.178  ns/op
ToJavaStringTest.panama_readString   20  avgt  30  20.332 ± 4.495  ns/op
ToJavaStringTest.panama_readString  100  avgt  30  21.808 ± 0.315  ns/op
ToJavaStringTest.panama_readString  200  avgt  30  28.711 ± 0.105  ns/op
ToJavaStringTest.panama_readString  451  avgt  30  51.806 ± 1.067  ns/op
So, this indicates that there are gains to be made here, although the ultimate solution might be something else (such as improving inlining). I have raised an issue for this <https://bugs.openjdk.org/browse/JDK-8362893>.
Thanks again for bringing this to our attention.
Best,
Per Minborg
Oracle Core Libraries
________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Liam Miller-Cushon <cushon at google.com>
Sent: Friday, July 18, 2025 7:26 PM
To: OpenJDK Core Libs Developers <core-libs-dev at openjdk.java.net>
Subject: FFM - Optimizing string reads in MemorySegment
Hi,
I have a question about the performance of string reads in MemorySegment.
There was some discussion about reading strings with known lengths in JDK-8333843, and the outcome was to update the getString javadoc to recommend:
byte[] bytes = new byte[length];
MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length);
return new String(bytes, charset);
As I understand it, that's as efficient as what MemorySegment#getString is doing under the hood. The main difference is that getString first determines the length of the null-terminated string; after that it just uses MemorySegment.copy and calls the String constructor.
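For reference, here is a minimal self-contained sketch of the two read paths described above, assuming Java 22 or later (where the FFM API is final). The class name KnownLengthStringRead and the helper readString are hypothetical names used only for illustration; the helper wraps the three-line snippet from the javadoc, and main compares its result with MemorySegment#getString on a null-terminated UTF-8 string.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import static java.lang.foreign.ValueLayout.JAVA_BYTE;

public class KnownLengthStringRead {

    // Hypothetical helper: reads `length` bytes starting at `offset` and
    // decodes them with `charset`. This allocates an intermediate byte[],
    // and the String constructor may copy the bytes again while decoding.
    public static String readString(MemorySegment segment, long offset,
                                    int length, Charset charset) {
        byte[] bytes = new byte[length];
        MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // allocateFrom writes the encoded bytes plus a null terminator.
            MemorySegment segment = arena.allocateFrom("hello", StandardCharsets.UTF_8);

            // Known-length read: the caller supplies the byte length (5 here).
            String viaCopy = readString(segment, 0, 5, StandardCharsets.UTF_8);

            // Null-terminated read: getString scans for the terminator first.
            String viaGetString = segment.getString(0, StandardCharsets.UTF_8);

            System.out.println(viaCopy.equals(viaGetString));
        }
    }
}
```

Both paths go through an intermediate byte[] before the String is constructed, which is the copy the question below is about.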
Has avoiding the byte[] copy in the String constructor been considered?
In theory it might be possible for the runtime to inline the String constructor, see that the byte[] doesn't escape, and skip the copy. In practice, based on some benchmarking my colleague Louis Wasserman was doing, that doesn't seem to be happening.
Would it make sense to consider using JavaLangAccess#uncheckedNewStringNoRepl here to avoid having to copy the byte[]?
Thanks,
Liam