RFR: 8362893: Improve performance for MemorySegment::getString

Philippe Marschall duke at openjdk.org
Sun Jul 27 12:40:03 UTC 2025


Use `JavaLangAccess::uncheckedNewStringNoRepl` in `MemorySegment::getString` to avoid byte[] allocation in the String constructor.

Fall back to the old code in the case of malformed input to get replacement characters as per Javadoc API specification. The existing tests in `TestStringEncoding` seem sufficient to me.
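The strict-then-fallback control flow described above can be sketched with public APIs only. This is an illustration, not the code in the pull request: `JavaLangAccess` is internal to `java.base`, so the sketch substitutes a `CharsetDecoder` in `REPORT` mode for the strict path, and the class and method names (`StrictThenLenientDecode`, `decodeUtf8`) are hypothetical. It shows only the error-handling shape, not the allocation savings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictThenLenientDecode {

    // Hypothetical helper: try a strict decode first, fall back to the
    // replacing String constructor only when the input is malformed.
    static String decodeUtf8(byte[] bytes) {
        try {
            // Strict path: REPORT makes the decoder throw instead of replacing.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Fallback path: the public constructor substitutes U+FFFD
            // for malformed sequences.
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        byte[] valid = {'o', 'k'};
        byte[] malformed = {'a', (byte) 0xFF, 'b'};
        System.out.println(decodeUtf8(valid));
        System.out.println(decodeUtf8(malformed).equals("a\uFFFDb"));
    }
}
```

Well-formed input never reaches the `catch` block, so the fallback only costs something in the malformed case.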

I ran the tier1 test suite and it passes.

For performance analysis I ran

    make test TEST="micro:org.openjdk.bench.java.lang.foreign.ToJavaStringTest" MICRO="OPTIONS=-prof gc"

on an AMD Ryzen 7 PRO 5750GE.

These are the formatted results. In each pair, the first line is the current master and the second line is this feature branch. The average time per operation improves, driven by a reduction in allocation.


Benchmark                                              (size)  Mode  Cnt     Score    Error   Units

ToJavaStringTest.panama_readString                          5  avgt   30    18.996 ±  0.044   ns/op
ToJavaStringTest.panama_readString                          5  avgt   30    13.851 ±  0.028   ns/op

ToJavaStringTest.panama_readString                         20  avgt   30    23.570 ±  0.050   ns/op
ToJavaStringTest.panama_readString                         20  avgt   30    18.401 ±  0.069   ns/op

ToJavaStringTest.panama_readString                        100  avgt   30    32.094 ±  0.207   ns/op
ToJavaStringTest.panama_readString                        100  avgt   30    24.427 ±  0.112   ns/op

ToJavaStringTest.panama_readString                        200  avgt   30    43.029 ±  0.185   ns/op
ToJavaStringTest.panama_readString                        200  avgt   30    31.914 ±  0.064   ns/op

ToJavaStringTest.panama_readString                        451  avgt   30    81.145 ±  0.403   ns/op
ToJavaStringTest.panama_readString                        451  avgt   30    58.975 ±  0.233   ns/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm       5  avgt   30    72.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm       5  avgt   30    48.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm      20  avgt   30   104.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm      20  avgt   30    64.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     100  avgt   30   264.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     100  avgt   30   144.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     200  avgt   30   456.001 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     200  avgt   30   240.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     451  avgt   30   968.001 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     451  avgt   30   496.001 ±  0.001    B/op
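To put the pairs above into relative terms, the following small sketch computes the percentage reduction per size. The numbers are copied from the table; the class and method names (`RelativeImprovement`, `reductionPercent`) are made up for this illustration.

```java
public class RelativeImprovement {

    // Percentage by which the patched value is lower than the baseline.
    static double reductionPercent(double baseline, double patched) {
        return (baseline - patched) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Columns: size, master ns/op, branch ns/op, master B/op, branch B/op.
        double[][] rows = {
            {5, 18.996, 13.851, 72.000, 48.000},
            {20, 23.570, 18.401, 104.000, 64.000},
            {100, 32.094, 24.427, 264.000, 144.000},
            {200, 43.029, 31.914, 456.001, 240.000},
            {451, 81.145, 58.975, 968.001, 496.001},
        };
        for (double[] r : rows) {
            System.out.printf("size %3.0f: time -%.1f%%, allocation -%.1f%%%n",
                    r[0],
                    reductionPercent(r[1], r[2]),
                    reductionPercent(r[3], r[4]));
        }
    }
}
```

With these inputs the time per operation drops by roughly 22% to 27%, and the allocation per operation by roughly 33% to 49%, with the allocation saving growing with string size.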


I looked into whether there are inlining issues with the current or proposed code. For this I ran


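import java.lang.foreign.MemorySegment;

public class GetString {
  public static void main(String[] args) {
    var segment = MemorySegment.ofArray(new byte[] {'c', 'o', 'f', 'f', 'e', 'e', ' ', 'b', 'a', 'b', 'e', 0});
    for (int i = 0; i < 200_000_000; i++) {
      var string = segment.getString(0);
      // Use the result so the JIT cannot discard the getString call.
      if (System.identityHashCode(string) == 1) {
        System.out.println("x");
      }
    }
  }
}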


with `-XX:+PrintCompilation -XX:+PrintInlining -XX:+UnlockDiagnosticVMOptions`.

For the current master the simplified output looks like this:


@ 75   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)   force inline by annotation   callee changed to  help.GetString::main (107 bytes)    -> TypeProfile (95646/95646 counts) = jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
  @ 5   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)   force inline by annotation
    @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
    @ 8   jdk.internal.foreign.StringSupport::read (67 bytes)   force inline by annotation
      @ 1   jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)   inline (hot)
      @ 4   java.lang.Enum::ordinal (5 bytes)   accessor
      @ 45   jdk.internal.foreign.StringSupport::readByte (41 bytes)   force inline by annotation
        @ 3   jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5 bytes)   accessor
        @ 6   jdk.internal.foreign.StringSupport::strlenByte (232 bytes)   force inline by annotation
          <snip>
        @ 27   java.lang.foreign.MemorySegment::copy (29 bytes)   force inline by annotation
          <snip>
        @ 37   java.lang.String::<init> (16 bytes)   inline (hot)
          @ 2   java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
          @ 12   java.lang.String::<init> (86 bytes)   inline (hot)
            @ 23   java.lang.String::utf8 (271 bytes)   inline (hot)
              @ 9   java.lang.StringCoding::countPositives (33 bytes)   (intrinsic)
              @ 27   java.util.Arrays::copyOfRange (82 bytes)   inline (hot)
                @ 11   java.lang.Object::clone (0 bytes)   failed to inline: native method   (intrinsic)
              @ 31   java.lang.String::<init> (15 bytes)   inline (hot)
                @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
            @ 82   java.lang.String::<init> (37 bytes)   inline (hot)
              @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)


As we can see, everything is inlined, but `String::utf8` explicitly asks for a copy of the `byte[]` (https://github.com/openjdk/jdk/blob/3263361a28c7e8c02734cb94bc9576e9f3ba5b50/src/java.base/share/classes/java/lang/String.java#L575), so this allocation cannot be optimized away.
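The reason the public constructor copies can be observed from outside: it cannot trust that the caller will leave the array alone afterwards. The sketch below uses only public API to show that the `String` is isolated from later mutation of the input; the class name `DefensiveCopyDemo` is made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

public class DefensiveCopyDemo {

    // Builds a String from a byte[] and then mutates the array.
    static String buildThenMutate() {
        byte[] bytes = {'j', 'a', 'v', 'a'};
        String s = new String(bytes, StandardCharsets.UTF_8);
        bytes[0] = 'l'; // must not be visible through s
        return s;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // prints "java"
    }
}
```

In `StringSupport::readByte` the intermediate `byte[]` is created locally and never shared, so this isolation is presumably redundant there, which is what makes a trusted internal entry point attractive.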

This is the compilation log with the proposed changes:


@ 75   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)   force inline by annotation   callee changed to  help.GetString::main (107 bytes)    -> TypeProfile (121249/121249 counts) = jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
  @ 5   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)   force inline by annotation
    @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
    @ 8   jdk.internal.foreign.StringSupport::read (67 bytes)   force inline by annotation
      @ 1   jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)   inline (hot)
      @ 4   java.lang.Enum::ordinal (5 bytes)   accessor
      @ 45   jdk.internal.foreign.StringSupport::readByte (55 bytes)   force inline by annotation
        @ 3   jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5 bytes)   accessor
        @ 6   jdk.internal.foreign.StringSupport::strlenByte (232 bytes)   force inline by annotation
          <snip>
        @ 27   java.lang.foreign.MemorySegment::copy (29 bytes)   force inline by annotation
          <snip>
        @ 36   java.lang.System$1::uncheckedNewStringNoRepl (6 bytes)   inline (hot)
          @ 2   java.lang.String::newStringNoRepl (33 bytes)   inline (hot)
            @ 2   java.lang.String::newStringNoRepl1 (284 bytes)   inline (hot)
              @ 22   java.lang.String::newStringUTF8NoRepl (322 bytes)   inline (hot)
                @ 4   java.lang.String::checkBoundsOffCount (10 bytes)   inline (hot)
                  @ 6   jdk.internal.util.Preconditions::checkFromIndexSize (25 bytes)   inline (hot)
                @ 24   java.lang.StringCoding::countPositives (33 bytes)   (intrinsic)
                @ 73   java.lang.String::<init> (15 bytes)   inline (hot)
                  @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)


Everything still inlines even though some methods are larger.

-------------

Commit messages:
 - 8362893: Improve performance for MemorySegment::getString

Changes: https://git.openjdk.org/jdk/pull/26493/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26493&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8362893
  Stats: 19 lines in 1 file changed: 16 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/26493.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26493/head:pull/26493

PR: https://git.openjdk.org/jdk/pull/26493

