[PATCH] Allocate uninitialized byte[] to improve performance of String-related operations

Paul Sandoz paul.sandoz at oracle.com
Wed Sep 9 23:22:34 UTC 2020


Hi Sergey,

Unsafe.allocateUninitializedArray should be used with great care, so as we are confident that no array, covering uninitialized memory, leaks out. Or later we unintentionally introduce a problem through refactoring. 

The update to AbstractStringBuilder is problematic because the uninitialized array could potentially be accessed through reflection.

In other cases it seems you have slotted in the allocation at a low-level e.g. StringUTF16.newBytesFor. That makes it hard for me to evaluate the implications, and it's not clear to me it offers advantages in all cases, especially given the risks. The toCharArray/Utf allocation costs appear to be lost in the noise, at least from your benchmark results.

The method repeat seems a good candidate to focus on given the allocation and looping is obviously localized.
However, in the repeat of 1 case I would like to know why C2 does not elide the zeroing of the array, since it should know that the fill covers the whole array. 
For the repeat > 2 we can excuse C2 for not seeing through to the ranges passed to System.arraycopy.

Separately, renaming StringConcatHelper.new{Char}Array to StringConcatHelper.newUninitialized{Char}Array would be clearer, and perhaps moving to somewhere else.

Paul.

> On Sep 4, 2020, at 5:52 AM, Сергей Цыпанов <sergei.tsypanov at yandex.ru> wrote:
> 
> Hello,
> 
> currently jdk.internal.misc.Unsafe declares method allocateUninitializedArray(Class, int) returning uninitialized array of given type and length.
> 
> Allocation of uninitialized arrays is faster especially for larger ones, so we could use them for cases
> when we are sure that created array will be completely overwritten or is guarded by count field (e.g. in AbstractStringBuilder).
> 
> I've exposed jdk.internal.misc.Unsafe.allocateUninitializedArray(Class, int)
> via delegating method of sun.misc.Unsafe to measure creation of byte[] with benchmark [1]
> and got those results:
> 
>                    (length)  Mode  Cnt      Score      Error   Units
> constructor                0  avgt   50      7.639 ±    0.629   ns/op
> constructor               10  avgt   50      9.448 ±    0.725   ns/op
> constructor              100  avgt   50     21.158 ±    1.865   ns/op
> constructor             1000  avgt   50    146.158 ±    9.836   ns/op
> constructor            10000  avgt   50    916.321 ±   50.618   ns/op
> 
> unsafe                     0  avgt   50      8.057 ±    0.599   ns/op
> unsafe                    10  avgt   50      8.308 ±    0.907   ns/op
> unsafe                   100  avgt   50     12.232 ±    1.813   ns/op
> unsafe                  1000  avgt   50     37.679 ±    1.382   ns/op
> unsafe                 10000  avgt   50     78.896 ±    6.758   ns/op
> 
> As a result I propose to add the following methods into StringConcatHelper
> 
> @ForceInline
> static byte[] newArray(int length) {
>    return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length);
> }
> 
> @ForceInline
> static char[] newCharArray(int length) {
>    return (char[]) UNSAFE.allocateUninitializedArray(char.class, length);
> }
> 
> along with existing StringConcatHelper.newArray(long indexCoder) and utilize them in String-related operations
> instead of conventional array creation with new-keyword.
> 
> I've used benchmark [2] to measure impact on affected String-methods and found quite a good improvement:
> 
>                                                 before             after
> 
> Benchmark                   (length)      Score   Error     Score   Error    Units
> 
> newStringBuilderWithLength         8      8.288 ± 0.411    5.656 ±  0.019    ns/op
> newStringBuilderWithLength        64     12.954 ± 0.687    7.588 ±  0.009    ns/op
> newStringBuilderWithLength       128     20.603 ± 0.412   10.446 ±  0.005    ns/op
> newStringBuilderWithLength      1024    119.935 ± 2.383   35.452 ±  0.029    ns/op
> 
> newStringBuilderWithString         8     19.721 ± 0.375   14.642 ±  0.052    ns/op
> newStringBuilderWithString        64     34.006 ± 1.523   15.479 ±  0.031    ns/op
> newStringBuilderWithString       128     36.697 ± 0.972   17.052 ±  0.133    ns/op
> newStringBuilderWithString      1024    140.486 ± 6.396   85.156 ±  0.175    ns/op
> 
> repeatOneByteString                8     11.340 ± 0.197    9.736 ±  0.051    ns/op
> repeatOneByteString               64     20.859 ± 0.257   15.073 ±  0.024    ns/op
> repeatOneByteString              128     36.311 ± 1.162   22.808 ±  0.198    ns/op
> repeatOneByteString             1024    149.243 ± 3.083   82.839 ±  0.193    ns/op
> 
> repeatOneChar                      8     28.033 ± 0.615   20.377 ±  0.137    ns/op
> repeatOneChar                     64     56.399 ± 1.094   36.230 ±  0.051    ns/op
> repeatOneChar                    128     68.423 ± 5.647   44.157 ±  0.239    ns/op
> repeatOneChar                   1024    230.115 ± 0.312  179.126 ±  0.437    ns/op
> 
> replace                            8     14.684 ± 0.088   14.434 ±  0.057    ns/op
> replace                           64     56.811 ± 0.612   56.420 ±  0.050    ns/op
> replace                          128    112.694 ± 0.404  109.799 ±  1.202    ns/op
> replace                         1024    837.939 ± 0.855  818.802 ±  0.408    ns/op
> 
> replaceUtf                         8     17.802 ± 0.074   15.744 ±  0.094    ns/op
> replaceUtf                        64     45.754 ± 0.139   39.228 ±  0.864    ns/op
> replaceUtf                       128     67.170 ± 0.353   50.497 ±  1.218    ns/op
> replaceUtf                      1024    415.767 ± 6.829  297.831 ± 22.510    ns/op
> 
> toCharArray                        8      6.164 ± 0.033    6.128 ±  0.064    ns/op
> toCharArray                       64     10.960 ± 0.032   13.566 ±  0.802    ns/op
> toCharArray                      128     20.373 ± 0.013   20.823 ±  0.376    ns/op
> toCharArray                     1024    165.923 ± 0.098  164.362 ±  0.065    ns/op
> 
> toCharArrayUTF                     8      8.009 ± 0.067    7.778 ±  0.026    ns/op
> toCharArrayUTF                    64     11.112 ± 0.014   10.880 ±  0.010    ns/op
> toCharArrayUTF                   128     20.390 ± 0.014   20.103 ±  0.017    ns/op
> toCharArrayUTF                  1024    166.233 ± 0.076  163.827 ±  0.099    ns/op
> 
> So the question is could we include the changes of attached patch into JDK?
> 
> With best regards,
> Sergey Tsypanov
> 
> 
> 1. Benchmark for array-allocation
> 
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class ArrayAllocationBenchmark {
>  private static Unsafe U;
> 
>  static {
>    try {
>      Field f = Unsafe.class.getDeclaredField("theUnsafe");
>      f.setAccessible(true);
>      U = (Unsafe) f.get(null);
>    } catch (Exception e) {
>      new RuntimeException(e);
>    }
>  }
> 
>  @Benchmark
>  public byte[] constructor(Data data) {
>    return new byte[data.length];
>  }
> 
>  @Benchmark
>  public byte[] unsafe(Data data) {
>    return (byte[]) U.allocateUninitializedArray(byte.class, data.length);
>  }
> 
>  @State(Scope.Thread)
>  public static class Data {
>    @Param({"0", "10", "100", "1000", "10000"})
>    private int length;
>  }
> }
> 
> 
> 2. Benchmark for String-method measurements
> 
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class MiscStringBenchmark {
> 
>  @Benchmark
>  public char[] toCharArrayLatin1(Data data) {
>    return data.string.toCharArray();
>  }
> 
>  @Benchmark
>  public char[] toCharArrayUTF(Data data) {
>    return data.utfString.toCharArray();
>  }
> 
>  @Benchmark
>  public String repeatOneByteString(Data data) {
>    return data.oneCharString.repeat(data.length);
>  }
> 
>  @Benchmark
>  public String repeatOneChar(Data data) {
>    return data.oneUtfCharString.repeat(data.length);
>  }
> 
>  @Benchmark
>  public String replace(Data data){
>    return data.stringToReplace.replace('z', 'b');
>  }
> 
>  @Benchmark
>  public String replaceUtf(Data data){
>    return data.utfStringToReplace.replace('я', 'ю');
>  }
> 
>  @Benchmark
>  public StringBuilder newStringBuilderWithLength(Data data) {
>    return new StringBuilder(data.length);
>  }
> 
>  @Benchmark
>  public StringBuilder newStringBuilderWithString(Data data) {
>    return new StringBuilder(data.string);
>  }
> 
>  @State(Scope.Thread)
>  public static class Data {
> 
>    @Param({"8", "64", "128", "1024"})
>    private int length;
> 
>    private String string;
>    public String utfString;
>    private final String oneCharString = "a";
>    private final String oneUtfCharString = "ё";
>    private String stringToReplace;
>    private String utfStringToReplace;
> 
>    @Setup
>    public void setup() {
>      string = oneCharString.repeat(length);
>      utfString = oneUtfCharString.repeat(length);
> 
>      stringToReplace = string + 'z';
>      utfStringToReplace = utfString + 'я';
>    }
>  }
> }
> <uninit_array.patch>



More information about the core-libs-dev mailing list