C2 does not elide the zeroing of the array in String.repeat()

Thu Sep 24 06:59:11 UTC 2020

Hi Sergey,

thanks for the report. Adding some background:

Array fills are supposed to be optimized by C2's -XX:+OptimizeFill, which was recently fixed by
JDK-8247307 [1]. As it turned out, however, that optimization does not generate code as efficient
as normal loop unrolling plus vectorization (Superword). It is therefore currently disabled by
default.
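
For illustration, the kind of loop that both optimizations target looks roughly like the sketch
below. The class and method names are made up for this example and are not JDK code;
Arrays.fill(byte[], byte) is in effect the library form of the same loop. Running such a program
with and without -XX:+OptimizeFill and comparing the generated code would be one way to see the
difference between the two strategies.

```java
import java.util.Arrays;

// Hypothetical example class, not part of the JDK.
public class FillLoopSketch {

    // A plain fill loop: every element receives the same value.
    // This is the loop shape that a fill optimization or loop
    // vectorization can pick up.
    static void fillManually(byte[] a, byte v) {
        for (int i = 0; i < a.length; i++) {
            a[i] = v;
        }
    }

    public static void main(String[] args) {
        byte[] manual = new byte[1024];
        byte[] library = new byte[1024];

        fillManually(manual, (byte) 'a');
        Arrays.fill(library, (byte) 'a');

        // Both arrays end up with identical contents.
        System.out.println(Arrays.equals(manual, library));
    }
}
```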

Unfortunately, neither Superword nor OptimizeFill currently supports eliminating the zeroing of a
newly allocated array. I've filed JDK-8253577 [2] to keep track of this, but it might take a while
until someone has time to fix it.
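
To make the missing case concrete, here is a minimal sketch of the pattern, modeled on the
single-byte branch of String.repeat() quoted below. The class name, the method name and the use of
ISO_8859_1 in place of the internal coder are assumptions made so that the example runs outside of
java.lang.String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the single-byte branch of String.repeat().
public class RepeatSketch {

    static String repeatSingleByte(byte value, int count) {
        // The VM zeroes the array as part of the allocation ...
        final byte[] single = new byte[count];
        // ... although every element is overwritten right here, before
        // the array is read or becomes visible to any other code.
        Arrays.fill(single, value);
        return new String(single, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(repeatSingleByte((byte) 'a', 8)); // prints "aaaaaaaa"
    }
}
```

Because the array does not escape between the allocation and the fill, the initial zeroing is
redundant work, and that is the work JDK-8253577 asks C2 to remove.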

Best regards,
Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8247307
[2] https://bugs.openjdk.java.net/browse/JDK-8253577

On 21.09.20 15:57, Сергей Цыпанов wrote:
> Hello,
> 
> according to https://shipilev.net/blog/2016/arrays-wisdom-ancients/ C2 can sometimes eliminate
> the zeroing of a newly allocated array (in particular in ArrayList.toArray(T[])).
> 
> However, in String.repeat() the VM does not elide the zeroing of the array, even when the
> repeated String is represented by a single byte:
> 
> if (len == 1) {
>    final byte[] single = new byte[count];
>    Arrays.fill(single, value[0]);
>    return new String(single, coder);
> }
> 
> Here we know that the array is local and will be completely filled, yet the zeroing is still present.
> 
> When I run the benchmark [1] with a freshly built JDK, it gives
> 
>                       (length)  Mode  Cnt    Score    Error  Units
> repeatOneByteString         8  avgt   50   14.020 ±  1.928  ns/op
> repeatOneByteString        64  avgt   50   24.618 ±  2.712  ns/op
> repeatOneByteString       128  avgt   50   36.555 ±  1.394  ns/op
> repeatOneByteString      1024  avgt   50  134.731 ±  7.022  ns/op
> 
> If I then replace, in String.repeat(),
> 
> final byte[] single = new byte[count];
> 
> with
> 
> final byte[] single = StringConcatHelper.newArray(count);
> 
> where StringConcatHelper.newArray(int) delegates directly to UNSAFE.allocateUninitializedArray(Class, int),
> the same benchmark shows a clear improvement:
> 
>                       (length)  Mode  Cnt    Score    Error  Units
> repeatOneByteString         8  avgt   50   12.545 ±  0.164  ns/op
> repeatOneByteString        64  avgt   50   18.393 ±  0.686  ns/op
> repeatOneByteString       128  avgt   50   25.550 ±  0.378  ns/op
> repeatOneByteString      1024  avgt   50   90.454 ±  1.015  ns/op
> 
> So the question is whether this is an issue in C2 and, if so, whether it is fixable.
> 
> The question originally came up in https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068641.html
> 
> Cheers,
> Sergey Tsypanov
> 
> [1]
> 
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> public class MiscStringBenchmark {
> 
>    @Benchmark
>    public String repeatOneByteString(Data data) {
>      return data.oneByteString.repeat(data.length);
>    }
> 
>    @State(Scope.Thread)
>    public static class Data {
>      @Param({"8", "64", "128", "1024"})
>      private int length;
>      private final String oneByteString = "a";
> 
>    }
> }
>