RFR(M): 8189113: AARCH64: StringLatin1 inflate intrinsic doesn't use prefetch instruction

Andrew Haley aph at redhat.com
Tue May 15 17:56:56 UTC 2018


Again, before, running in L1:

Benchmark                (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  32768       8  avgt   10   53.875 ± 0.088  ns/op
StrInflateBench.inflate  32768      32  avgt   10   58.149 ± 0.735  ns/op
StrInflateBench.inflate  32768     256  avgt   10  125.529 ± 0.353  ns/op

After:

Benchmark                (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  32768       8  avgt   10   50.541 ± 0.029  ns/op
StrInflateBench.inflate  32768      32  avgt   10   55.591 ± 0.393  ns/op
StrInflateBench.inflate  32768     256  avgt   10  108.823 ± 1.742  ns/op

Before, missing L1:

Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  1000000       8  avgt   10   57.685 ± 0.225  ns/op
StrInflateBench.inflate  1000000      32  avgt   10   90.418 ± 0.172  ns/op
StrInflateBench.inflate  1000000     256  avgt   10  293.611 ± 1.314  ns/op

After:

Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  1000000       8  avgt   10   54.611 ± 0.122  ns/op
StrInflateBench.inflate  1000000      32  avgt   10  103.166 ± 0.757  ns/op
StrInflateBench.inflate  1000000     256  avgt   10  237.011 ± 0.703  ns/op

I don't like one thing: the very high fixed overhead.  The fact that the
timing is never less than 50 ns, even when running inside L1, is not
pleasing.  None of this is your fault: it seems to come from all of the
messing about which happens before the intrinsic gets called.
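
As a rough illustration of the overhead point, the fixed cost can be
estimated by fitting a straight line through two of the in-L1 "After"
scores (sizes 8 and 256) and reading off the intercept at size 0.  This
is only a sketch: it assumes the time grows roughly linearly with size,
which the benchmark does not establish, and the class and method names
are hypothetical.

```java
import java.util.Locale;

public class OverheadEstimate {
    /** Slope of the line through (x1, t1) and (x2, t2): ns per unit of size. */
    public static double perElementNs(double x1, double t1, double x2, double t2) {
        return (t2 - t1) / (x2 - x1);
    }

    /** Intercept of that line at size 0: the estimated fixed cost in ns. */
    public static double fixedOverheadNs(double x1, double t1, double x2, double t2) {
        return t1 - x1 * perElementNs(x1, t1, x2, t2);
    }

    public static void main(String[] args) {
        // "After" scores from the in-L1 run: size 8 and size 256.
        // Linear growth between the two points is an assumption.
        double slope = perElementNs(8, 50.541, 256, 108.823);
        double fixed = fixedOverheadNs(8, 50.541, 256, 108.823);
        System.out.printf(Locale.ROOT,
                "per element: %.3f ns, fixed: %.1f ns%n", slope, fixed);
    }
}
```

Under that assumption the fit gives about 0.235 ns per unit of size and
about 48.7 ns of fixed cost, which is consistent with the scores never
dropping below 50 ns.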

This is OK.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

