RFR(M): 8189113: AARCH64: StringLatin1 inflate intrinsic doesn't use prefetch instruction
Andrew Haley
aph at redhat.com
Tue May 15 17:56:56 UTC 2018
Again, before, running in L1:
Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate    32768       8  avgt   10   53.875 ± 0.088  ns/op
StrInflateBench.inflate    32768      32  avgt   10   58.149 ± 0.735  ns/op
StrInflateBench.inflate    32768     256  avgt   10  125.529 ± 0.353  ns/op
After:
Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate    32768       8  avgt   10   50.541 ± 0.029  ns/op
StrInflateBench.inflate    32768      32  avgt   10   55.591 ± 0.393  ns/op
StrInflateBench.inflate    32768     256  avgt   10  108.823 ± 1.742  ns/op
Before, missing L1:
Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  1000000       8  avgt   10   57.685 ± 0.225  ns/op
StrInflateBench.inflate  1000000      32  avgt   10   90.418 ± 0.172  ns/op
StrInflateBench.inflate  1000000     256  avgt   10  293.611 ± 1.314  ns/op
After:
Benchmark                  (ALL)  (size)  Mode  Cnt    Score   Error  Units
StrInflateBench.inflate  1000000       8  avgt   10   54.611 ± 0.122  ns/op
StrInflateBench.inflate  1000000      32  avgt   10  103.166 ± 0.757  ns/op
StrInflateBench.inflate  1000000     256  avgt   10  237.011 ± 0.703  ns/op
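To make the four tables easier to compare, here is a small sketch that turns each before/after pair into a relative change. The class and method names are made up for illustration; the only inputs are the scores quoted above, and the error columns are ignored.

```java
import java.util.Locale;

// Hypothetical helper, not part of the patch or the benchmark.
class RelativeChange {
    // Percentage by which 'after' is faster than 'before'; negative means slower.
    static double percentFaster(double before, double after) {
        return (before - after) / before * 100.0;
    }

    // Print one line per row: label, size, signed percentage change.
    static void print(String label, double[][] rows) {
        for (double[] r : rows) {
            System.out.printf(Locale.ROOT, "%-10s size=%3d  %+6.1f%%%n",
                    label, (int) r[0], percentFaster(r[1], r[2]));
        }
    }

    public static void main(String[] args) {
        // Each row is {size, before ns/op, after ns/op}, copied from the tables above.
        double[][] inL1 = {
            {8, 53.875, 50.541}, {32, 58.149, 55.591}, {256, 125.529, 108.823}
        };
        double[][] missingL1 = {
            {8, 57.685, 54.611}, {32, 90.418, 103.166}, {256, 293.611, 237.011}
        };
        print("in L1", inL1);
        print("missing L1", missingL1);
    }
}
```

By this arithmetic the size-256 case is roughly 13% faster in L1 and roughly 19% faster when missing L1, while the size-32 case when missing L1 comes out roughly 14% slower.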
There is one thing I don't like: the very high overhead. The timing is
never less than 50 ns, even when running inside L1, which is not pleasing.
None of this is your fault: it seems to be all of the messing about
that happens before the intrinsic gets called.
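A rough way to put a number on that overhead is to draw a straight line through the size-8 and size-256 scores and extrapolate to size 0. This is only a sketch: the linear model is an assumption (the size-32 point does not sit exactly on the line), and the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: two-point linear extrapolation of the in-L1 scores.
class FixedOverheadEstimate {
    // Slope of the line through two (size, ns/op) points: cost per extra unit of size.
    static double perUnitCost(double size1, double t1, double size2, double t2) {
        return (t2 - t1) / (size2 - size1);
    }

    // Intercept of the same line: the extrapolated time at size 0.
    static double fixedCost(double size1, double t1, double size2, double t2) {
        return t1 - perUnitCost(size1, t1, size2, t2) * size1;
    }

    public static void main(String[] args) {
        // In-L1 scores for sizes 8 and 256, copied from the tables above.
        double before = fixedCost(8, 53.875, 256, 125.529);
        double after = fixedCost(8, 50.541, 256, 108.823);
        System.out.printf(Locale.ROOT, "before: about %.1f ns fixed%n", before);
        System.out.printf(Locale.ROOT, "after:  about %.1f ns fixed%n", after);
    }
}
```

Under that assumption the fixed part is in the region of 50 ns both before and after the change, so most of the size-8 time would not be spent on the data at all.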
This is OK.
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671