RFR: 8215017: Improve String::equals warmup characteristics

Claes Redestad claes.redestad at oracle.com
Mon Dec 10 17:36:05 UTC 2018


Hi,

Tobias weighed in on this in another thread[1], and while he thinks the
proposed patch is semantically correct, concerns were raised that maybe
the UTF16 intrinsics could be superior (on some platforms).

I ran the microbenchmark below as well as the existing
string-density-benchmark[2] suite, noting no statistically significant
differences in peak performance on x86_64 (Windows, Linux, Mac) and
SPARC T4 through M7. Warmup improvements are similar across all
platforms.

So from our point of view things look green.

However, I have no means of testing the intrinsics on other platforms
(S390, aarch64, ppc), so it'd be much appreciated if performance could
be verified on those platforms using the proposed patch and benchmark.

From inspecting the code, it seems the difference should be negligible
or even positive. E.g., on aarch64 there's a trailing comparison that is
elided when treating the byte[] as a char[]; that overhead is possibly
offset entirely by removing an extra branch before going into the
intrinsics.
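
To illustrate why always comparing the backing byte[] byte by byte should
be semantically neutral, here is a minimal sketch. It is not the JDK code:
the class EqualsSketch and the helpers bytewiseEquals, charwiseEquals and
utf16 are hypothetical names made up for this example, and UTF-16BE
encoding is only an assumed stand-in for String's internal UTF16 layout.

```java
import java.nio.charset.StandardCharsets;

public class EqualsSketch {
    // Hypothetical stand-in for a Latin1-style comparison:
    // length check, then compare one byte at a time.
    static boolean bytewiseEquals(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical stand-in for a UTF16-style comparison:
    // reassemble each char from two bytes, then compare chars.
    static boolean charwiseEquals(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i + 1 < a.length; i += 2) {
            char ca = (char) (((a[i] & 0xff) << 8) | (a[i + 1] & 0xff));
            char cb = (char) (((b[i] & 0xff) << 8) | (b[i + 1] & 0xff));
            if (ca != cb) {
                return false;
            }
        }
        return true;
    }

    // Assumption: big-endian UTF-16 without a BOM, two bytes per char.
    static byte[] utf16(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String[] inputs = { "", "abc", "abd", "h\u00e9llo", "\u4e16\u754c", "\u4e16\u754d" };
        for (String x : inputs) {
            for (String y : inputs) {
                boolean byBytes = bytewiseEquals(utf16(x), utf16(y));
                boolean byChars = charwiseEquals(utf16(x), utf16(y));
                if (byBytes != byChars || byBytes != x.equals(y)) {
                    throw new AssertionError("mismatch for " + x + " / " + y);
                }
            }
        }
        System.out.println("byte-wise and char-wise comparison agree");
    }
}
```

Once both arrays use the same encoding, two chars are equal exactly when
both of their bytes are equal, so the two loops can only differ in speed,
not in result. Whether the byte-wise intrinsic is as fast as the char-wise
one is a per-platform question this sketch says nothing about.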

Thanks!

/Claes

[1] 
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-December/057240.html
[2] http://cr.openjdk.java.net/~shade/density/string-density-bench.zip 
(had to remove hg maven plugin, reference to sun.misc.Version and update 
JMH version for this to build and run on latest JDK)

On 2018-12-08 01:11, Claes Redestad wrote:
> Hi,
> 
> following up from discussions during review of JDK-8214971[1], I
> examined the startup and peak performance of a few different variants of
> writing String::equals.
> 
> Webrev: http://cr.openjdk.java.net/~redestad/8215017/jdk.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8215017
> 
> - folding coder() == aString.coder() into sameCoder(aString) helps the
> interpreter without adversely affecting higher optimization levels
> 
> - Jim's proposal to use Arrays.equals is _interesting_: it improves
> peak performance on some inputs but regresses it on others. I'll defer
> that to a future RFE as it needs a more thorough examination.
> 
> - what we can do is simplify to only use StringLatin1.equals. If I'm not
> completely mistaken these are all semantically neutral (and
> StringUTF16.equals effectively redundant). If I'm completely wrong here
> I'll welcome the learning opportunity. :-)
> 
> This removes a branch and two method calls, and for UTF16 Strings we'll
> use a simpler algorithm early, which turns out to be beneficial at the
> interpreter and C1 levels.
> 
> I added a simple microbenchmark to explore this; results show 1.2-2.5x
> improvements in interpreter performance, while results for optimized
> code remain perfectly neutral on this simple micro[2].
> 
> This could be extended to clean up and move StringLatin1.equals back
> into String and remove StringUTF16, but we'd also need to rearrange the
> intrinsics on the VM side. Let me know what you think.
> 
> Thanks!
> 
> /Claes
> 
> [1] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-December/057162.html 
> 
> 
> [2]
> ========== Baseline =================
> 
> -Xint
> Benchmark                            Mode  Cnt     Score    Error  Units
> StringEquals.equalsAlmostEqual       avgt    4   968.640 ±  1.337  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  2082.007 ±  5.303  ns/op
> StringEquals.equalsDifferent         avgt    4   583.166 ± 29.461  ns/op
> StringEquals.equalsDifferentCoders   avgt    4   422.993 ±  1.291  ns/op
> StringEquals.equalsEqual             avgt    4   988.671 ±  1.492  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  2103.060 ±  5.705  ns/op
> 
> -XX:+CompactStrings
> Benchmark                            Mode  Cnt   Score   Error  Units
> StringEquals.equalsAlmostEqual       avgt    4  23.896 ± 0.089  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  23.935 ± 0.562  ns/op
> StringEquals.equalsDifferent         avgt    4  15.086 ± 0.044  ns/op
> StringEquals.equalsDifferentCoders   avgt    4  12.572 ± 0.008  ns/op
> StringEquals.equalsEqual             avgt    4  25.143 ± 0.025  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  25.148 ± 0.021  ns/op
> 
> -XX:-CompactStrings
> Benchmark                            Mode  Cnt   Score   Error  Units
> StringEquals.equalsAlmostEqual       avgt    4  24.539 ± 0.127  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  22.638 ± 0.047  ns/op
> StringEquals.equalsDifferent         avgt    4  13.930 ± 0.835  ns/op
> StringEquals.equalsDifferentCoders   avgt    4  13.836 ± 0.025  ns/op
> StringEquals.equalsEqual             avgt    4  26.420 ± 0.020  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  23.889 ± 0.037  ns/op
> 
> ========== Fix ======================
> 
> -Xint
> Benchmark                            Mode  Cnt    Score     Error  Units
> StringEquals.equalsAlmostEqual       avgt    4  811.859 ±   8.663  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  802.784 ± 352.884  ns/op
> StringEquals.equalsDifferent         avgt    4  431.837 ±   1.884  ns/op
> StringEquals.equalsDifferentCoders   avgt    4  358.244 ±   1.208  ns/op
> StringEquals.equalsEqual             avgt    4  832.056 ±   3.541  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  832.434 ±   3.516  ns/op
> 
> -XX:+CompactStrings
> Benchmark                            Mode  Cnt   Score   Error  Units
> StringEquals.equalsAlmostEqual       avgt    4  23.906 ± 0.151  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  23.905 ± 0.123  ns/op
> StringEquals.equalsDifferent         avgt    4  15.088 ± 0.023  ns/op
> StringEquals.equalsDifferentCoders   avgt    4  12.575 ± 0.030  ns/op
> StringEquals.equalsEqual             avgt    4  25.149 ± 0.059  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  25.149 ± 0.033  ns/op
> 
> -XX:-CompactStrings
> Benchmark                            Mode  Cnt   Score   Error  Units
> StringEquals.equalsAlmostEqual       avgt    4  24.521 ± 0.050  ns/op
> StringEquals.equalsAlmostEqualUTF16  avgt    4  22.639 ± 0.035  ns/op
> StringEquals.equalsDifferent         avgt    4  13.831 ± 0.020  ns/op
> StringEquals.equalsDifferentCoders   avgt    4  13.884 ± 0.345  ns/op
> StringEquals.equalsEqual             avgt    4  26.395 ± 0.066  ns/op
> StringEquals.equalsEqualsUTF16       avgt    4  23.904 ± 0.112  ns/op

