RFR: 8321468: Remove StringUTF16::equals
Claes Redestad
redestad at openjdk.org
Thu Dec 7 10:59:42 UTC 2023
On Wed, 6 Dec 2023 14:20:14 GMT, Claes Redestad <redestad at openjdk.org> wrote:
> https://bugs.openjdk.org/browse/JDK-8215017 removed the only use of `StringUTF16::equals`. At the time I did some performance verification focused on x86 showing that simplifying and only using `StringLatin1::equals` was either neutral or a win.
>
> I repeated this experiment recently, adding some focused tests on aarch64 where the code generation actually tries to take advantage and generate slightly more efficient code for `StringUTF16::equals`:
> https://github.com/openjdk/jdk/pull/16933#discussion_r1414118658
>
> The indication here is that disabling use of `StringUTF16::equals` was the right choice: any effect from the low-level optimization (one less branch at the tail end) was offset by the `isLatin1()` branch and added code generation (that all gets inlined).
>
> In a `-XX:-CompactStrings` configuration the slightly improved code generation in `StringUTF16::equals` might help, since the `isLatin1()` test and subsequent call to `StringLatin1::equals` would be DCEd. To get the best of both worlds the code in `String::equals` _could_ be sharpened so that we statically pick the best implementation based on `CompactStrings` mode (see comment below). This shows a tiny win (up to -0.2ns/op per `String::equals` on M1; netural on x86). But is all this complexity worth it for a gain that will get lost in the noise on anything realistic?
>
> This PR instead proposes removing `StringUTF16::equals` and simplifying the mechanisms to support the `StringLatin1/UTF16::equals` pair of intrinsics in hotspot.
For reference these are the microbenchmarks used in the JDK-8215017 verification experiment:
diff --git a/test/micro/org/openjdk/bench/java/lang/StringEquals.java b/test/micro/org/openjdk/bench/java/lang/StringEquals.java
index b0db6a7037e..effe855c228 100644
--- a/test/micro/org/openjdk/bench/java/lang/StringEquals.java
+++ b/test/micro/org/openjdk/bench/java/lang/StringEquals.java
@@ -43,6 +43,9 @@ public class StringEquals {
public String test5 = new String(test4); // equal to test4, but not same
public String test6 = new String("0123456780");
public String test7 = new String("0123\u01FE");
+ public String test8 = new String("12\u01FE");
+ public String test9 = new String("12\u01FF");
+ public String test10 = new String("12\u01FE");
@Benchmark
public boolean different() {
@@ -73,5 +76,15 @@ public boolean differentCoders() {
public boolean equalsUTF16() {
return test5.equals(test4);
}
+
+ @Benchmark
+ public boolean equalsUTF16_3_NE() {
+ return test8.equals(test9);
+ }
+
+ @Benchmark
+ public boolean equalsUTF16_3_EQ() {
+ return test8.equals(test10);
+ }
}
And this is the change to `String` to get it back to pre-JDK-8215017 state:
diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java
index 5869e086191..18ad0e85d33 100644
--- a/src/java.base/share/classes/java/lang/String.java
+++ b/src/java.base/share/classes/java/lang/String.java
@@ -1900,9 +1900,13 @@ public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
- return (anObject instanceof String aString)
- && (!COMPACT_STRINGS || this.coder == aString.coder)
- && StringLatin1.equals(value, aString.value);
+ if (anObject instanceof String aString) {
+ if (coder() == aString.coder()) {
+ return isLatin1() ? StringLatin1.equals(value, aString.value)
+ : StringUTF16.equals(value, aString.value);
+ }
+ }
+ return false;
}
/**
M1 experiment with `-XX:-CompactStrings` and a patch to choose the `StringUTF16::equals` if `COMPACT_STRINGS` is false[1]:
Name Cnt Base Error Test Error Unit Change
StringEquals.equalsUTF16_3_EQ 5 1,799 ± 0,008 1,585 ± 0,009 ns/op 1,14x (p = 0,000*)
StringEquals.equalsUTF16_3_NE 5 1,622 ± 0,007 1,541 ± 0,011 ns/op 1,05x (p = 0,000*)
* = significant
[1]
diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java
index 5869e086191..10451bda83f 100644
--- a/src/java.base/share/classes/java/lang/String.java
+++ b/src/java.base/share/classes/java/lang/String.java
@@ -1900,9 +1900,14 @@ public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
- return (anObject instanceof String aString)
- && (!COMPACT_STRINGS || this.coder == aString.coder)
- && StringLatin1.equals(value, aString.value);
+ if (anObject instanceof String aString) {
+ if (COMPACT_STRINGS) {
+ return this.coder == aString.coder && StringLatin1.equals(value, aString.value);
+ } else {
+ return StringUTF16.equals(value, aString.value);
+ }
+ }
+ return false;
}
/**
As expected this shows a tiny but measurable win on non-x86 platforms for `-CompactString` when retaining `StringUTF16::equals` and selecting it statically. For `+CompactStrings` this is performance neutral with the baseline.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845096784
PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845110919
More information about the core-libs-dev
mailing list