Missing trivial optimization in String equals
Martin Grajcar
maaartinus at gmail.com
Wed Jan 24 01:48:49 UTC 2018
Hi!
On Tue, Jan 23, 2018 at 1:13 AM, Claes Redestad <claes.redestad at oracle.com>
wrote:
> Hi,
>
> do you have an example application that is spending significant time
> comparing the contents of such identical arrays and would benefit?
no, it was just a random idea.
I would
> guess such comparisons are about twice as fast as typical ones due to
> having to load half as much from memory. ;-)
>
Actually, no data has to be loaded if the arrays are the same object.
It's simple O(1) test, which may or may not save an O(n) test.
> If there's a valid optimization case, though, then it would only be
> realizable when UseStringDeduplication is enabled. Thus it's crucial not
> to regress all the other cases that won't benefit (Parallel, Serial, ...),
> and as String::equals is often a very hot method then even a few added
> instructions that only compare data we've already loaded into registers
> might be too much. Or maybe not.
>
Sure, without deduplication switched on, the test is completely pointless.
Otherwise, the more deduplication gets done, the more useful is may get.
As String::equals is routinely replaced by a C2 intrinsic[1] that's
> where we should look to optimize (not so much in the JDK library code).
>
Thank you, I knew, it happens, but didn't know where.
However, it's the @HotSpotIntrinsicCandidate methods in StringLatin1
and StringUTF16, what gets intrinsified,
but I'd suggest to put the test just before the coder comparison at line
http://hg.openjdk.java.net/jdk10/jdk10/jdk/file/777356696811/src/java.base/
share/classes/java/lang/String.java#l1020
in order to maximize the possible gain.
There's no reason to test the coder when the arrays happen to be the same
object
as this is only possible due to the deduplication, which tests it already.
> I think it'd be straightforward to train that intrinsic to only emit the
> test when the related feature is enabled. Maybe the kind folks over at
> hotspot-compiler-dev at openjdk.java.net might be able to give some pointers.
> If so I'd invite you to set up a set of experiments and see if this can
> have a benefit.
>
Because of the intrinsic being tied to the concrete methods rather than to
the code inside,
I'd have to modify and compile the JVM sources, which needs way more time
than I can spend on it in the moment. :(
A benchmark written using a copy of the method would be too unrealistic,
as the copied comparison uses no intrinsic and is much slower (
https://stackoverflow.com/q/15586997/581205).
Anyway, we know, that an arbitrary high speed up is possible as the O(n)
comparison gets eliminated.
A benchmark showing this can surely be constructed,
but what has to be measured is the tiny slowdown in cases when it doesn't
help
and that needs to re-run existing benchmarks.
Actually, I'd propose something more fancy (added two more independent
trivial optimization ideas):
public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
if (anObject instanceof String) {
String aString = (String)anObject;
// IDEA 1 - as discussed above
// USE_STRING_DEDUPLICATION is a constant to be eliminated by
JIT
// just like it happens with COMPACT_STRINGS [2].
if (USE_STRING_DEDUPLICATION && value == aString.value) {
// We know that the coders are the same here.
return true;
}
// IDEA 2 - add a trivial test possibly avoiding the loop
// Testing the length first because of it being more probable
to differ.
if (value.length == aString.value.length && coder() ==
aString.coder()) {
// IDEA 3 - remove a needless test of coder()
// Not testing the coder as all we need is byte array
content equality.
// StringLatin1.equals is applicable for UTF-8, too.
return StringLatin1.equals(value, aString.value);
}
}
return false;
}
Regards,
Martin.
[2]:
http://hg.openjdk.java.net/jdk10/jdk10/jdk/file/777356696811/src/java.base/share/classes/java/lang/String.java#l197
Regards
>
> /Claes
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/e1876e6b57b6/src/hot
> spot/share/opto/library_call.cpp#l1116
>
>
>
> On 2018-01-23 00:14, Martin Grajcar wrote:
>
>> It looks like when testing equality of Strings, there's no test if the two
>> arrays are actually the same object. This was impossible before Java 8u20
>> (as no two strings used to share arrays), but it's well possible with
>> -XX:+UseStringDeduplication.
>>
>> Maybe someone here cares...
>>
>> Regards,
>> Martin.
>>
>> http://hg.openjdk.java.net/jdk10/jdk10/jdk/file/777356696811
>> /src/java.base/
>> share/classes/java/lang/String.java#l1014
>> http://hg.openjdk.java.net/jdk10/jdk10/jdk/file/777356696811
>> /src/java.base/
>> share/classes/java/lang/StringLatin1.java#l89
>> http://hg.openjdk.java.net/jdk10/jdk10/jdk/file/777356696811
>> /src/java.base/
>> share/classes/java/lang/StringUTF16.java#l258
>>
>
>
More information about the jdk-dev
mailing list