<i18n dev> RFR: 8311188: Simplify and modernize equals and hashCode in java.text [v4]
Pavel Rappo
prappo at openjdk.org
Thu Jul 6 16:19:55 UTC 2023
On Thu, 6 Jul 2023 14:46:59 GMT, Pavel Rappo <prappo at openjdk.org> wrote:
>> I'd suggest replacing the calls to `valuesMatch` with `Objects.equals` and remove the `valuesMatch` method as unnecessary.
>
>> I'd suggest replacing the calls to `valuesMatch` with `Objects.equals` and remove the `valuesMatch` method as unnecessary.
>
> I'll do something about them soon, Roger. But first I need to understand JDK-8015417 better, as it also affects other similar PRs already in review or in the works.
> I wrote a little case study on `Objects::equals` that talks about how it should optimize, when it does, why it doesn’t, and how (maybe) to fix that.
>
> https://cr.openjdk.org/~jrose/jvm/equals-profile.html https://cr.openjdk.org/~jrose/jvm/equals-profile.md
>
> This is also attached to the JBS bug.
>
> The work on [JDK-8026251](https://bugs.openjdk.org/browse/JDK-8026251) with the `TypeProfileLevel` switch bring us closer to correctly optimizing `Objects::equals` in more cases. Sadly, JDK refactoring by itself will not get all the way to where we want to go. The JVM’s profiling logic needs tweaking.
This PR has turned into an impromptu discussion on JDK-8015417. I think it's reasonable to continue it here, rather than to fork a separate discussion, because of (i) the relevance, (ii) context, and (iii) the fact that I have already pointed someone to this thread elsewhere.
Below is my mental model which I built based on your excellent write-up, John; is this model correct?
1. Inside a static method of a utility class, such as `java.util.Objects`, `java.util.Arrays`, and `jdk.internal.util.ArraysSupport`, that have one or more parameters of `java.lang.Object` or arrays thereof, "observation of a static type" fails because `java.lang.Object` is the most inexact type possible.
2. Generics that might be available in the context of a call site of the said method do not _currently_ contribute to "observation of a static type" inside the said method:
T obj = ...
return Objects.hashCode(obj);
It would still be true inside the method, if it used someone else's or its own generic parameters, as for example, `java.util.Objects.compare` does:
public static <T> int compare(T a, T b, Comparator<? super T> c) {
return (a == b) ? 0 : c.compare(a, b);
}
In the sense of being opposite to "specific", generics are as "generic" as `java.lang.Object` and are not reifiable even for optimization purposes.
3. The above two points conclude the "observation of a static type" part of the issue. Our only hope is now (dynamic) profiling followed by successful type speculation.
4. Hand-inlining the said method is necessary but not sufficient to win the optimization. One way to avoid the need for hand-inlining is to annotate that method with `@ForceInline`.
5. Profiling/speculation leads to desired devirtualization of the call to an instance method, if the reference through which the call is made refers to an instance of an exact type (i.e. a `final` class or a primitive[^*]). Note: the reference itself does not need to be of an exact type. For example, this will do:
Object s = "myString";
...
java.util.Objects.hashCode(s);
4. Inside a general-purpose method, such as `Objects.hashCode(Object)`, profiling is noisy, and thus is unlikely to pan out a successful speculation. Outside such a general-purpose method, at its call site, profiling could be less noisy.
[^*]: It cannot be primitive in this case, because we are only concerned with methods that accept `java.lang.Object`.
----
Let's further test that model by checking its predictions.
1. Consider mainline `java.util.AbstractMap.SimpleEntry`:
public boolean equals(Object o) {
return o instanceof Map.Entry<?, ?> e
&& eq(key, e.getKey())
&& eq(value, e.getValue());
}
/**
* Utility method for SimpleEntry and SimpleImmutableEntry.
* Test for equality, checking for nulls.
*
* NB: Do not replace with Object.equals until JDK-8015417 is resolved.
*/
private static boolean eq(Object o1, Object o2) {
return o1 == null ? o2 == null : o1.equals(o2);
}
Since everything here is either of type `java.lang.Object` or a wildcard, no information is contributed to "observation of a static type". That `eq` method is likely is as bad as `Objects.equals`, assuming our runtime operates on many maps of different parameterizations leading to a noisy profile. Which means that in this case, we can reasonably refactor that `equals` as follows:
public boolean equals(Object o) {
return o instanceof Map.Entry<?, ?> e
&& Objects.equals(key, e.getKey())
&& Objects.equals(value, e.getValue());
}
Similar reasoning could be applied to that same class' `hashCode`, to refactor it from this:
public int hashCode() {
return (key == null ? 0 : key.hashCode()) ^
(value == null ? 0 : value.hashCode());
}
to this:
public int hashCode() {
return (Objects.hashCode(key)) ^
(Objects.hashCode(value));
}
Unlike the previous example, however, where we trade one inlining for another, here we might lose the first layer of inlining. But assuming that `Objects.hashCode` is relatively small and additionally might get annotated with `@ForceInline` as a result of this discussion later on, it will be inlined.
2. Consider an opposite example:
public class ECFieldF2m implements ECField {
...
private BigInteger rp;
...
public int hashCode() {
int value = m << 5;
value += (rp==null? 0:rp.hashCode());
// no need to involve ks here since ks and rp
// should be equivalent.
return value;
}
}
Would it be reasonable to change `hashCode` like this?
public int hashCode() {
int value = m << 5;
value += Objects.hashCode(rp);
// no need to involve ks here since ks and rp
// should be equivalent.
return value;
}
Firstly, we lose inlining. Secondly, the fact that `rp` is `BigInteger` is no longer immediately available to the actual hash code computation. Assuming that `java.util.Objects.hashCode` will be inlined at runtime, and given enough invocations, it will be profiled and subjected to class hierarchy analysis (CHA), which in the absence of `BigInteger.hashCode` overrides will hopefully determine that the call could be further inlined.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14752#discussion_r1254659913
More information about the i18n-dev
mailing list