Is there a possibility of the string equality operator (==) being fixed?

Thu Oct 26 08:19:57 UTC 2023

Hello, folks:

I've noticed that string comparison in Java can be confusing because, in
certain contexts, `==` on Strings behaves the same way as `==` on the
wrapper classes of the primitive data types, as shown below:
```java
    // Assumes `import static java.lang.System.out;`
    // wrapper class of int
    // Note: the constructor Integer(int) has been deprecated since Java 9
    Integer int1 = new Integer(1);
    Integer int2 = new Integer(1);
    out.println(int1 == int2); // false
    out.println(int1.equals(int2)); // true

    String string1 = new String("1");
    String string2 = new String("1");
    out.println(string1 == string2); // false
    out.println(string1.equals(string2)); // true
```

After switching to literal initialization and applying `+`, the Integer
results match what a newcomer would expect (at least for small values such
as these), but the String results do not:
```java
    Integer int1 = 1;
    Integer int2 = 1;
    Integer int3 = int1 + 1;
    Integer int4 = int2 + 1;
    out.println(int3 == int4); // true
    out.println(int3.equals(int4)); // true

    String string1 = "1";
    String string2 = "1";
    String string3 = string1 + "1";
    String string4 = string2 + "1";
    // Expected result is `true`, but the actual result is `false`.
    out.println(string3 == string4); // false
    out.println(string3.equals(string4)); // true
```
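
A minimal sketch of why the two halves of the example above differ. Autoboxing
goes through `Integer.valueOf`, which is required to cache values from -128 to
127, so `==` on small boxed values compares the same cached object. String
concatenation of non-constant operands creates a new object each time, so `==`
compares two different references unless the results are interned. The class
and method names here are hypothetical, chosen only for illustration.

```java
class BoxedAndStringIdentity {
    // Autoboxing calls Integer.valueOf(value); values in [-128, 127] are
    // guaranteed to come from a shared cache.
    static boolean sameBoxedReference(int value) {
        Integer a = value;
        Integer b = value;
        return a == b; // reference comparison
    }

    // Concatenating non-constant operands creates a new String each time.
    static boolean sameConcatReference(String left, String right) {
        String a = left + right;
        String b = left + right;
        return a == b; // reference comparison
    }

    // intern() returns the canonical instance for a given content.
    static boolean sameInternedReference(String left, String right) {
        String a = (left + right).intern();
        String b = (left + right).intern();
        return a == b; // reference comparison
    }

    public static void main(String[] args) {
        System.out.println(sameBoxedReference(2));            // true
        // 1000 is outside the guaranteed cache range, so the result of
        // sameBoxedReference(1000) is not specified by the language.
        System.out.println(sameConcatReference("1", "1"));    // false
        System.out.println(sameInternedReference("1", "1"));  // true
    }
}
```

In other words, the Integer case only looks consistent because the values are
small; `==` is a reference comparison for both wrapper objects and Strings.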

This is not actually a bug, though.
As its capitalized name suggests, String is a class, not a primitive data
type.
Because it's a class, using `Object::equals()` to compare its contents is
perfectly normal.

However, from a user's perspective, Strings are among the most commonly used
types, yet the logic for comparing them differs from that of the primitive
data types.
This quietly steepens the learning curve for students, newcomers,
and developers transitioning from other programming languages.
It should be more user-friendly, especially for those groups.

If there were a way to unify the comparison of strings with the comparison
of primitive data types,
it would help newcomers use it correctly and enter the Java world more
rapidly.
It would also lead to cleaner code and less typing for all developers.

I believe there should be a way to improve this inconsistency.
Personally, ever since I learned about `String::equals()`, I stopped using
`==`,
which led me to propose two solutions.

Solution A: Make `String1 == String2` have the same result as
`String1.equals(String2)`.

I find Solution A to be simpler, but an abrupt change might break certain
systems or software.
How much breakage would changing the result of `String1 == String2` cause?
If reference comparison of Strings is rarely or never relied upon, we could
safely change it, but we need to measure the cost of this disruption.
The challenge is how to measure it.
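
To make the risk concrete, here is a hedged sketch of the kind of code that
relies on reference comparison today. The class, the `UNSET` marker, and the
`describe` method are hypothetical and not taken from any real library. Under
Solution A, the `==` check would start matching every empty string, not only
the marker object.

```java
class IdentitySentinel {
    // A distinct String object used as a "no value" marker. Its content is
    // empty, but it is a different object from the "" literal.
    static final String UNSET = new String("");

    static String describe(String value) {
        // Reference comparison on purpose: only the marker object matches.
        if (value == UNSET) {
            return "unset";
        }
        return value.isEmpty() ? "empty" : "text";
    }

    public static void main(String[] args) {
        System.out.println(describe(UNSET)); // unset
        System.out.println(describe(""));    // empty
        System.out.println(describe("abc")); // text
    }
}
```

If `==` compared contents, `describe("")` would return "unset" instead of
"empty", which is the sort of silent behavior change that would need to be
measured.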

Among all the APIs distributed in the JDK and third-party packages, some are
commonly used and some are not. We, as individual developers, have no access
to this usage data.
We can only voice our opinions on mailing lists, Reddit, or GitHub and gauge
the response from upvotes or downvotes.
Is there something like "telemetry"?
Could we possibly collect usage statistics for packages, classes, methods,
and syntax during compilation and report them after compilation?
Or could we send surveys to Java developers and authors of major
packages/libraries?

The fear that "I think it's rarely used, but in reality, it's not, and
everything breaks" holds people back.
Additionally, another reason for hesitation is that there isn't a feature
currently in place that scans and safely converts old projects to use
updated syntax/APIs during JDK upgrades.

So, what about Solution B?

Solution B: Introduce a new class named `string/str` and deprecate the
String class.

We could create a new class, named `string` or `str` (following the
convention that primitive data type names start with a lowercase letter).
Comparing two values of this class with `==` would behave as it does for
primitive data types, which means the result would be the same as that of
`String::equals()`.
Since the class mimics a primitive data type, we would remove
`string/str::equals()`.

Then, we could mark the existing String class as "deprecated" and instruct
users to use `string/str` instead.

Unfortunately, `string/str` is not a reserved word, and we don't know where
or how often it has already been used as an identifier.
Fortunately, there is a precedent in the underscore (`_`): compile-time
warnings and a long transition period steered users away from `_` as an
identifier, which later allowed "JEP 443: Unnamed Patterns and Variables
(Preview)" to give it a new meaning.
I believe the same approach could work for `string/str`.

I personally prefer Solution B because the safest approach would be to
prioritize creating the 'new' to replace the 'old' and then, after a period,
confirm that the usage of the 'old' has fallen below a certain threshold
before removing it.

However, as noted under Solution A, I currently have no good data to support
either modifying or removing the 'old' behavior.
Similarly, I have no good data to show whether creating the 'new' is
necessary.

I agree that "backward compatibility" is a cornerstone of the growth of
Java's ecosystem, and of any language's.
Thanks to this persistence, we can confidently open projects from many
years ago in front of our bosses without worrying about them breaking.
Golang has also maintained its ecosystem thanks to "backward compatibility",
although it retains the right to break it under certain circumstances
(https://go.dev/doc/go1compat).

However, because Java has a long history, to remain competitive, it will
make various changes to its specifications.
In my opinion, "backward compatibility" at such times should not mean
maintaining an unchanged status quo but giving developers ample time to
adapt to changes, at least when change is possible.

Changing the syntax of `string comparison` would, I believe, make Java more
user-friendly, change the perception of Java as a language with a lot of
ancient baggage,
and reduce the cognitive burden on developers.
But do others share this belief, or is it just me?

Finally, as David reminded me, Brian mentioned that there's a possibility in
the future that we can use .equals() for state comparison on any primitive
data type (since everything can be treated as an Object).

My perspective is to propose a hypothesis: since String/string and the
primitive data types are all commonly used, can we make them behave like a
single kind of primitive data type in order to align their behavior?
Another reason is that I personally find a == b clearer than a.equals(b)
and less prone to typos (although that is not a problem with the assistance
of modern IDEs).

From a visual perspective, Andrew also brought up an interesting point:
using String1 .= String2 as syntactic sugar for .equals() is much clearer
(or maybe we could use String1 === String2?).

/* GET BETTER EVERY DAY */