Is there a possibility of the string equality operator (==) being fixed?

Thu Oct 26 19:06:51 UTC 2023

Making == work like equals() for String has implications for concurrent
algorithms.  When writing a concurrent algorithm, I often need identity
comparison (==) and not equals().  If == became equals() for String, I
would need a new operator that does what == does today, and we would be
back to having something that works like == but isn't ==.  In other words,
"fixing" == means adding a new operator, and Java becomes more convoluted.
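
A minimal sketch of the kind of identity dependence meant here, assuming a
compare-and-set on an AtomicReference<String>.  The class and method names
(IdentityCasDemo, tryClaim) are made up for illustration; compareAndSet is
the real JDK method, and it is documented to compare with ==, not equals():

```java
import java.util.concurrent.atomic.AtomicReference;

class IdentityCasDemo {
    // Hypothetical helper: tries to replace `expected` with "next".
    // compareAndSet succeeds only if the slot holds the very same
    // object as `expected` (reference identity), not an equal one.
    static boolean tryClaim(AtomicReference<String> slot, String expected) {
        return slot.compareAndSet(expected, "next");
    }

    public static void main(String[] args) {
        String current = new String("token");   // one object
        String lookalike = new String("token"); // equal contents, different object
        AtomicReference<String> slot = new AtomicReference<>(current);

        System.out.println(current.equals(lookalike)); // true
        System.out.println(tryClaim(slot, lookalike)); // false: identity differs
        System.out.println(tryClaim(slot, current));   // true: same reference
    }
}
```

If == on String meant equals(), code written this way would still need some
other way to ask "is this the same object?", which is the new operator
mentioned above.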

On Thu, Oct 26, 2023, 11:27 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> Part of what you are saying is that "if we had a time machine, we should
> have reconsidered whether `String` should have been a primitive in the
> language."  And I agree, String is pretty special, that might have been a
> worthwhile conversation to have (in fact, I'm pretty sure it was had.)  But
> you're leaping from there to "of course it should have been done that way",
> and then from there to "the only reason we are not fixing it now is a
> misguided, slavish adherence to compatibility."  Both of these leaps are
> wrong.
>
> Let me give you an idea of the cost of what you are suggesting.  Your
> "solution" B wants to create a new type, `string`, that is better than the
> old String.  Let's take for sake of argument that it really will be
> better.  That's the benefit.  But let's look at the costs:
>
>  - Every single Java API ever written uses String as an argument type, a
> return type, a field type, a type parameter, etc.  So under this plan, now
> 100% of the Java code out there instantly becomes "the old code", and
> either needs to migrate, or clients will stumble over converting between
> `String` and `string`.  This is a tax that will literally hit almost every
> line of Java code ever written.
>
>  - Even if a migration like this could be pulled off, how long do you
> think it will take to get to a world where we don't have both "old string"
> and "new string" simultaneously?  Now users will have to learn *both ways*
> and keep track of their subtle differences.
>
> Having seen many, many proposals for improving the language, I can say
> with confidence that the most dangerous source of such proposals is the
> desire to "fix" "mistakes".  The cost of "fixing" those "mistakes" often
> exceeds the benefit by orders of magnitude.
>
> "Solution" A is problematic in a different way.  You are arguing that
> String is so special that it alone should have different == semantics than
> every other class.  Now, I agree that String is pretty special, but such
> "solutions" have a cost too; they add ad-hoc, arbitrary complexity to the
> language, which every user has to learn.  Now users have to learn `==`
> means one thing for primitives, another for object references, but a third
> weird new thing for strings.  Which further makes it more likely that they
> will misuse `==` on other object references, because that's how String
> works!  (And do you believe for a minute that if we did this to String,
> there wouldn't be calls to do the same for, say, LocalDateTime?)  While the
> situation we have is not perfect, it at least has simple, stable,
> principled, easy-to-understand rules.  Trading that for complex, ad-hoc,
> ever-changing, hard-to-keep-track-of rules would require a benefit many
> orders of magnitude bigger.
>
>
>
>
> On 10/26/2023 4:19 AM, tzengshinfu wrote:
>
> Hello, folks:
>
> I've noticed that string comparison in Java can be confusing because, in
> certain contexts,
> it gives the same results as comparing the wrapper classes of Primitive
> Data Types, as shown below:
> ```java
>     // wrapper class of int
>     // PS:The constructor Integer(int) is deprecated since version 9
>     Integer int1 = new Integer(1);
>     Integer int2 = new Integer(1);
>     out.println(int1 == int2); // false
>     out.println(int1.equals(int2)); // true
>
>     String string1 = new String("1");
>     String string2 = new String("1");
>     out.println(string1 == string2); // false
>     out.println(string1.equals(string2)); // true
> ```
>
> After modifying the initialization and performing `+` operations,
> the Integer results match expectations, but the String results are
> unexpected:
> ```java
>     Integer int1 = 1;
>     Integer int2 = 1;
>     Integer int3 = int1 + 1;
>     Integer int4 = int2 + 1;
>     out.println(int3 == int4); // true, but only because boxed values in -128..127 are cached
>     out.println(int3.equals(int4)); // true
>
>     String string1 = "1";
>     String string2 = "1";
>     String string3 = string1 + "1";
>     String string4 = string2 + "1";
>     out.println(string3 == string4); // false (expected `true`): runtime concatenation creates a new String
>     out.println(string3.equals(string4)); // true
> ```
>
> But it's not actually a mistake.
> Based on the naming convention, String is indeed a class, not a Primitive
> Data Type.
> Because it's a class, using `Object::equals()` to compare its contents is
> perfectly normal.
>
> However, from a user's perspective, String is one of the most commonly
> used types,
> yet the logic for comparing Strings differs from that for Primitive Data
> Types.
> This quietly steepens the learning curve for students, newcomers,
> and developers transitioning from other programming languages.
> It should be more user-friendly, especially for those groups.
>
> If there were a way to unify the comparison of strings with the comparison
> of Primitive Data Types,
> it would help newcomers use it correctly and enter the Java world more
> rapidly.
> It would also lead to cleaner code and reduced typing for all developers.
>
> I believe there should be a way to improve this inconsistency.
> Personally, ever since I learned about `String::equals()`, I stopped using
> `==`,
> which led me to propose two solutions.
>
> Solution A: Make `String1 == String2` have the same result as
> `String1.equals(String2)`.
>
> I find Solution A to be simpler, but making abrupt changes might cause
> certain systems or software to break.
> How many disruptions would changing the result of `String1 == String2`
> cause?
> If it's rarely used or never used, we could safely remove it, but we need
> to measure the cost of this disruption.
> The challenge is how to measure it.
>
> For all the APIs distributed in the JDK and third-party packages, there
> are common APIs,
> but there are also less commonly used ones. We, as individual developers,
> can't know this data.
> We can only voice our opinions on mailing lists, Reddit, or GitHub and
> gauge whether the response is positive or negative.
> Is there something like "telemetry"?
> Could we possibly collect statistics on various
> package/class/method/syntax during compilation and return those statistics
> after compilation?
> Or could we send surveys to Java developers and authors of major
> packages/libraries?
>
> The fear that "I think it's rarely used, but in reality, it's not, and
> everything breaks" holds people back.
> Additionally, another reason for hesitation is that there isn't a feature
> currently in place that scans and safely converts old projects to use
> updated syntax/APIs during JDK upgrades.
>
> So, what about Solution B?
>
> Solution B: Introduce a new class named `string/str` and deprecate the
> String class.
>
> We could create a new class, named `string` or `str` (following the
> convention of starting with a lowercase letter for Primitive Data Types).
> This class's `string comparison` behavior would be the same as Primitive
> Data Types, which means the result would be the same as `String::equals()`.
> Since it mimics a Primitive Data Type class, we would remove
> `string/str::equals()`.
>
> Then, we could mark the existing String class as "deprecated" and instruct
> users to use `string/str` instead.
>
> Unfortunately, `string/str` is not a reserved word, and we don't even know
> where it might have been used as a variable name or how many times.
> Fortunately, we have a precedent: "JEP 443: Unnamed Patterns and Variables
> (Preview)" used a lengthy preview period and compile-time warnings to
> encourage users to stop using a to-be-reserved word ( _ ) as an identifier.
> I believe the same approach could work for `string/str`.
>
> I personally prefer Solution B because the safest approach would be to
> prioritize creating the 'new' to replace the 'old' and then, after a period,
> confirm that the usage of the 'old' has fallen below a certain threshold
> before removing it.
>
> However, as mentioned under Solution A, the current situation is that I
> don't have good data to support either modifying or removing the 'old'
> behavior.
> Similarly, I don't have good data to support whether creating the 'new' is
> necessary.
>
> I agree that "backward compatibility" is a cornerstone of the growth of
> Java's ecosystem, and of any language's.
> Thanks to this persistence, we can confidently open projects from many
> years ago in front of our bosses without worrying about them breaking.
> Golang has also maintained its ecosystem thanks to "backward
> compatibility",
> although it retains the right to break it under certain circumstances (
> https://go.dev/doc/go1compat).
>
> However, because Java has a long history, to remain competitive, it will
> make various changes to its specifications.
> In my opinion, "backward compatibility" at such times should not mean
> maintaining an unchanged status quo but giving developers ample time to
> adapt to changes, at least when change is possible.
>
> Changing the syntax of `string comparison` would, I believe, make Java
> more user-friendly, change the perception of Java as a language with a lot
> of ancient baggage,
> and reduce the cognitive burden on developers.
> But do others share this belief, or is it just me?
>
> Finally, thanks to David's reminder: Brian mentioned that there's a
> possibility in the future that we can use .equals() for state comparison on
> any Primitive Data Types (since everything can be treated as an Object). My
> perspective is to propose a hypothesis: since String/string and other
> Primitive Data Types are commonly used, can we simulate them to behave like
> a single Primitive Data Type in order to coordinate their behavior? Another
> reason is that I personally find that a == b is clearer than a.equals(b)
> and less prone to typos (although it's not a problem with the assistance of
> modern IDEs). From a visual perspective, Andrew also brought up an
> interesting point, using String1 .= String2 as syntactic sugar for
> .equals() is much clearer (or maybe we can use String1 === String2?).
>
>
> /* GET BETTER EVERY DAY */
>
>
>