Is there a possibility of the string equality operator (==) being fixed?

Brian Goetz brian.goetz at oracle.com
Thu Oct 26 15:52:39 UTC 2023


Part of what you are saying is that "if we had a time machine, we should 
have reconsidered whether `String` should have been a primitive in the 
language." And I agree: String is pretty special, and that might have
been a worthwhile conversation to have (in fact, I'm pretty sure it was
had).
But you're leaping from there to "of course it should have been done 
that way", and then from there to "the only reason we are not fixing it 
now is a misguided, slavish adherence to compatibility."  Both of these 
leaps are wrong.

Let me give you an idea of the cost of what you are suggesting. Your
"solution" B wants to create a new type, `string`, that is better than
the old String.  Let's take it for the sake of argument that it really
will be better.  That's the benefit.  But let's look at the costs:

  - Every single Java API ever written uses String as an argument type, 
a return type, a field type, a type parameter, etc.  So under this plan, 
now 100% of the Java code out there instantly becomes "the old code", 
and either needs to migrate, or clients will stumble over converting 
between `String` and `string`. This is a tax that will literally hit 
almost every line of Java code ever written.

  - Even if a migration like this could be pulled off, how long do you 
think it will take to get to a world where we don't have both "old 
string" and "new string" simultaneously?  Now users will have to learn 
*both ways* and keep track of their subtle differences.
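
A minimal sketch of that conversion tax, under stated assumptions: `Str`
below is a hypothetical stand-in for the proposed `string` type (Java has
no such type), and `splitWords` stands for any existing API written
against `String`.  None of these names come from the proposal.

```java
import java.util.ArrayList;
import java.util.List;

public class MigrationTax {
    // Hypothetical "new string" type; assumed here to be a simple wrapper.
    record Str(String value) {
        static Str of(String s) { return new Str(s); }
        String toOld() { return value; }
    }

    // An "old" API: takes and returns java.lang.String.
    static List<String> splitWords(String text) {
        return List.of(text.trim().split("\\s+"));
    }

    // A "new" client that holds Str values must convert at every boundary:
    // Str -> String on the way in, String -> Str on the way out.
    static List<Str> splitWordsNew(Str text) {
        List<Str> result = new ArrayList<>();
        for (String word : splitWords(text.toOld())) {
            result.add(Str.of(word));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitWordsNew(Str.of("hello big world")).size()); // 3
    }
}
```

Every call across the old/new boundary pays for a conversion in each
direction, and that pattern would repeat throughout existing code.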

Having seen many, many proposals for improving the language, I can say 
with confidence that the most dangerous source of such proposals is the 
desire to "fix" "mistakes".  The cost of "fixing" those "mistakes" often 
exceeds the benefit by orders of magnitude.

"Solution" A is problematic in a different way.  You are arguing that 
String is so special that it alone should have different == semantics 
than every other class.  Now, I agree that String is pretty special, but 
such "solutions" have a cost too; they add ad-hoc, arbitrary complexity 
to the language, which every user has to learn.  Now users have to learn
that `==` means one thing for primitives, another for object references,
and a third weird new thing for strings.  That in turn makes it more
likely that they will misuse `==` on other object references, because that's
how String works!  (And do you believe for a minute that if we did this 
to String, there wouldn't be calls to do the same for, say, 
LocalDateTime?)  While the situation we have is not perfect, it at least 
has simple, stable, principled, easy-to-understand rules.  Trading that 
for complex, ad-hoc, ever-changing, hard-to-keep-track-of rules would 
require a benefit many orders of magnitude bigger.
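
For reference, the rules as they stand today fit in a few lines.  This is
only an illustrative sketch; the class and method names (`EqualityRules`,
`sameObject`, `concat`) are made up for the example.  `==` compares values
for primitives and identity for every object reference, String included.
The cases where `==` on an Integer or a String happens to print `true`
come from caching and interning, not from a special rule for those types.

```java
public class EqualityRules {
    // For any two references, == asks "same object?", never "same contents?".
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    // Concatenating non-constant operands creates a new String at run time.
    static String concat(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        // Primitives: == compares values.
        int a = 1000, b = 1000;
        System.out.println(a == b);                 // true

        // Boxed values from -128 to 127 are cached by Integer.valueOf,
        // so both references point at one shared object.
        Integer i1 = Integer.valueOf(2), i2 = Integer.valueOf(2);
        System.out.println(sameObject(i1, i2));     // true

        // String literals are interned: equal literals share one object.
        String lit1 = "1", lit2 = "1";
        System.out.println(sameObject(lit1, lit2)); // true

        // Equal contents, but two distinct objects.
        String s1 = concat(lit1, "1"), s2 = concat(lit2, "1");
        System.out.println(sameObject(s1, s2));     // false
        System.out.println(s1.equals(s2));          // true

        // intern() maps equal contents back to one canonical object.
        System.out.println(sameObject(s1.intern(), s2.intern())); // true
    }
}
```

This also accounts for the results in the quoted message below: `int3 ==
int4` is `true` because both boxes hold the cached value 2, while
`string3 == string4` is `false` because each concatenation produces a
fresh String.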




On 10/26/2023 4:19 AM, tzengshinfu wrote:
> Hello, folks:
>
> I've noticed that `string comparison` in Java can be confusing 
> because, in certain contexts,
> their results are actually the same as `the wrapper classes of 
> Primitive Data Types`, as shown below:
> ```java
>     // assumes: import static java.lang.System.out;
>     // wrapper class of int
>     // Note: the constructor Integer(int) has been deprecated since Java 9
>     Integer int1 = new Integer(1);
>     Integer int2 = new Integer(1);
>     out.println(int1 == int2); // false
>     out.println(int1.equals(int2)); // true
>
>     String string1 = new String("1");
>     String string2 = new String("1");
>     out.println(string1 == string2); // false
>     out.println(string1.equals(string2)); // true
> ```
>
> After modifying the initialization and performing `+` operations,
> the Integer results match expectations, but the String results are 
> unexpected:
> ```java
>     Integer int1 = 1;
>     Integer int2 = 1;
>     Integer int3 = int1 + 1;
>     Integer int4 = int2 + 1;
>     // true only because autoboxed values from -128 to 127 are cached
>     out.println(int3 == int4); // true
>     out.println(int3.equals(int4)); // true
>
>     String string1 = "1";
>     String string2 = "1";
>     String string3 = string1 + "1";
>     String string4 = string2 + "1";
>     // Expected result is `true`, but the actual result is `false`.
>     out.println(string3 == string4);
>     out.println(string3.equals(string4)); // true
> ```
>
> But it's not actually a mistake.
> Based on the naming convention, String is indeed a class, not a 
> Primitive Data Type.
> Because it's a class, using `Object::equals()` to compare its contents 
> is perfectly normal.
>
> However, from a user's perspective, as one of the commonly used 
> functionalities,
> the logic for comparing Strings is different from other Primitive Data 
> Types.
> This invisibly increases the learning curve for students, newcomers,
> and developers transitioning from other programming languages.
> It should be more user-friendly, especially for the aforementioned 
> members.
>
> If there were a way to unify the comparison of strings with the 
> comparison of other Primitive Data Types,
> it would help newcomers use it correctly and enter the Java world more 
> rapidly.
> It would also lead to cleaner code and reduced typing for all developers.
>
> I believe there should be a way to improve this inconsistency.
> Personally, ever since I learned about `String::equals()`, I stopped 
> using `==`,
> which led me to propose two solutions.
>
> Solution A: Make `String1 == String2` have the same result as 
> `String1.equals(String2)`.
>
> I find Solution A to be simpler, but making abrupt changes might cause 
> certain systems or software to break.
> How many disruptions would changing the result of `String1 == String2` 
> cause?
> If it's rarely used or never used, we could safely remove it, but we 
> need to measure the cost of this disruption.
> The challenge is how to measure it.
>
> For all the APIs distributed in the JDK and third-party packages, 
> there are common APIs,
> but there are also less commonly used ones. We, as individual 
> developers, can't know this data.
> We can only voice our opinions on mailing lists, Reddit, or GitHub and 
> gauge whether the response is positive or negative.
> Is there something like "telemetry"?
> Could we possibly collect statistics on various 
> package/class/method/syntax during compilation and return those 
> statistics after compilation?
> Or could we send surveys to Java developers and authors of major 
> packages/libraries?
>
> The fear that "I think it's rarely used, but in reality, it's not, and 
> everything breaks" holds people back.
> Additionally, another reason for hesitation is that there isn't a 
> feature currently in place that scans and safely converts old projects 
> to use updated syntax/APIs during JDK upgrades.
>
> So, what about Solution B?
>
> Solution B: Introduce a new class named `string/str` and deprecate the 
> String class.
>
> We could create a new class, named `string` or `str` (following the 
> convention of starting with a lowercase letter for Primitive Data Types).
> This class's `string comparison` behavior would be the same as 
> Primitive Data Types, which means the result would be the same as 
> `String::equals()`.
> Since it mimics a Primitive Data Type class, we would remove 
> `string/str::equals()`.
>
> Then, we could mark the existing String class as "deprecated" and 
> instruct users to use `string/str` instead.
>
> Unfortunately, `string/str` is not a reserved word, and we don't even 
> know where it might have been used as a variable name or how many times.
> Fortunately, we have a precedent: "JEP 443: Unnamed Patterns and 
> Variables (Preview)" relied on a lengthy transition period and 
> compile-time warnings to move users away from using the underscore 
> ( _ ) as an identifier.
> I believe the same approach could work for `string/str`.
>
> I personally prefer Solution B because the safest approach would be to 
> prioritize creating the 'new' to replace the 'old' and then, after a 
> period,
> confirm that the usage of the 'old' has fallen below a certain 
> threshold before removing it.
>
> However, as mentioned under Solution A, the current situation is that I 
> don't have good data to support either modifying or removing the 'old' 
> behavior.
> Similarly, I don't have good data to support whether creating the 
> 'new' is necessary.
>
> I agree that "backward compatibility" is a cornerstone of the growth of 
> Java's ecosystem, and of any language's.
> Thanks to this persistence, we can confidently open projects from many 
> years ago in front of our bosses without worrying about them breaking.
> Golang has also maintained its ecosystem thanks to "backward 
> compatibility",
> although it retains the right to break it under certain circumstances 
> (https://go.dev/doc/go1compat).
>
> However, because Java has a long history, to remain competitive, it 
> will make various changes to its specifications.
> In my opinion, "backward compatibility" at such times should not mean 
> maintaining an unchanged status quo but giving developers ample time 
> to adapt to changes, at least when change is possible.
>
> Changing the syntax of `string comparison` would, I believe, make Java 
> more user-friendly, change the perception of Java as a language with a 
> lot of ancient baggage,
> and reduce the cognitive burden on developers.
> But do others share this belief, or is it just me?
>
> Finally, thanks to David's reminder: Brian mentioned that "there's a 
> possibility in the future that we can use .equals() for state 
> comparison on any Primitive Data Types (since everything can be 
> treated as an Object)." My perspective is to propose a hypothesis: 
> since String/string and other Primitive Data Types are commonly used, 
> can we simulate them to behave like a single Primitive Data Type in 
> order to coordinate their behavior? Another reason is that I 
> personally find that a == b is clearer than a.equals(b) and less prone 
> to typos (although it's not a problem with the assistance of modern 
> IDEs). From a visual perspective, Andrew also brought up an 
> interesting point, using String1 .= String2 as syntactic sugar for 
> .equals() is much clearer (or maybe we can use String1 === String2?).
>
>
> /* GET BETTER EVERY DAY */


More information about the amber-dev mailing list