Is there a possibility of the string equality operator (==) being fixed?
Andrew Myers
acm22 at cornell.edu
Tue Oct 24 02:43:51 UTC 2023
Pedagogically, it would be nice to have some sugar for .equals() that
makes it remotely close in visual appeal to ==. Ideally it would be =
but that is already ruined. My best alternative suggestion would be
e1 .= e2
Cheers,
-- Andrew
On 10/23/23 10:37 AM, Brian Goetz wrote:
> One of the pleasant side-effects of Project Valhalla is that (a) we'll
> be able to say .equals() on primitives, and (b) the cost of doing so
> will JIT down to that of ==. Which means that we can tell people
> "just use .equals() everywhere" (except when implementing low-level
> code like IdentityHashMap) and they will never have to wonder which to
> use. I realize this doesn't solve the "wrong op got the good name"
> problem, but it gives us a path to not having to think about it so often.
>
> It may be possible to migrate String to a value class at some point in
> the distant future (though this has a considerable shroud of
> uncertainty surrounding it), which makes this problem recede farther
> into the background for the particular case of String. Which might be
> enough to make this significantly less of a problem, for the reasons
> you outline here.
>
>
>
> On 10/23/2023 5:32 AM, Andrew Dinn wrote:
>> Hi Brian,
>>
>> I think there is also another subtle confusion lying behind this
>> question. Strings are in the unusual position that they can be named
>> via program literal text e.g. The 7 character sequence "hello" in a
>> program body is a literal reference to an instance of java.lang.String.
>>
>> That is not the case for any other class of object bar one, the
>> exception being instances of java.lang.Class e.g. the 22 character
>> sequence java.lang.String.class in a program body serves as a literal
>> reference to an instance of java.lang.Class.
>>
>> It is easy for a novice programmer to draw the conclusion that this
>> literal reference must exist 1-1 with regard to its corresponding
>> literal i.e. that there will only ever be one String whose ordered
>> sequence of characters will be 'h', 'e', 'l', 'l' and 'o'. The fact
>> that
>>
>> new String("hello") == "hello"
>>
>> will evaluate to false is not immediately evident to beginner
>> programmers.
>>
>> This misconception is helped along by the fact that the JVM ensures that
>> all occurrences of a String literal in disparate class files do end
>> up referring to the same String instance. If method m of class C
>> passes the literal String "foo" to method m2 of class C2 and the
>> latter compares its input to the literal String "foo" using an
>> equality comparison then the result will be true.
>>
>> class C
>> {
>>     . . .
>>     void m() {
>>         C2.m2("foo");
>>     }
>> }
>>
>> class C2
>> {
>>     . . .
>>     static void m2(String s) {
>>         if (s == "foo") {
>>             System.out.println("identity equal");
>>         }
>>     }
>> }
>>
>> The message printout will always be triggered. i.e. Strings mentioned
>> as program literals in source code and thereby introduced as String
>> constants in bytecode are *deduplicated* to the same String instance
>> when the bytecode is loaded by the JVM.
>>
>> That is why it takes some work to arrive at a case like the first
>> code snippet above where two Strings can have equal state but not
>> equal identity. At least one of the String instances has to be
>> explicitly created at runtime via new, substring() or some other
>> method that synthesises a String object.
>>
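The cases described above can be sketched in a few lines. This is only an illustration using the standard String API; the class name StringIdentityDemo and the helper sameObject are made up for the example.

```java
// Hypothetical demo class; the names are illustrative only.
class StringIdentityDemo {
    // Identity comparison: true only if both references point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";
        String copy = new String("hello");   // explicitly allocates a new String
        String built = "hel".concat("lo");   // synthesised at runtime

        System.out.println(sameObject(literal, "hello"));        // true: literals are deduplicated
        System.out.println(sameObject(copy, literal));           // false: equal state, distinct object
        System.out.println(copy.equals(literal));                // true: state comparison
        System.out.println(sameObject(built, literal));          // false: created at runtime
        System.out.println(sameObject(built.intern(), literal)); // true: intern() returns the shared instance
    }
}
```

The last line shows that String.intern() maps a runtime-created String back onto the instance used for the literal.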
>> It is interesting to compare this situation with that for class
>> literals where deduplication is either not required or would be
>> incorrect. It requires a much greater feat of ingenuity (equally,
>> carelessness or recklessness), involving the use of
>> application-defined class loaders, to arrive at a situation where the
>> program literal org.my.Foo.class occurring in a method m of class C
>> can identify a different instance of java.lang.Class to the same
>> program literal occurring in a method m2 of class C2. Yet it is
>> possible:
>>
>> class C
>> {
>>     . . .
>>     void m() {
>>         C2.m2(org.my.Foo.class);
>>     }
>> }
>>
>> class C2
>> {
>>     . . .
>>     static void m2(Class<?> c) {
>>         if (c != org.my.Foo.class) {
>>             // Yes, if you misbehave you can end up here!
>>             System.out.println("you are in classloader hell!");
>>         }
>>     }
>> }
>>
>> I guess I could offer instructions as to how to arrive at the
>> situation where m2 prints out its warning message but I'll leave that
>> as an exercise for the expert (or unwary) reader.
>>
>> regards,
>>
>>
>> Andrew Dinn
>> -----------
>>
>> On 22/10/2023 22:29, Brian Goetz wrote:
>>> First of all, the question is framed in a way that assumes its own
>>> conclusion; that somehow there is something "broken" to be "fixed".
>>> The == operator on object references asks a simple, well-defined,
>>> fundamental question: do these two object references _refer to the
>>> same object_. There is a similar, related question of "do these two
>>> objects _encode the same domain value_" (which is inherently
>>> class-specific), and that goes by the name of the "equals" method.
>>> These are two different questions, and it is important to be able to
>>> ask each. One does not replace the other.
>>>
>>> The presumption that something is "broken" comes from the subjective
>>> perception that the "less important" operation got the "better"
>>> name. Indeed, without a clear understanding of what these two
>>> questions are, it is easy to make mistakes. The comparison to C#
>>> illustrates that other languages could make other choices, which
>>> might result in a different category of mistakes that users might or
>>> might not make.
>>>
>>> While the answer you got said "backward compatibility", this is a
>>> too-simplistic (though often repeated) answer; the answer really is
>>> "because this exactly is how the language was designed to work",
>>> which means this is not something to be "fixed". If we agreed that
>>> this original intention was wrong-headed, then the issue of
>>> compatibility would come in -- that there are billions of lines of
>>> code that have been written in Java, and turning Java into Java++,
>>> whether "better" or not, would break many of them. (Sometimes
>>> languages do make incompatible changes because something is so
>>> language do make incompatible changes because something is so
>>> egregiously broken that it is better to break half the world's code
>>> than continue living with it, but the bar for this is extremely
>>> high, and "I wish the other operation got the good name" doesn't
>>> come near it.)
>>>
>>> But the eye-rolling of "how much are we going to sacrifice at the
>>> altar of backward compatibility" is misplaced. The == operator on
>>> object references still has a clearly defined meaning, and it is the
>>> intended meaning. It may be unfortunate that the "good" name was
>>> taken by the "less common" operation, but programming languages are
>>> full of such things, and one can easily identify such things in each
>>> of the other 19 languages you list. Ultimately, when there are two
>>> ways to do something (such as identity comparison and state
>>> comparison), someone has to choose which one gets which name, and
>>> sometimes someone doesn't agree with that choice.
>>>
>>> In the future, when Project Valhalla delivers value types, which are
>>> classes whose instances have no object identity, the == operator
>>> will compare these objects by their state, not their identity (since
>>> they have none.) But even this would not obviate the need for
>>> Object::equals, since there are many classes that are suitable to be
>>> value types (such as Rational) where multiple distinct
>>> representations (e.g., 1/2 and 2/4) are mathematically equal. So
>>> even there, we need different ways to spell "same object" and
>>> "equivalent value".
>>>
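The Rational point can be sketched with today's Java. Value classes are not available yet, so this uses an ordinary record as a stand-in. The method sameState is a hypothetical name for the field-by-field comparison that the message says == would perform on a value class. The sketch ignores arithmetic overflow.

```java
import java.util.Objects;

// Hypothetical Rational, written as a record because value classes are not yet available.
record Rational(long num, long den) {
    Rational {
        if (den == 0) {
            throw new IllegalArgumentException("zero denominator");
        }
    }

    // Stand-in for a state-based ==: compares the representation field by field.
    boolean sameState(Rational other) {
        return num == other.num && den == other.den;
    }

    // Domain equivalence: a/b equals c/d when a*d == c*b (overflow ignored in this sketch).
    @Override
    public boolean equals(Object o) {
        return o instanceof Rational r && num * r.den == r.num * den;
    }

    // Hash the reduced form so that equal values get equal hash codes.
    @Override
    public int hashCode() {
        long g = gcd(Math.abs(num), Math.abs(den));
        long sign = den < 0 ? -1 : 1;
        return Objects.hash(sign * num / g, sign * den / g);
    }

    private static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}
```

With this definition, new Rational(1, 2) and new Rational(2, 4) differ under sameState but are equal under equals, which is the distinction between "same representation" and "equivalent value".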
>>> In the farther future, if Java ever has operator overloading, one
>>> might be able to overload `==`, but being able to do that brings its
>>> own set of problems and confusions.
>>>
>>> Which is to say, there really are two questions here, "same object"
>>> and "domain equivalence", and you need ways to ask both.
>>>
>>>
>>>
>>>
>>> On 10/22/2023 3:29 PM, David Alayachew wrote:
>>>> Hello,
>>>>
>>>> Thank you for reaching out!
>>>>
>>>> I'm pretty sure that the amber-dev mailing list is not the correct
>>>> place for this type of question. This topic is usually discussed on the
>>>> following mailing list instead. I've CC'd it for you. I would also
>>>> encourage you to remove amber-dev from your CC when responding to
>>>> me, or anyone else on this thread.
>>>>
>>>> discuss at openjdk.org
>>>>
>>>> To answer your question, this is a very common request, and the
>>>> biggest answer is definitely still the backwards compatibility
>>>> problem. But tbh, the question I have for you is this -- is it such
>>>> a big cost to call the o1.equals(o2) method instead of using ==?
>>>> And if you want to handle nulls too, you can import
>>>> java.util.Objects (that class is full of useful static utility
>>>> methods) and then just say Objects.equals(o1, o2) instead. I am
>>>> pretty sure that that exact method was created in response to your
>>>> exact question.
>>>>
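As a small illustration of the null handling mentioned above: java.util.Objects.equals is the real JDK method, while the class name NullSafeEquals and the helper sameValue are made up for this sketch.

```java
import java.util.Objects;

// Hypothetical demo class; only Objects.equals comes from the JDK.
class NullSafeEquals {
    // Null-safe state comparison: true if both are null, or a.equals(b).
    static boolean sameValue(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String missing = null;
        String name = new String("duke");

        System.out.println(sameValue(name, "duke"));    // true
        System.out.println(sameValue(missing, "duke")); // false, and no exception
        System.out.println(sameValue(missing, null));   // true

        try {
            System.out.println(missing.equals("duke")); // calling equals on null throws
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```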
>>>> I understand it might be inconvenient, but making a change like you
>>>> suggested would be very disruptive for very little benefit. All you
>>>> would gain from doing this would be a slightly better syntax for
>>>> representing object equality and a little more ease when it comes
>>>> to teaching somebody Java. Is that really worth the effort?
>>>>
>>>> As for the class-file api, I'll CC them so that someone can fact
>>>> check me. Assuming I'm not wrong (no one responds to that point
>>>> specifically), I would also drop that mailing list from your CC
>>>> when responding.
>>>>
>>>> The purpose of the Class-File API was to build and transform class
>>>> files. So that seems unrelated to what you want. You want to
>>>> repurpose old syntax, but syntax stops being relevant after
>>>> compilation, and it is these compiled class files that the
>>>> Class-File API deals in. If we tried to use that API to handle
>>>> class files created with the old syntax, then we would have a
>>>> migration and clarity problem, amongst much more.
>>>>
>>>> Let us know if you have any more questions.
>>>>
>>>> Thank you for your time!
>>>> David Alayachew
>>>>
>>>>
>>>> On Sun, Oct 22, 2023 at 2:12 PM tzengshinfu <tzengshinfu at gmail.com>
>>>> wrote:
>>>>
>>>> Hi, folks:
>>>>
>>>> When I switched my primary programming language from C# to Java, I
>>>> found myself perplexed by 'string comparison' (and still do at
>>>> times). While string comparisons can sometimes become quite
>>>> intricate, involving issues like case sensitivity, cultural
>>>> nuances... most of the time, all that's needed is string1 ==
>>>> string2.
>>>>
>>>> I discovered that a similar question was asked a decade ago
>>>> (https://www.reddit.com/r/java/comments/1gjwpu/will_the_equals_operator_ever_be_fixed_with/
>>>> ), with responses indicating that it's due to 'Backward
>>>> compatibility,' and therefore, unlikely to change. (Backward
>>>> compatibility! We just keep piling new things on top of historical
>>>> baggage, and for users coming from school or from other languages
>>>> like C#, Python, C++, Rust, Golang, Kotlin, Scala, JavaScript, PHP,
>>>> Rlang, Swift, Ruby, Dart... the top 20 languages according to PYPL,
>>>> having to consult the so-called 'Java FAQ' can be frustrating.)
>>>>
>>>> But I believe that if something is amiss, it should be corrected
>>>> to keep moving forward. It would be fantastic if this issue could
>>>> be addressed in a new version of Java and an automatic conversion
>>>> feature provided to fix places in user code that use
>>>> String.equals. (Similar to the JVM's preview feature switch) Is
>>>> the Class-File API a potential solution to this problem? Is my
>>>> idea unrealistic?
>>>>
>>>> /* GET BETTER EVERY DAY */
>>>>
>>>
>>
>