PROPOSAL: Equivalence operators (formerly titled "Sameness operators") (Version 3)

Derek Foster vapor1 at teleport.com
Fri May 8 01:47:17 PDT 2009


Discussion on the Project Coin mailing list has suggested that the previous version of this proposal's use of the $ character has enough potential to break code (albeit in rare cases, and only in generated code) that it is likely to not be considered feasible for this purpose. The ~ character (also briefly mentioned in the previous version of the proposal) has similar problems. Therefore, this proposal has been altered to use the # character instead, as this does not (as far as I am aware) introduce any potential for code breakage, with @ as a suggested alternative should # not be considered acceptable. Some discussion of how this use of # might impact other JDK 8 proposals (closure proposals, etc.) which might also wish to use the # character has also been added.


Equivalence Operators (previously called "Sameness operators") (Version 3)

AUTHOR:

Derek Foster

OVERVIEW

Many Java objects implement the Comparable<T> interface, and/or override the Object.equals() method, to provide a way of ordering their instances or detecting whether two instances are equivalent. However, the syntax for using these methods has historically been fairly ugly:

// We want to write "if a >= b", but we have to write...
if (a.compareTo(b) >= 0) {
     whatever();
}

// We want to write "if a == b", but we have to write...
if (a == null ? b == null : a.equals(b)) {
     whatever();
}

The ugliness of these methods has often motivated the creation of special-purpose APIs to simplify the use of classes which implement them, such as the 'Date.before(Date)' and 'Date.after(Date)' methods in java.util.Date. However, these methods are inconsistent from class to class, and often not present.

Furthermore, the existing language == and != operators exhibit a strange asymmetry between the behavior of objects and that of primitive types that often catches new users of Java by surprise:

int i;
if (i == 5) {
    // gets here if the VALUE of i is 5
}

String j = new String("abc");
if (j == "abc") {
    // probably never gets here! Comparing the values of the references, not the values of the strings.
}

String k = new String("abc");
if (k.equals("abc")) {
     // gets here if the VALUE of k is "abc".
}

This behavior often confuses newcomers to Java, and is a common source of bugs even for experienced programmers.

This proposal suggests that a new set of relational operators be added to Java which would simplify comparing objects by their declared orderings and equivalence (as specified by the Comparable<T> interface and Object.equals()), while still yielding syntax that is as simple as using the existing >=, <=, ==, and != operators.

FEATURE SUMMARY:

Adds four new operators to Java: 

a ## b    "equivalent to":              a==null ? b==null : a.equals(b), or a == b for primitive types.
a !# b    "not equivalent to":          a==null ? b!=null : !a.equals(b), or a != b for primitive types.
a ># b    "greater than or equivalent": a.compareTo(b) >= 0, or a >= b for primitive types.
a <# b    "less than or equivalent":    a.compareTo(b) <= 0, or a <= b for primitive types.

and adds additional overloadings to existing operators:
a < b     a.compareTo(b) < 0, or a < b for primitive types.
a > b     a.compareTo(b) > 0, or a > b for primitive types.

Note that this proposal specifies alternatives to the specific operator names chosen. For instance, the proposal could be implemented using @@, !@, >@, and <@ (instead of ##, !#, >#, and <#) if the use of '#' is deemed problematic due to possible interference with other uses by future Java language change proposals. This is discussed in the "ALTERNATIVES" section below.

MAJOR ADVANTAGE:

Use of new operators for these relational tests would simplify code and make it more clear what relational tests are being made, as well as reducing the opportunity for mistakes (such as accidentally typing a.compareTo(b) >=0 when <=0 was intended).

MAJOR BENEFIT:

Clearer code due to the use of infix operators instead of using method calls and an extra pair of parentheses, plus possible extra tests (often accidentally omitted) for nullness around calls to Object.equals(Object).

Future Java language change proposals involving limited uses of operator overloading for code clarity (for BigInteger, complex number, etc. classes) would no longer run into the iceberg of "but you can't change the behavior of == in a backwards-compatible fashion!"


MAJOR DISADVANTAGE:

Modifications to the compiler would be required. These do not appear particularly difficult, but would take some effort.


ALTERNATIVES:

Keep using the Comparable.compareTo and Object.equals methods as they are.

It would have been better to define == and != this way in the first place, and use some other operator (perhaps === as in some other languages?) to indicate comparison by object identity. This would have made Java simpler and easier for newcomers to understand. However, that's not how the language is currently defined, and to change the behavior of == now would be a backwards incompatible change.

It might be possible to define ## and !# in terms of "Comparable<T>.compareTo(T)" instead of in terms of Object.equals(Object). However, doing so would have been less general purpose (it would only have worked on comparable classes). Although it is possible for the equals and compareTo methods of an object to disagree (for instance, BigDecimal, where 1.0 and 1.00 are equal according to compareTo but not according to equals), in practice this rarely occurs in the scenarios for which compareTo is typically used for determining well-ordering (sorting algorithms, etc.).
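One case in the standard library where the two methods disagree is java.math.BigDecimal, whose equals method takes the scale into account while compareTo does not. The following is only a minimal sketch of what that would mean for the proposed operators. The class and method names are made up for illustration, and the bodies are the desugared forms given in the FEATURE SUMMARY, since the operators themselves do not exist in Java today.

```java
import java.math.BigDecimal;

final class EqualsVersusCompareTo {
    /** Desugared form of "a ## b" as proposed: decided by equals. */
    static boolean equivalentByEquals(BigDecimal a, BigDecimal b) {
        return a == null ? b == null : a.equals(b);
    }

    /** Desugared form of "a ># b && a <# b" as proposed: decided by compareTo. */
    static boolean equivalentByCompareTo(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) >= 0 && a.compareTo(b) <= 0;
    }

    public static void main(String[] args) {
        BigDecimal x = new BigDecimal("1.0");
        BigDecimal y = new BigDecimal("1.00");
        System.out.println(equivalentByEquals(x, y));    // false: the scales differ
        System.out.println(equivalentByCompareTo(x, y)); // true: same numeric value
    }
}
```

So under this proposal "x ## y" could be false for two values even though both "x ># y" and "x <# y" are true.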

The default set of operators (##, !#, >#, and <#) suggested in this proposal was chosen because the "#" is somewhat similar to an equals sign, and hence has an easy mnemonic meaning of equivalence (as opposed to identity, which is represented by "="). However, the '#' character is highly sought after by writers of language change proposals (such as closures, method pointers, and others), and so is under heavy competition as to its future meaning. Some care might be required to ensure that use of the # character by this proposal would not interfere with some other future desired meaning of the # character.

Fortunately, most existing proposals for using the "#" in code have attempted to use it as an infix operator with identifiers on either side rather than as a prefix operator, which reduces the risk of conflicting with this proposal. Also, if the above operators were defined as tokens before any such code exists, it should be easy to ensure that any future use of the # character simply never falls into the pattern of one of these tokens. The ">#" token, for example, is only potentially ambiguous if some meaning is assigned to a prefix operator # such that comparisons of its results are legal. If a closure proposal were to make "if (#foo>#bar)" into legal code (defining #foo and #bar in such a way that whatever they represented was meaningful to compare with the ">" operator), then extra whitespace might be needed in some cases to ensure that this wasn't parsed as "if (#foo ># bar)".
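The whitespace concern follows from the longest-match rule that Java lexers apply when splitting input into tokens. The following toy tokenizer is a rough sketch of that rule only, not javac's actual lexer. The class name LongestMatchLexer is hypothetical, and the prefix # operator in the sample inputs is the hypothetical future feature discussed above.

```java
import java.util.ArrayList;
import java.util.List;

final class LongestMatchLexer {
    // The two-character tokens proposed here; tried before any one-character token.
    private static final String[] TWO_CHAR = {"##", "!#", ">#", "<#"};

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                       // whitespace only separates tokens
                continue;
            }
            if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
                continue;
            }
            String two = null;
            for (String t : TWO_CHAR) {
                if (src.startsWith(t, i)) {
                    two = t;               // longest match wins
                }
            }
            if (two != null) {
                tokens.add(two);
                i += 2;
            } else {
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }
}
```

With this sketch, tokenize("#foo>#bar") yields [#, foo, >#, bar], whereas tokenize("#foo> #bar") yields [#, foo, >, #, bar].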

Alternatively, this proposal could be altered to use another character instead of # for its tokens. One good candidate for this would be "@", which is also unused as an operator in Java, but which unfortunately lacks any obvious (to the author of this proposal, anyway) mnemonic to associate it with equality or ordering tests.


EXAMPLES

SIMPLE EXAMPLE:

void getOrdering(String first, String second) {
    if (first ## second) {
        System.out.println("They are equal");
    } else if (first !# second) {
        System.out.println("They are not equal");
    } else if (first > second) {
        System.out.println("The first is after the second");
    } else if (first ># second) {
        System.out.println("The first is same as or after the second");
    } else if (first < second) {
        System.out.println("The first is before the second");
    } else if (first <# second) {
        System.out.println("The first is before or the same as the second");
    }
}

ADVANCED EXAMPLE:

Really, the simple example pretty much illustrates the feature.


DETAILS

SPECIFICATION:


The following new tokens 

    ## !# ># <#

shall be added to section 3.12 of the JLS3.

The expression grammar in section 15.20 shall be modified like so:

    RelationalExpression:
            ShiftExpression
            RelationalExpression < ShiftExpression
            RelationalExpression > ShiftExpression
            RelationalExpression <= ShiftExpression
            RelationalExpression >= ShiftExpression
            RelationalExpression ># ShiftExpression
            RelationalExpression <# ShiftExpression
            RelationalExpression instanceof ReferenceType

The expression grammar in section 15.21 shall be modified like so:

    EqualityExpression:
            RelationalExpression
            EqualityExpression == RelationalExpression
            EqualityExpression != RelationalExpression
            EqualityExpression ## RelationalExpression
            EqualityExpression !# RelationalExpression

Semantically, the behavior of these new operators is as follows:


## and !# operators:

Evaluation of these operators shall occur exactly as it does for the == and != operators, as specified in section 15.21 of the JLS3 ("Equality Operators"), with the exception that for the purposes of these operators, section 15.21.3 ("Reference equality operators == and !=") shall be disregarded and replaced with:

If the operands of an equivalence operator are both of either reference type or the null type, then the operation is object equivalence. The behavior described below is for the ## operator. The !# operator shall behave identically except that it shall return false when the ## operator would return true, and vice versa. The procedure for evaluating the ## operator is as follows:

If both operands are null, then the result shall be 'true'.

Otherwise, if the left operand is null and the right operand is not null, then the result shall be 'false'.

Otherwise, the return value shall be the value of the expression "left.equals(right)", evaluated using the java.lang.Object.equals(Object) method.
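The procedure above can be sketched as an ordinary static helper in today's Java. This is only an illustration of the specified behavior. The class name EquivalenceSketch and the method names equivalent and notEquivalent are invented here and are not part of the proposal.

```java
final class EquivalenceSketch {
    /** Follows the procedure given for "left ## right" on reference operands. */
    static boolean equivalent(Object left, Object right) {
        if (left == null) {
            // Covers both "both operands null" (true) and "only left null" (false).
            return right == null;
        }
        // Left is non-null, so it is safe to dispatch to its equals method.
        return left.equals(right);
    }

    /** "left !# right" is defined as the negation of "left ## right". */
    static boolean notEquivalent(Object left, Object right) {
        return !equivalent(left, right);
    }
}
```

For example, equivalent(new String("abc"), "abc") is true even though the two references differ, and equivalent(null, "abc") is false rather than throwing a NullPointerException.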



>, >#, <, and <# operators:

Evaluation of these operators shall occur exactly as it does for the corresponding >, >=, <, and <= operators in section 15.20.1 of the JLS3 ("Numerical Comparison Operators <, <=, >, and >="), with the following exceptions.

The text "The type of each of the operands of a numerical comparison operator must be a type that is convertible (§5.1.8) to a primitive numeric type, or a compile-time error occurs." shall be replaced with the text "If the type of both of the operands of a numerical comparison operator is a type that is convertible (§5.1.8) to a primitive numeric type, the following algorithm is used to evaluate the operator. Otherwise, the operation is object ordering (See §15.20.1.1)"

A new section 15.20.1.1 shall be added consisting of the following text:

If one or both operands of a relational operator cannot be converted to primitive numeric types, then boxing conversions shall be used to convert both operands to object types.

After this conversion, if the type of the left operand does not implement the raw type java.lang.Comparable, and also does not implement java.lang.Comparable<T> for some type T which is equal to or a supertype of the type of the right operand, then a compile-time error shall be reported.

Otherwise, the result of the comparison shall be evaluated at runtime as follows:

If the left operand is null, a NullPointerException shall be thrown.

Otherwise, the method left.compareTo(right) shall be called. The following table shall be used to determine the result of the operator evaluation, based on the value returned from this method:

         left.compareTo(right)
          <0      ==0     >0
    >     false   false   true
    <     true    false   false
    >#    false   true    true
    <#    true    true    false
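As a rough illustration, the runtime rules and the table above could be modelled in current Java as follows. The class OrderingSketch, the enum Op, and the method compare are hypothetical names. The generic bound only approximates the compile-time check described above and does not cover the raw Comparable case.

```java
final class OrderingSketch {
    /** Stand-ins for the operators >, <, ># and <# respectively. */
    enum Op { GT, LT, GE, LE }

    static <T extends Comparable<? super T>> boolean compare(T left, Op op, T right) {
        if (left == null) {
            // The specification requires a NullPointerException for a null left operand.
            throw new NullPointerException("left operand is null");
        }
        int c = left.compareTo(right);
        switch (op) {
            case GT: return c > 0;
            case LT: return c < 0;
            case GE: return c >= 0;
            case LE: return c <= 0;
            default: throw new AssertionError(op);
        }
    }
}
```

For example, compare("b", OrderingSketch.Op.GT, "a") is true, and compare("a", OrderingSketch.Op.GE, "a") is true because compareTo returns 0.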


COMPILATION: 

The body of the example given above would be desugared as follows:

    if (first==null ? second==null : first.equals(second)) {
        System.out.println("They are equal");
    } else if (first==null ? second != null : !first.equals(second)) {
        System.out.println("They are not equal");
    } else if (first.compareTo(second) > 0) {
        System.out.println("The first is after the second");
    } else if (first.compareTo(second) >= 0) {
        System.out.println("The first is same as or after the second");
    } else if (first.compareTo(second) < 0) {
        System.out.println("The first is before the second");
    } else if (first.compareTo(second) <= 0) {
        System.out.println("The first is before or the same as the second");
    }


TESTING:

The feature can be tested by ensuring, first of all, that the new operators return the same results as existing operators when invoked on operands which are both convertible to numeric primitive types.

Secondly, that the new operators return the same results as their desugaring equivalents when invoked in circumstances where one or both operands are not convertible to numeric primitive types.

Thirdly, that the relational operators throw NullPointerExceptions when the left operand is null.

Fourthly, that compiler errors are reported exactly in the cases when the desugared equivalents of the operators would not compile.
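A related sanity check can already be run against today's Java: for boxed numeric values, the compareTo-based and equals-based desugarings should agree with the built-in primitive operators. The sketch below assumes Integer operands only, and the class name PrimitiveAgreementCheck is invented for illustration. A real test suite for the feature would of course exercise the new operators themselves.

```java
final class PrimitiveAgreementCheck {
    /**
     * Returns true if, for every pair drawn from the samples, the
     * desugared object forms agree with the primitive operators.
     */
    static boolean agrees(int[] samples) {
        for (int a : samples) {
            for (int b : samples) {
                Integer boxedA = a;   // boxing conversion
                Integer boxedB = b;
                int c = boxedA.compareTo(boxedB);
                if ((c > 0) != (a > b)) return false;
                if ((c < 0) != (a < b)) return false;
                if ((c >= 0) != (a >= b)) return false;
                if ((c <= 0) != (a <= b)) return false;
                if (boxedA.equals(boxedB) != (a == b)) return false;
            }
        }
        return true;
    }
}
```

Including boundary values such as Integer.MIN_VALUE and Integer.MAX_VALUE in the samples guards against comparison logic that relies on overflowing subtraction.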

LIBRARY SUPPORT:

No library changes are needed for this feature.

REFLECTIVE APIS:

No changes to reflective APIs are needed for this feature.

OTHER CHANGES:

No other parts of the platform need to be updated.

MIGRATION:

A tool such as Eclipse or IntelliJ IDEA might identify existing calls to .compareTo or .equals which could be converted to use the new operators, and could offer to perform the change automatically at the user's request. Or a user could simply refactor code to use the new operators as desired.

COMPATIBILITY

BREAKING CHANGES:

The current version of this proposal should not break any previously valid code.

EXISTING PROGRAMS:

This change should not create incompatibilities with existing source or class files.

REFERENCES

EXISTING BUGS:

There are a variety of proposals in the Bug Database related to various people's desires to support operator overloading for certain built-in mathematical classes such as BigInteger. A couple of these proposals are listed below.

"Add [], -, +, *, /  operators to core classes as appropriate" (related)
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5099780

"BigInteger should support autoboxing"
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6407464

Also, see the discussion on the Project Coin mailing list regarding the proposal "Draft proposal: allow the use of relational operators on Comparable classes," particularly with regard to why that proposal was withdrawn (namely, inability to make the == and != operators work properly on Comparable classes).
http://mail.openjdk.java.net/pipermail/coin-dev/2009-March/000361.html


URL FOR PROTOTYPE (optional):

None.



